procmail

check out Knuth Re: Garbage vs Valid

2003-02-02 13:26:05
Bart Schaefer wrote:
On Sun, 2 Feb 2003, Professional Software Engineering wrote:

I suspect that "th"  and "rh" should also probably be added to the
"exclude me as a consonant" tests.

I think you're headed down a rathole here.  Consider words such as
strings, school (and Schaefer :-), inkling, and thoughtful.
^^^ ^^^  ^^^                        ^^^             ^^^^

If you're going to go down this road, some 20-odd years ago Knuth (one of
the chief TeX guys) published a hyphenation algorithm which you would find
extremely enlightening for its list of vowels, prefixes and suffixes, and
its elegantly simple yet charmingly brutish approach to the problem. I
suspect that running each word through some variation of the hyphenation
algorithm and then looking for remaining syllables which exceed a certain
run length would come very close to the mark indeed.
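
A rough sketch of that idea in Python, for illustration only. This is NOT
Knuth's algorithm: the splitter below is a crude stand-in that cuts a word
after each run of vowels, and the names crude_syllables and
looks_like_garbage, the choice to treat "y" as a vowel, and the length
limit of 7 are all assumptions made up for this example.

```python
import re

# Assumption: treat 'y' as a vowel. A real hyphenator is far more careful.
VOWELS = "aeiouy"

# One chunk = an optional run of non-vowels followed by a run of vowels.
_CHUNK = re.compile(rf"[^{VOWELS}]*[{VOWELS}]+")


def crude_syllables(word):
    """Split a word into syllable-like chunks (hypothetical helper)."""
    word = word.lower()
    chunks = _CHUNK.findall(word)
    # Whatever follows the last vowel run is glued onto the final chunk.
    tail = word[sum(len(c) for c in chunks):]
    if chunks:
        chunks[-1] += tail
    elif tail:
        # No vowels at all: the whole word is one chunk.
        chunks = [tail]
    return chunks


def looks_like_garbage(word, max_len=7):
    """Flag a word if any chunk is longer than max_len (arbitrary limit)."""
    return any(len(c) > max_len for c in crude_syllables(word))


if __name__ == "__main__":
    for w in ["strings", "school", "Schaefer", "inkling", "thoughtful",
              "xkqzvbnmtrw"]:
        print(w, crude_syllables(w), looks_like_garbage(w))
```

With this limit the example words quoted above pass and a vowel-free
string is flagged, but the cutoff is arbitrary and some legitimate words
will still trip it.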

(Of course, since this is an international list, I should point out that
Knuth wrote his algorithm for dealing with English, and specifically
'mercun, and so its applicability to Swedish or Swahili may be limited.)

--

Fred Morris
m3047(_at_)inwa(_dot_)net



_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
