ietf-822

Re: rather than argue and bicker about who said what...

2003-01-18 10:23:34

D. J. Bernstein writes:
A byte-by-byte regexp matcher that doesn't know anything about UTF-8, such as an ancient version of the UNIX grep program, nevertheless does a perfect job of matching a UTF-8 regexp against a UTF-8 string.

Oh? Here's a grep that will find all four-letter sequences whose first two characters are 'ar' and whose last is 't':

   grep -i ar.t

...except those encoded in UTF-8 in which the third letter isn't also a US-ASCII letter, because a byte-by-byte '.' matches exactly one byte and such a letter takes two or more.
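
To make the failure concrete, here is a minimal sketch in Python rather than grep. It assumes that a bytes pattern in Python's re module is a fair stand-in for a byte-by-byte matcher that knows nothing about UTF-8. The helper names byte_match and char_match and the sample strings are made up for illustration.

```python
import re

PATTERN = "ar.t"  # same expression as the grep example above


def byte_match(text: str) -> bool:
    """Match byte by byte: '.' consumes exactly one byte of the UTF-8 encoding."""
    return re.search(PATTERN.encode("ascii"), text.encode("utf-8"),
                     re.IGNORECASE) is not None


def char_match(text: str) -> bool:
    """Match character by character: '.' consumes one whole code point."""
    return re.search(PATTERN, text, re.IGNORECASE) is not None


if __name__ == "__main__":
    # "ar\u00f8t" has U+00F8 as its third letter, which is two bytes in UTF-8.
    for word in ("Arnt", "ar\u00f8t"):
        print(ascii(word), "bytes:", byte_match(word), "chars:", char_match(word))
```

With an ASCII third letter both matchers agree. With a two-byte third letter the byte-wise '.' consumes only the first byte, the following 't' is compared against the second byte, and the match fails.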

--Arnt