Re: rather than argue and bicker about who said what...
2003-01-18 10:23:34
D. J. Bernstein writes:
> A byte-by-byte regexp matcher that doesn't know anything about UTF-8,
> such as an ancient version of the UNIX grep program, nevertheless
> does a perfect job of matching a UTF-8 regexp against a UTF-8 string.
Oh? Here's a grep that will find all four-letter sequences whose first
two characters are 'ar' and whose last is 't':

    grep -i ar.t

...except those encoded in UTF-8 in which the third letter isn't also a
US-ASCII letter.
--Arnt
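
A minimal sketch of the point above, in Python. It uses Python's re
module as a stand-in for a byte-oriented matcher (an assumption; this is
not the grep under discussion), and the helper names bytewise_match and
charwise_match are made up for the illustration. Matching against the
encoded bytes makes '.' consume exactly one byte, so a character that
takes two bytes in UTF-8 no longer fits between 'ar' and 't'.

```python
import re


def bytewise_match(pattern: str, text: str) -> bool:
    """Hypothetical helper: match on raw UTF-8 bytes, so '.' is one byte."""
    return re.search(pattern.encode("utf-8"),
                     text.encode("utf-8"),
                     re.IGNORECASE) is not None


def charwise_match(pattern: str, text: str) -> bool:
    """Hypothetical helper: match on decoded text, so '.' is one character."""
    return re.search(pattern, text, re.IGNORECASE) is not None


if __name__ == "__main__":
    # "ar\u00f8t" has U+00F8 in third position, which is two bytes in UTF-8.
    for word in ["Arnt", "ar\u00f8t"]:
        print(repr(word),
              "bytewise:", bytewise_match("ar.t", word),
              "charwise:", charwise_match("ar.t", word))
```

The all-ASCII word matches either way; the word with a non-ASCII third
character matches only when the matcher works on characters rather than
bytes.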