On Sat, 14 Jun 2003 22:05:24 +0900
SADAHIRO Tomoyuki <bqw10602(_at_)nifty(_dot_)com> wrote:
I wrote a module that parses a character class
including grouping, intersection, union, and removal (subtraction),
as described in Unicode Regular Expressions (e.g. [A & B], [A-Z - XYZ]),
and converts it into a regular expression in Perl.
For example, [A-Z & C-S & K-V] can be used in place of [K-S].
The module does not simplify such a set down to [K-S];
instead it relies on the Perl regex lookahead syntax (?! ) and (?= ).
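To illustrate the idea only: the sketch below shows how an intersection and a subtraction can be written with lookaheads. The two patterns are hand-written assumptions, not the actual output of Unicode::Regex::Set, and the sketch uses Python's re module, which accepts the same (?= ) and (?! ) syntax as Perl.

```python
import re

# Hand-written illustration, NOT the module's real output.
# Intersection [A-Z & C-S & K-V]: each positive lookahead must accept
# the next character before the final class consumes it.
intersection = re.compile(r"(?=[C-S])(?=[K-V])[A-Z]")

# Subtraction [A-Z - XYZ]: the negative lookahead rejects X, Y and Z
# before [A-Z] consumes the character.
subtraction = re.compile(r"(?![XYZ])[A-Z]")

letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
in_intersection = "".join(c for c in letters if intersection.fullmatch(c))
in_subtraction = "".join(c for c in letters if subtraction.fullmatch(c))

print(in_intersection)  # KLMNOPQRS
print(in_subtraction)   # ABCDEFGHIJKLMNOPQRSTUVW
```

The first pattern accepts exactly the letters that [K-S] accepts, without ever computing that simpler range.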
Updated to Version 0.02 with many bug fixes and doc fixes.
tarball
http://homepage1.nifty.com/nomenclator/perl/Unicode-Regex-Set-0.02.tar.gz
html-pod
http://homepage1.nifty.com/nomenclator/perl/Unicode-Regex-Set.html
see also
(UTR #18)
http://www.unicode.org/unicode/reports/tr18/
(ICU)
http://oss.software.ibm.com/icu/userguide/unicodeSet.html
Thank you,
SADAHIRO Tomoyuki