Hi all,
is it possible to process UTF-8 encoded text data in a character-aware
way? I need to do something like:
- split text into words
- remove characters that are not legal in a specified language
- remove control characters
- ...
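
To make the steps above concrete, here is a rough sketch of the kind of
processing I mean. It assumes a Perl with the core Encode module (5.8 or
later). The helper name clean_words and the $czech alphabet are made up
for illustration and do not come from any module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: returns the words found in a UTF-8 byte string.
# $allowed is a compiled character class with the letters of one language.
sub clean_words {
    my ($bytes, $allowed) = @_;
    my $text = decode('UTF-8', $bytes);   # UTF-8 bytes -> character string
    $text =~ s/(?!\s)\p{Cc}//g;           # remove control characters except whitespace
    $text =~ s/(?!$allowed|\s)./ /gs;     # replace every other disallowed character with a space
    return split ' ', $text;              # split on runs of whitespace
}

# Assumed alphabet, for illustration only: ASCII letters plus a few Czech ones.
my $czech = qr/[a-zA-Z\x{E1}\x{E9}\x{ED}\x{10D}\x{159}\x{161}\x{17E}]/;

# Input bytes: U+0159 e U+010D, a BEL control character, ", ", U+0161 u m, "!"
my @words = clean_words("\xC5\x99e\xC4\x8D\x07, \xC5\xA1um!", $czech);
# @words now holds two words: (U+0159 e U+010D) and (U+0161 u m)
```

I do not know whether this is the recommended approach, hence the
questions below.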
Which module do I need to use? There are a lot of modules for charset
conversion. I found Unicode::String useful, but of the latin*
encodings it supports only latin1.
How can I prevent false matches in regular expressions when working
with multibyte characters?
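
A small example of the false match I am worried about, again assuming
the core Encode module is available. The variable names are only for
illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xC5\x99";    # UTF-8 bytes of U+0159 (r with caron)

# Byte-wise: a class written with the two bytes of U+0161 (s with caron)
# also accepts the shared lead byte 0xC5, so this is a false match.
my $byte_match = $bytes =~ /[\xC5\xA1]/ ? 1 : 0;    # 1

# Character-wise: after decoding, the string holds a single character
# and the pattern for U+0161 no longer matches it.
my $chars      = decode('UTF-8', $bytes);
my $char_match = $chars =~ /\x{161}/ ? 1 : 0;       # 0
```

Is decoding to characters first the right way to avoid this, or is
there a better one?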
--
best regards
Ing. Roman Vasicek
software developer
+----------------------------------------------------------------------------+
PetaMem s.r.o., Drahobejlova 27/1019, 190 00 Praha 9 - Liben, Czech republic
http://www.petamem.com/