perl-unicode

processing utf-8 data

2002-04-24 09:36:09
Hi all,
is it possible to process UTF-8 encoded text data intelligently? I need to
do something like:
- split text into words
- remove characters that are illegal for a specified language
- remove control characters
- ...

Which module do I need to use? There are a lot of modules for charset
conversion. I found Unicode::String to be useful, but of the latin*
encodings it supports only latin1.

How can I prevent false matches in regular expressions when working
with multibyte characters?

-- 
 best regards
  Ing. Roman Vasicek

 software developer
+----------------------------------------------------------------------------+
 PetaMem s.r.o., Drahobejlova 27/1019, 190 00 Praha 9 - Liben, Czech republic
 http://www.petamem.com/
