Invalid Unicode characters

Dear PERLists,

I am running Perl 5.8 and trying to filter out some invalid Unicode characters 
from Unicode texts in some South Asian languages. There are 28 such characters 
in my data (all control characters, apart from the noncharacter 0xFFFF):

0x1, 0x2, 0x3, 0x4, 0x5, 0x6, 0x7, 0x8, 0xB, 0xC, 0xE, 0xF, 0x10, 0x11, 0x12, 
0x13, 0x14, 0x15, 0x16, 0x17, 0x18, 0x19, 0x1B, 0x1C, 0x1D, 0x1E, 0x1F, 0xFFFF 

The data is encoded as UTF-16, and I want to keep it that way once the invalid 
characters are removed. Is there an easy way to do this with Perl while leaving 
the rest of the text intact? Any advice is welcome. Thanks.
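
To make the question concrete, here is a minimal sketch of one possible 
approach, not a tested solution for the real data. It assumes the byte order 
is already known and that the input has an even number of bytes. Every listed 
character fits in a single 16-bit code unit and none falls in the surrogate 
range, so the filter can work on code units directly and never needs to 
decode the text. The names %INVALID and strip_invalid_units are hypothetical.

```perl
use strict;
use warnings;

# Lookup table of the 28 code points listed above.
my %INVALID = map { $_ => 1 }
    (0x01 .. 0x08, 0x0B, 0x0C, 0x0E .. 0x19, 0x1B .. 0x1F, 0xFFFF);

# Hypothetical helper. $bytes holds raw UTF-16 data; $template is the
# pack/unpack code for one 16-bit unit: 'v' for little-endian, 'n' for
# big-endian. Assumes an even byte count (unpack ignores a trailing odd byte).
sub strip_invalid_units {
    my ($bytes, $template) = @_;
    my @units = unpack("$template*", $bytes);
    # Keep every unit that is not in the table; the BOM (0xFEFF) and
    # surrogate halves (0xD800-0xDFFF) pass through untouched.
    return pack("$template*", grep { !$INVALID{$_} } @units);
}
```

The result is still UTF-16 in the same byte order, so it could be written 
back to a file opened in binmode without any re-encoding step.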

Best,

Richard
