RE: Pattern matching with Unicode (5.6.1)

I'm having a bit of a problem getting Unicode pattern 
matching to do what I would like it to.


I guess my question wasn't entirely clear. I'm reading in the attached
file and trying to split it on "\n\n".

When I'm looping over the file, I've (sort of) made it work by doing:

 # strip BOM and trailing nulls and carriage returns
 s/^..// if $. == 1 and s/\0//g;
 s/[\0\r]//g;


The two-byte BOM has me thinking it's probably UTF-16. Is there an easy
way to tell what encoding a file uses?
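
One rough way to check is to look at the first few raw bytes for a known
byte-order mark. The following is only a sketch: the sub names
guess_encoding_from_bom and sniff_file are made up for illustration, and
a file with no BOM tells you nothing, so an undef result means "unknown",
not "plain ASCII".

```perl
use strict;
use warnings;

# Guess an encoding from a leading byte-order mark (BOM).
# Returns an encoding name, or undef when no known BOM is present.
# (Hypothetical helper, not part of any module.)
sub guess_encoding_from_bom {
    my ($bytes) = @_;    # first few raw bytes of the file
    # The UTF-32LE BOM starts with the UTF-16LE BOM, so test it first.
    return 'UTF-32LE' if $bytes =~ /^\xFF\xFE\x00\x00/;
    return 'UTF-32BE' if $bytes =~ /^\x00\x00\xFE\xFF/;
    return 'UTF-8'    if $bytes =~ /^\xEF\xBB\xBF/;
    return 'UTF-16LE' if $bytes =~ /^\xFF\xFE/;
    return 'UTF-16BE' if $bytes =~ /^\xFE\xFF/;
    return undef;
}

# Read the first four raw bytes of a file and report the guess.
sub sniff_file {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    binmode $fh;         # raw bytes, no newline translation
    my $head = '';
    read $fh, $head, 4;
    close $fh;
    return guess_encoding_from_bom($head);
}
```

A file that begins with the bytes FF FE followed by a non-null character
would come back as 'UTF-16LE', which would match the "two-byte BOM plus
nulls after every character" pattern described above.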

But I'm sure there must be a more elegant way to do this. 
Honestly, I'm not even sure where to start. Any ideas?
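
For comparison, here is a sketch of what this could look like if the data
is decoded before it is split, instead of stripping bytes by hand. It
assumes the file really is UTF-16 with a BOM, and it assumes a Perl with
the core Encode module (5.8 or later), so it would not run as-is on 5.6.1.
The sub names split_paragraphs and read_paragraphs are invented for the
example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode raw UTF-16 bytes (BOM required) and split them into
# blank-line-separated records. (Hypothetical helper.)
sub split_paragraphs {
    my ($raw) = @_;
    my $text = decode('UTF-16', $raw);   # BOM picks the byte order and is dropped
    $text =~ s/\r\n?/\n/g;               # normalise CRLF (or lone CR) to LF
    return grep { length } split /\n{2,}/, $text;
}

# Slurp a file as raw bytes and hand it to split_paragraphs.
sub read_paragraphs {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                            # slurp mode: read the whole file
    my $raw = <$fh>;
    close $fh;
    return split_paragraphs($raw);
}
```

With this approach the nulls never need stripping, because they are part
of the two-byte code units and disappear once the text is decoded.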

Thanks a bunch,


 -dave

unicode.txt
Description: Text document
