Pattern matching with Unicode (5.6.1)

Hello most excellent Unicode list,

I'm having a bit of a problem getting Unicode pattern matching to do
what I would like it to. My code somewhat resembles:

 sub parse_doc {
   my $file = shift;
   open my $fh, '<', $file or die "couldn't read [$file]: $!\n";
 
   my $contents = '';
   { local $/ = undef;
     $contents = <$fh>; }
   close $fh;

   # this is where I'm getting stuck
   my @contents = split /\n\n/, $contents;
   print '[' . scalar(@contents) . "]\n";
 }

I've (sort of) made it work by doing:

 # strip BOM and trailing nulls and carriage returns
 s/^..// if $. == 1 and s/\0//g;
 s/[\0\r]//g;
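
In context, the workaround sits in a line-by-line read loop, roughly like the sketch below. The helper name read_paragraphs is made up for illustration, and I'm assuming the file is UTF-16LE (hence the FF FE byte-order mark and the null byte after every ASCII character). It only holds up when every character fits in a single byte, so treat it as a stopgap rather than a real fix:

```perl
use strict;
use warnings;

# Hypothetical helper: reads an (assumed) UTF-16LE file as raw bytes,
# applies the byte-stripping workaround to each line, and returns the
# blank-line-separated paragraphs.
sub read_paragraphs {
    my $file = shift;
    open my $fh, '<', $file or die "couldn't read [$file]: $!\n";
    binmode $fh;                     # raw bytes, no newline translation

    my $contents = '';
    while (<$fh>) {
        s/^\xFF\xFE// if $. == 1;    # drop the UTF-16LE BOM on the first line
        s/[\0\r]//g;                 # drop null high bytes and carriage returns
        $contents .= $_;
    }
    close $fh;

    my @paragraphs = split /\n\n/, $contents;
    return @paragraphs;
}
```

Calling it as my @contents = read_paragraphs($file); gives the same array the split in parse_doc was meant to produce.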

But I'm sure there must be a more elegant way to do this. Honestly, I'm
not even sure where to start. Any ideas?

Thanks a bunch,

 -dave
