perl-unicode

Warning messages for ill-formed data

2003-03-21 09:30:05
Hi-
  I'm looking for recommendations on how to warn about and record
problems with ill-formed data.  Specifically, I'm reading in Big5 data
from multiple files and converting it to Perl's utf8, and some of the
Big5 double-byte combinations are illegal (they appear to be
user-defined special symbols).  I'd like to be able to write code to
handle lines with ill-formed data.  So, if I start with code like:

open( IN_FH, '<:encoding(big5)', $inputFile ) or die...
while( $line = <IN_FH> ) {

or

open( IN_FH, $inputFile ) or die...
while( $line = decode('big5', <IN_FH> ) ) {

I'd like to add logic such as:

if( <$line has an error> )
  record the line number and file name
  record the error and the entire line
  map the error to a user-defined character (dependent on the error)
  process the modified line
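
One way the logic above might be sketched is to read raw octets and call Encode's decode() with FB_QUIET, which stops at the first undecodable sequence and leaves the unprocessed octets in the buffer. This is only a sketch under assumptions: decode_big5_line and %user_defined are hypothetical names, the 'FA40' entry is placeholder data, and the lead/trail byte ranges are the usual Big5 ones rather than anything checked against these particular files.

```perl
use strict;
use warnings;
use Encode qw(decode FB_QUIET);

# Hypothetical table: unmappable Big5 bytes (upper-case hex) to a
# replacement character. The entry below is a placeholder, not real data.
my %user_defined = (
    'FA40' => "\x{E000}",    # assumed Private Use Area assignment
);

# Decode one line of Big5 octets. Returns the decoded text and a reference
# to a list of problem records (file, line number, bad bytes, raw line).
sub decode_big5_line {
    my ($octets, $file, $line_no) = @_;
    my $buf  = $octets;      # decode() consumes $buf in place under FB_QUIET
    my $text = '';
    my @problems;

    while (length $buf) {
        # Decode as far as possible; on a bad sequence decode() returns
        # what it has so far and leaves the remainder in $buf.
        $text .= decode('big5', $buf, FB_QUIET);
        last unless length $buf;

        # Remove two bytes if they look like a Big5 lead/trail pair,
        # otherwise one, so the loop always makes progress.
        my $len = $buf =~ /\A[\x81-\xFE][\x40-\x7E\xA1-\xFE]/ ? 2 : 1;
        my $bad = uc(unpack('H*', substr($buf, 0, $len, '')));

        push @problems,
            { file => $file, line => $line_no, bytes => $bad, raw => $octets };

        # Map the error to a user-defined character, else U+FFFD.
        $text .= exists $user_defined{$bad} ? $user_defined{$bad} : "\x{FFFD}";
    }
    return ($text, \@problems);
}
```

In a read loop over a handle opened without an :encoding layer, this could be called as decode_big5_line($raw_line, $inputFile, $.), logging each entry of the returned problem list before processing the modified line.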

Could I get recommendations on how to do this?  Thanks-

Mark

PS  The STDERR "does not map to Unicode" warning on my version (5.8.0)
lists only the input file's line number; is it possible to add the
input file name as well?
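
A possible workaround for the file name, assuming the warning passes through Perl's ordinary warn mechanism, is a local $SIG{__WARN__} handler that prefixes each message. The helper names tag_warning and with_file_name are hypothetical, and this is a sketch rather than a documented Encode feature.

```perl
use strict;
use warnings;

# Prefix a warning message with the name of the file being read.
sub tag_warning {
    my ($file, $msg) = @_;
    return "[$file] $msg";
}

# Run $code with a temporary __WARN__ handler that tags every warning
# with $file; the previous handler is restored on return.
sub with_file_name {
    my ($file, $code) = @_;
    local $SIG{__WARN__} = sub { print STDERR tag_warning($file, $_[0]) };
    return $code->();
}
```

The read loop for each input file would then go inside the code reference passed to with_file_name($inputFile, sub { ... }).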