mhonarc-commits

CVS: mhonarc/MHonArc/lib/MHonArc CharEnt.pm,1.6,1.7

2002-11-28 12:53:31
Update of /cvsroot/mhonarc/mhonarc/MHonArc/lib/MHonArc
In directory subversions:/tmp/cvs-serv2877

Modified Files:
	CharEnt.pm 
Log Message:
* For an unknown Chinese >8-bit (multi-octet) char, we now convert it
  to '?'.  This prevents the possibility of sneaking in undesired
  character data with unknown sequences.


Index: CharEnt.pm
===================================================================
RCS file: /cvsroot/mhonarc/mhonarc/MHonArc/lib/MHonArc/CharEnt.pm,v
retrieving revision 1.6
retrieving revision 1.7
diff -C2 -r1.6 -r1.7
*** CharEnt.pm	28 Nov 2002 08:57:19 -0000	1.6
--- CharEnt.pm	28 Nov 2002 19:53:25 -0000	1.7
***************
*** 151,155 ****
      if ($charset eq 'utf-8') {
  	my($i, $n, $mask);
! 	# We do not do full compliant UTF-8 parsing.  Malformed sequences
  	# will end up being treated as individual octets replaced with the
  	# '?' sign.
--- 151,155 ----
      if ($charset eq 'utf-8') {
  	my($i, $n, $mask);
! 	# We do not do full compliant UTF-8 parsing: malformed sequences
  	# will end up being treated as individual octets replaced with the
  	# '?' sign.
***************
*** 209,213 ****
  		   : ($ASCIIMap{$char}
  		      ? join('', '&', $ASCIIMap{$char}, ';')
! 		      : pack(length($1)>1?'n':'C', $char))/gxe;
  
      } else {
--- 209,215 ----
  		   : ($ASCIIMap{$char}
  		      ? join('', '&', $ASCIIMap{$char}, ';')
! 		      : (length($1) > 1
! 			? '?'	    # unknown character
! 			: pack('C',$char)))/gxe;
  
      } else {
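
The second hunk can be read in isolation as follows. Before this revision, a multi-octet character with no mapping was re-emitted as raw octets via pack('n', ...); from 1.7 on it becomes '?'. Below is a minimal, standalone sketch of that fallback logic, not the module's actual code: the sub name convert_sketch, the tables %char_map and %ascii_map, their entries, and the matching pattern are all assumptions made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical lookup tables, for illustration only; the real module
# keeps its own mappings and these entries are made up.
my %char_map  = (0xA4A4 => '&#x4E2D;');
my %ascii_map = (0x26 => 'amp', 0x3C => 'lt', 0x3E => 'gt', 0x22 => 'quot');

# Sketch of the revised fallback: known codes map to entities,
# unknown two-octet codes become '?', unknown single octets pass through.
sub convert_sketch {
    my ($data) = @_;
    $data =~ s{([\x80-\xFF][\x00-\xFF]|[\x00-\xFF])}{
        my $raw  = $1;
        my $char = length($raw) > 1 ? unpack('n', $raw) : unpack('C', $raw);
        exists $char_map{$char}  ? $char_map{$char}
      : exists $ascii_map{$char} ? "&$ascii_map{$char};"
      : length($raw) > 1         ? '?'                # unknown character
      :                            pack('C', $char);  # single octet as-is
    }ge;
    return $data;
}

print convert_sketch("x\xFE\xFEy"), "\n";   # prints "x?y"
```

Replacing rather than passing through means an unmapped pair of octets can never reach the generated HTML unchanged, which is the concern the log message describes.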
