Hello,
I tried both Earl's and Nokubi's versions of decode_numbered_entity.
Neither of them gives me a full solution.
My Hebrew emails (Windows-1255 and UTF-8) are only partly indexed. Only the
Latin characters are indexed.
When I use any of Earl's modifications to decode_numbered_entity
> You may want to look at past discussion on this list. For example:
> http://www.mhonarc.org/archive/cgi-bin/mesg.cgi?a=namazu-users-en&i=200506110226.j5B2QsG17211%40gator.earlhood.com
> http://www.mhonarc.org/archive/cgi-bin/mesg.cgi?a=namazu-users-en&i=200506141642.j5EGgTC04218%40gator.earlhood.com
the namazu.cgi search results display ????? instead of Hebrew characters.
When I use Nokubi's version of decode_numbered_entity, the namazu.cgi search
results display the HTML file name instead of the mail title, and "<<<
text/html: EXCLUDED >>>" instead of the first line of the email.
The MHonArc HTML files are fine; there is no problem viewing the Hebrew
characters.
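A minimal sketch of one thing that may contribute to the partial indexing, assuming the archived HTML stores Hebrew letters as numeric entities such as &#1488;. It copies Nokubi's planned version (quoted below) unchanged and feeds it three code points. The util::islang stub here is a hypothetical stand-in for Namazu's own function and simply assumes a non-Japanese setup. It does not explain the file name and "EXCLUDED" symptoms.

```perl
use strict;
use warnings;

# Hypothetical stand-in for Namazu's util::islang (an assumption for this
# sketch): it reports a non-Japanese environment.
package util;
sub islang { my ($lang) = @_; return $lang eq 'he'; }

package main;

# Nokubi's planned version, copied unchanged from the quoted message.
sub decode_numbered_entity ($) {
    my ($num) = @_;
    return ""
        if ($num >= 0 && $num <= 31) || ($num >= 127 && $num <= 159) ||
           ($num >= 255 && !util::islang('ja'));
    return ""
        if $num >= 127 && util::islang('ja');
    sprintf("%c", $num);
}

# 65 is "A", 233 is e-acute in iso-8859-1, 1488 is Hebrew alef (U+05D0).
for my $num (65, 233, 1488) {
    my $out = decode_numbered_entity($num);
    printf "%d -> %s\n", $num, length($out) ? "kept" : "dropped";
}
```

Under these assumptions the Latin code points are kept and the Hebrew one is dropped, because every Hebrew letter has a code point above 255.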
Regards,
Rami
NOKUBI Takatsugu wrote:
The following is my plan to patch:
sub decode_numbered_entity ($) {
    my ($num) = @_;
    return ""
        if ($num >= 0 && $num <= 31) || ($num >= 127 && $num <= 159) ||
           ($num >= 255 && !util::islang('ja'));
    return ""
        if $num >= 127 && util::islang('ja');
    sprintf ("%c",$num);
}
It wouldn't affect the Japanese environment, and would work with
iso-8859-* characters.
I tested with it, and it seemed good with the test suites.
_______________________________________________
Namazu-users-en mailing list
Namazu-users-en(_at_)namazu(_dot_)org
http://www.namazu.org/cgi-bin/mailman/listinfo/namazu-users-en