Earl Hood wrote:
You do realize this greatly reduces the usability of namazu.
Yes, that may well be so.
Regrettably, however, the reality is that Namazu currently supports
only 7-bit ASCII text.
I understand the demand for 8-bit characters, but Namazu cannot
simply support them as it stands.
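
As a rough illustration of the 7-bit restriction, the following Python sketch checks whether a byte string stays within the 7-bit ASCII range. It is not Namazu code; the function name is_7bit_ascii is hypothetical and used only for this example.

```python
def is_7bit_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (0x00-0x7F)."""
    return all(byte < 0x80 for byte in data)


if __name__ == "__main__":
    # Plain ASCII passes; a Latin-1 encoded "e acute" (0xE9) does not.
    print(is_7bit_ascii(b"plain text"))
    print(is_7bit_ascii("caf\u00e9".encode("latin-1")))
```

A check of this kind could be run on a document before indexing to find data that falls outside what is supported.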
In email, it is hard to control what charset you will get. With my
program, MHonArc, it does a fairly good job of "normalizing" the data,
Namazu's MHonArc support is fairly old.
It was written before MHonArc began to output Unicode character
entity references.
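
For readers unfamiliar with character entity references, here is a minimal Python sketch of the general idea: non-ASCII characters are replaced by numeric references so that the resulting text is pure 7-bit ASCII. It only illustrates the concept and is an assumption about the form of the data, not the actual code of MHonArc or Namazu; both function names are hypothetical.

```python
import html


def to_ascii_with_entities(text: str) -> str:
    """Replace every non-ASCII character with a numeric character reference."""
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


def from_entities(text: str) -> str:
    """Turn character references back into the characters they stand for."""
    return html.unescape(text)


if __name__ == "__main__":
    encoded = to_ascii_with_entities("caf\u00e9")
    print(encoded)                 # 7-bit ASCII text containing a reference
    print(from_entities(encoded))  # the original string again
```

An indexer written before such references existed would see them as ordinary ASCII strings rather than as the characters they represent.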
I do not care, because I do not want to deal with Japanese text
with the data set in question. Namazu should NOT be doing
JP processing if the locale is not set to JP. Therefore, things
like multi-byte and wide characters are irrelevant.
If the input cannot be restricted, multi-byte or wide-character text
may still be fed in.
That can cause various adverse effects.
For instance, the index may be corrupted.
(To begin with, 8-bit characters have not even been tested.)
If it is the decision of Namazu developers to exclude all character
entity references in data,
The corrected code is only a sample.
As I wrote in my previous mail, this correction will not be applied
to stable-2-0.
However, it does not mean that Namazu can handle 8-bit characters,
so please use it with 7-bit ASCII text only.
What I can say is this:
the part of mhonarc.pl that goes through Japanese processing even
outside a Japanese environment is scheduled to be corrected in
Namazu 2.0.15.
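
The idea of skipping Japanese processing outside a Japanese environment can be sketched as follows. This is only an illustration in Python under stated assumptions: it is not the code of mhonarc.pl, the function name is_japanese_locale is hypothetical, and it assumes the environment is identified by the usual POSIX locale variables (LC_ALL, then LC_CTYPE, then LANG).

```python
import os


def is_japanese_locale(env=None) -> bool:
    """Return True if the effective character locale starts with 'ja'."""
    env = os.environ if env is None else env
    # POSIX precedence: the first variable that is set and non-empty wins.
    for name in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(name)
        if value:
            return value.lower().startswith("ja")
    return False


if __name__ == "__main__":
    if is_japanese_locale():
        print("Japanese processing would be applied")
    else:
        print("Japanese processing would be skipped")
```

Passing a dictionary instead of the real environment makes the check easy to try out, for example is_japanese_locale({"LANG": "ja_JP.eucJP"}).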
--
=====================================================================
TADAMASA TERANISHI
http://www.asahi-net.or.jp/~yw3t-trns/index.htm
Key fingerprint = 474E 4D93 8E97 11F6 662D 8A42 17F5 52F4 10E7 D14E