What we can find is dependent on the character encoding setting used by
the browser doing the search.
The documents we built the index from are very likely to be using
several different Japanese character encodings (e.g. Shift_JIS, EUC-JP).
I've not used the Perl modules, but I can tell you what I do on a site
that isn't native EUC.
For indexing an English UTF-8 site I use:
mknmz --indexing-lang=en.UTF-8 -e ...
For indexing a Japanese UTF-8 site I use (the -k means use kakasi):
mknmz --indexing-lang=ja.UTF-8 -k -e ...
For searching (I'm using the PHP module, by the way) I convert the search
keywords from UTF-8 to EUC-JP:
$kw_euc = mb_convert_encoding($kw, "EUC-JP", "UTF8");
Then I do the search, and for each search hit I convert the result back
from EUC-JP to UTF-8, ready for display, e.g.:
$title = mb_convert_encoding(
    nmz_result_field($hlist, $n, 'subject'),
    'UTF8', 'EUC-JP');
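
If it helps, the two conversions can be wrapped in a pair of small helpers
so the encoding names only appear in one place. This is just a sketch: the
function names to_index_encoding() and from_index_encoding() are my own
invention (they are not part of the Namazu PHP module), and it assumes the
pages are UTF-8 and the search side expects EUC-JP, as in the snippets
above. It needs PHP's mbstring extension.

```php
<?php
// Hypothetical helpers, not part of the Namazu PHP module.
// Assumption: pages are served as UTF-8 and the search side works in EUC-JP.

// Convert text typed into the UTF-8 search form into EUC-JP for the search.
function to_index_encoding(string $text): string
{
    return mb_convert_encoding($text, 'EUC-JP', 'UTF-8');
}

// Convert a field taken from a search hit back into UTF-8 for display.
function from_index_encoding(string $text): string
{
    return mb_convert_encoding($text, 'UTF-8', 'EUC-JP');
}

// Round trip on a sample keyword (two kanji, both present in EUC-JP).
$kw = '検索';
$kw_euc = to_index_encoding($kw);          // pass this to the search
$roundtrip = from_index_encoding($kw_euc); // what you would display
```

In the search code you would then call to_index_encoding($kw) before the
search and from_index_encoding() on each value returned by
nmz_result_field().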
Darren