namazu-users-en

[Namazu-users-en] Re: mknmz not working for Japanese language documents?

2006-06-29 19:09:59
Darren Cook wrote:

I've not used the Perl modules, but I can tell you what I do on a site
that isn't native EUC.

For indexing an English UTF8 site I use:
 mknmz --indexing-lang=en.UTF-8 -e ...

For indexing a Japanese UTF8 site I use (the -k means use kakasi):
 mknmz --indexing-lang=ja.UTF-8 -k -e ...

For searching (I'm using PHP module by the way) I convert the search
keywords to EUC:
 $kw_euc=mb_convert_encoding($kw,"EUC-JP","UTF8");

Then do the search, then for each search hit I convert the result back
from EUC to UTF8 ready for display, e.g.:
 $title=mb_convert_encoding(
      nmz_result_field($hlist,$n,'subject'),
      'UTF8','EUC-JP');
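
The two conversions above could be wrapped in a pair of small helpers so the search code never handles encodings directly. This is only a sketch: the function names to_index_encoding and from_index_encoding are made up for illustration and are not part of Namazu or its PHP module, and it assumes the index holds EUC-JP text while the site serves UTF-8. It needs PHP's mbstring extension.

```php
<?php
// Hypothetical helpers, not part of Namazu or its PHP module.
// Assumption: the index stores EUC-JP and the site uses UTF-8.

// Convert a UTF-8 search keyword to EUC-JP before searching.
function to_index_encoding(string $text): string
{
    return mb_convert_encoding($text, 'EUC-JP', 'UTF-8');
}

// Convert an EUC-JP field from a search hit back to UTF-8 for display.
function from_index_encoding(string $text): string
{
    return mb_convert_encoding($text, 'UTF-8', 'EUC-JP');
}

$kw = '日本語';
$kw_euc = to_index_encoding($kw);

// $kw_euc would be passed to the search call here. The line below
// uses it as a stand-in for a field returned from a search hit.
$title = from_index_encoding($kw_euc);

// Check that the round trip preserved the text.
var_dump($title === $kw);
```

Keeping both directions in one place means a change of site encoding only touches these two functions.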
 

Our documents may be in different encodings.
Do I need a separate site for each encoding, then?

J. Hart


_______________________________________________
Namazu-users-en mailing list
Namazu-users-en(_at_)namazu(_dot_)org
http://www.namazu.org/cgi-bin/mailman/listinfo/namazu-users-en
