namazu-users-en

searching in Japanese (Shift JIS) on an English-based Slackware 7.1

2000-12-06 13:31:14
Hi.
I've been trying to configure Namazu to index and search some HTML pages that 
are in the Shift JIS character set. I'm using Slackware Linux 7.1, Perl 5.6 and 
Apache 1.3.12.

After reading all of the documentation that I could find, I tried setting up 
everything using a number of different values for the LANG environment variable 
(ja, ja_JP.sjis, euc).

Whenever I run mknmz (with kakasi), it reports the same number of keywords 
indexed, whichever LANG value I use. There are 150 documents, mostly articles, 
yet that produces only 2541 keywords, which seems low. I believe that mknmz is 
indexing only the English words in the documents, because searching for any of 
those succeeds.
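
To rule out a problem with the pages themselves, a rough check along these 
lines could confirm that each file really decodes as Shift JIS and contains 
Japanese text. This is only a sketch: the function name count_japanese_chars 
and the Unicode ranges it tests are my own illustrative choices, not anything 
provided by Namazu or kakasi.

```python
import sys
from pathlib import Path


def count_japanese_chars(data: bytes, encoding: str = "shift_jis") -> int:
    """Decode raw bytes and count hiragana, katakana and kanji characters.

    Hypothetical helper for illustration only. Bytes that are not valid in
    the given encoding are replaced rather than raising, so they are simply
    not counted.
    """
    text = data.decode(encoding, errors="replace")
    return sum(
        1
        for ch in text
        if "\u3040" <= ch <= "\u30ff"  # hiragana and katakana blocks
        or "\u4e00" <= ch <= "\u9fff"  # CJK unified ideographs (kanji)
    )


if __name__ == "__main__":
    # Usage: python check_sjis.py page1.html page2.html ...
    for name in sys.argv[1:]:
        print(name, count_japanese_chars(Path(name).read_bytes()))
```

A file that reports zero here is either English-only or not in Shift JIS, so 
a large count across the 150 documents would point back at the indexing step.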

namazu.cgi understands the value for Lang as set in .namazurc, because it 
prints out the search form and results pages in the right character set.

If anyone can offer any assistance with getting Japanese searches working, I 
would greatly appreciate it.

Thanks
Brian

