namazu-users-en

Re: Mailman & "Charactères Français"

To: namazu-users-en@namazu.org
Subject: Re: Mailman & "Charactères Français"
From: knok@daionet.gr.jp
Date: Tue, 03 Jun 2003 14:38:46 +0900
Reply-to: namazu-users-en@namazu.org
Message-id: <87adczlnyx.wl@localhost.knok.daionet.gr.jp>
At Mon,  2 Jun 2003 16:26:26 -0500,
dchartrand@scclab.com wrote:
Namazu is having problems displaying and understanding French characters such
as "àéèçô..." when used to search Mailman archives. A word like "troisième" is
displayed (and searched) as "troisime" in Namazu... Notice the missing "è".

I received a similar report in the past, so I tried to reproduce the
problem with the following steps:

1. Saved the mail <1054589186.3edbc102c03c0@scclab.com> as a text file
   named "docs/french-text.txt".

2. Typed "LANG=C mknmz -O index ./docs" to build the index.

3. Typed "LANG=C namazu -h ` sed -n '78p' index/NMZ.w` index > foo" to
   search for the word "troisième". I took the word from line 78 of
   index/NMZ.w because I don't know how to type French characters.

4. Checked the file foo. It looks like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
        "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<!-- LINK-REV-MADE -->
<link rev=made href="mailto:webmaster@puti.knok.daionet.gr.jp">
<!-- LINK-REV-MADE -->
<title>Namazu: a Full-Text Search Engine: &lt;troisième&gt;</title>

  :
(snip)
  :
<h2>Results:</h2>
<p>
References:  [ troisième: 1 ] 
  :
(snip)

Hmm, it seems to work without any problem for me.
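
The steps above can be sketched as a small shell script, which may make it
easier to repeat the check on another machine. This is only a sketch: the
function names nth_word and reproduce are hypothetical, it assumes the
message has already been saved under ./docs, and the line number 78 is
specific to the index built here (the position of "troisième" in NMZ.w will
probably differ in another index).

```shell
#!/bin/sh
# Hypothetical helpers; only mknmz, namazu and sed come from the steps above.

# Print line N of a word list such as index/NMZ.w (one word per line).
nth_word() {
    sed -n "${2}p" "$1"
}

# Build the index, search for the word found at the given line of NMZ.w,
# and show the "References:" line of the HTML result saved in ./foo.
reproduce() {
    docs_dir=${1:-./docs}
    index_dir=${2:-index}
    word_line=${3:-78}   # assumption: valid only for the index built above

    if ! command -v mknmz >/dev/null 2>&1 ||
       ! command -v namazu >/dev/null 2>&1; then
        echo "mknmz or namazu not found; skipping" >&2
        return 2
    fi

    mkdir -p "$index_dir"
    LANG=C mknmz -O "$index_dir" "$docs_dir" || return 1

    word=$(nth_word "$index_dir/NMZ.w" "$word_line")
    LANG=C namazu -h "$word" "$index_dir" > foo || return 1

    # If the accented word survived indexing, it should appear here intact.
    grep 'References:' foo
}
```

Running "reproduce ./docs index 78" after sourcing the script repeats steps
2 to 4. If the "References:" line shows "troisime" instead of "troisième",
the problem is reproduced.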

I am using Mailman 2.1 and Namazu 2.0.12.

What is your environment? Mine is as follows:

Debian GNU/Linux (today's unstable)
Linux 2.4.21-pre4
glibc 2.3.1
-- 
NOKUBI Takatsugu
E-mail: knok@daionet.gr.jp
        knok@namazu.org / knok@debian.org
