Re: mhfixmsg character set conversion

2022-02-08 06:16:51
Hi Steven,

> There's still something going on that I don't understand, however.  The
> way I've evaluated the output from mhfixmsg was by viewing it in vim, and
> there's no question that the unpatched output looks fine while the patched
> output is as I've been describing since the beginning of this thread.

Good.  BTW, to begin a thread, please don't reply to an existing message
on the list and change the subject as it doesn't start a new thread and
leads to weird presentation in the archives, threading trees, etc.

> ...but when I look at the files with command-line tools such as more or
> head, *both* versions look correct.

Have you patched more or head?  ;-)

Can you cut-and-paste commands and output from your terminal to show us
the problem?  Otherwise we have to trust your competency, no offence
intended, and imagine what was done and seen, which adds to the effort of
dealing with the email.  Here's my go.

How I could be influencing programs.

    $ locale

Test inputs.

    $ cat good
    Veuillez ne pas répondre au présent courriel. Il a été généré
    automatiquement, nous ne pourrons pas y donner suite.
    $ cat bad
    Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
    automatiquement, nous ne pourrons pas y donner suite.

bad is double-encoded.

    $ iconv -f iso-8859-1 -t utf-8 good | cmp - bad
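
The double-encoding above can be reproduced, and undone, with a small
round trip.  This is only a sketch: the file names good.sample,
bad.sample and fixed.sample are made up for illustration, and it assumes
an iconv(1) that knows the iso-8859-1 and utf-8 names used above.

```sh
# Build a correct UTF-8 sample; \303\251 are the two bytes of "é".
printf 'r\303\251pondre\n' > good.sample

# Misread those UTF-8 bytes as ISO-8859-1 and convert them to UTF-8
# again.  Each original byte >= 0x80 becomes two bytes: double-encoding.
iconv -f iso-8859-1 -t utf-8 good.sample > bad.sample

# Undo it: converting "back" to ISO-8859-1 yields the original UTF-8.
iconv -f utf-8 -t iso-8859-1 bad.sample > fixed.sample

# cmp(1) is silent and exits 0 when the two files are identical.
cmp good.sample fixed.sample && echo "round trip restores the original"
```

The reverse conversion only works while every character in the damaged
file still fits in ISO-8859-1; otherwise iconv will complain.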

head(1) and more(1) don't disguise that.

    $ head bad
    Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
    automatiquement, nous ne pourrons pas y donner suite.
    $ more bad
    Veuillez ne pas rÃ©pondre au prÃ©sent courriel. Il a Ã©tÃ© gÃ©nÃ©rÃ©
    automatiquement, nous ne pourrons pas y donner suite.

Show the hex values of non-ASCII bytes.

    $ LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' good
    Veuillez ne pas r<c3><a9>pondre au pr<c3><a9>sent courriel. Il a <c3><a9>t<c3><a9> g<c3><a9>n<c3><a9>r<c3><a9>
    automatiquement, nous ne pourrons pas y donner suite.
    $ LC_ALL=C perl -lpe 's/[^ -~]/sprintf "<%02x>", ord($&)/ge' bad
    Veuillez ne pas r<c3><83><c2><a9>pondre au pr<c3><83><c2><a9>sent courriel. Il a <c3><83><c2><a9>t<c3><83><c2><a9> g<c3><83><c2><a9>n<c3><83><c2><a9>r<c3><83><c2><a9>
    automatiquement, nous ne pourrons pas y donner suite.
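
The <c3><83><c2> run in the second dump is a handy signature: any
character from U+00C0 to U+00FF starts with byte c3 in UTF-8, so after
being double-encoded it starts with c3 83 c2.  A rough check for that
signature, using made-up sample files and offered as a heuristic rather
than a proper detector, might look like this.

```sh
# Hypothetical samples: "généré" in UTF-8, then the same text double-encoded.
printf 'g\303\251n\303\251r\303\251\n' > good.sample
printf 'g\303\203\302\251n\303\203\302\251r\303\203\302\251\n' > bad.sample

# The three-byte signature c3 83 c2, written as octal escapes.
sig=$(printf '\303\203\302')

for f in good.sample bad.sample; do
    # LC_ALL=C makes grep match raw bytes rather than decoded characters.
    if LC_ALL=C grep -q "$sig" "$f"; then
        echo "$f: looks double-encoded"
    else
        echo "$f: no double-encoding signature"
    fi
done
```

It will miss characters outside U+00C0 to U+00FF, and could misfire on
text that legitimately contains "Ã" followed by such a character.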

Cheers, Ralph.
