[Nmh-workers] repl*comps and non-ascii characters

2008-07-17 23:01:26
I heard from Josh that nmh had good support for non-ascii
encodings these days (e.g. using iconv for rfc2047 rather than
the stupid MM_CHARSET environment variable), so I've been trying
to upgrade to nmh 1.3 and get this stuff working.

It doesn't work quite as well as I'd hoped.  scan decodes things
properly, but repl is still broken, at least by default: it
leaves RFC 2047-encoded header values encoded in the draft.
That's fine if you're just going to send the message as is, but
it would still be nice to have readable headers while editing the
draft.  Moreover, if you have non-ascii characters in the body
(e.g. the attribution), you need to encode those.  I use mh-e's
C-c RET RET (mh-mml-to-mime), which also re-encodes the headers
*which were already encoded*:

  To: =?ISO-8859-1?Q?Stefan_K=FCng?= <tortoisesvn(_at_)gmail(_dot_)com>

becomes

  To: =?us-ascii?Q?=3D=3FISO-8859-1=3FQ=3FStefan=5FK=3DFCng=3F=3D?=
   <tortoisesvn(_at_)gmail(_dot_)com>
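
To make the double encoding concrete, here is a minimal sketch of a
Q encoder in C.  It is not what mh-e or nmh actually run; the name
toy_q_encode and the choice of "safe" characters are assumptions
made for illustration, and it ignores the RFC 2047 limit on
encoded-word length.  It only shows why an already-encoded word,
treated as plain text, has its '=', '?' and '_' escaped a second
time:

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from nmh or mh-e: Q-encode `in` as one
 * us-ascii encoded-word.  Returns 0 on success, -1 if `out` is too small. */
static int toy_q_encode(const char *in, char *out, size_t outsz)
{
    static const char prefix[] = "=?us-ascii?Q?";
    static const char hex[] = "0123456789ABCDEF";
    size_t o = sizeof(prefix) - 1;

    if (outsz < sizeof(prefix) + 2)      /* prefix, "?=" and the NUL */
        return -1;
    memcpy(out, prefix, o);

    for (const unsigned char *p = (const unsigned char *)in; *p; p++) {
        if (o + 6 > outsz)               /* worst case "=XX", then "?=" and NUL */
            return -1;
        if (isalnum(*p) || strchr("!*+-/", *p)) {
            out[o++] = (char)*p;         /* safe as is */
        } else if (*p == ' ') {
            out[o++] = '_';              /* space is written as underscore */
        } else {
            out[o++] = '=';              /* everything else: '=', '?', '_', ... */
            out[o++] = hex[*p >> 4];
            out[o++] = hex[*p & 0x0F];
        }
    }
    memcpy(out + o, "?=", 3);            /* copies the terminating NUL too */
    return 0;
}
```

Feeding it the encoded display name from the To: header above
reproduces the mangled us-ascii word, which is why the header needs
to be decoded before the draft reaches the encoder.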

That's no good, so I added decode calls around all these headers
in repl*comps:

--- /usr/local/nmh/etc/replcomps        2008-07-16 00:25:50.000000000 +0000
+++ replcomps   2008-07-17 19:19:41.000000000 +0000
@@ -1,26 +1,28 @@
-%; $Header: /sources/nmh/nmh/etc/replcomps,v 1.5 2003/07/02 02:01:50 gbburkhardt Exp $
+%; based on /sources/nmh/nmh/etc/replcomps,v 1.5 2003/07/02 02:01:50 gbburkhardt Exp $
+%; modified to decode headers that may include non-ascii characters
 %;
 %; These next lines slurp in lots of addresses for To: and cc:.
 %; Use with repl -query or else you may get flooded with addresses!
 %;
 %; If no To:/cc:/Fcc: text, we output empty fields for prompter to fill in.
 %;
-%(lit)%(formataddr{reply-to})\
-%(formataddr %<{from}%(void{from})%|%(void{apparently-from})%>)\
-%(formataddr{resent-to})\
-%(formataddr{prev-resent-to})\
-%(formataddr{x-to})\
-%(formataddr{apparently-to})\
+From: Eric Gillespie <epg(_at_)pretzelnet(_dot_)org>
+%(lit)%(formataddr(decode{reply-to}))\
+%(formataddr %<{from}%(void(decode{from}))%|%(void(decode{apparently-from}))%>)\
+%(formataddr(decode{resent-to}))\
+%(formataddr(decode{prev-resent-to}))\
+%(formataddr(decode{x-to}))\
+%(formataddr(decode{apparently-to}))\
 %(void(width))%(putaddr To: )
-%(lit)%(formataddr{to})\
-%(formataddr{cc})\
-%(formataddr{x-cc})\
-%(formataddr{resent-cc})\
-%(formataddr{prev-resent-cc})\
+%(lit)%(formataddr(decode{to}))\
+%(formataddr(decode{cc}))\
+%(formataddr(decode{x-cc}))\
+%(formataddr(decode{resent-cc}))\
+%(formataddr(decode{prev-resent-cc}))\
 %(formataddr(me))\
 %(void(width))%(putaddr cc: )
 Fcc: %<{fcc}%{fcc}%|+outbox%>
-Subject: %<{subject}Re: %{subject}%>
+Subject: %<{subject}Re: %(decode{subject})%>
 %;
 %; Make References: and In-reply-to: fields for threading.
 %; Use (void), (trim) and (putstr) to eat trailing whitespace.
@@ -29,8 +31,4 @@
 %<{message-id}References: \
 %<{references}%(void{references})%(trim)%(putstr) %>\
 %(void{message-id})%(trim)%(putstr)\n%>\
-Comments: In-reply-to \
-%<{from}%(void{from})%?(void{apparently-from})%|%(void{sender})%>\
-%(trim)%(putstr)\n\
-   message dated "%<(nodate{date})%{date}%|%(tws{date})%>."
 --------

--- /usr/local/nmh/etc/replgroupcomps   2008-07-16 00:25:50.000000000 +0000
+++ replgroupcomps      2008-07-16 00:19:22.000000000 +0000
@@ -1,4 +1,5 @@
-%; replgroupcomps
+%; based on /sources/nmh/nmh/etc/replgroupcomps,v 1.4 2003/07/02 02:01:50 gbburkhardt Exp $
+%; modified to decode headers that may include non-ascii characters
 %;
 %; form (components) file for `repl -group'
 %;
@@ -20,16 +21,17 @@
 %;     cc              (and)
 %;     personal address
 %;
-%(lit)%(formataddr{mail-followup-to})\
+From: Eric Gillespie <epg(_at_)pretzelnet(_dot_)org>
+%(lit)%(formataddr(decode{mail-followup-to}))\
 %<(nonnull)%(void(width))%(putaddr To: )\n\
 %|\
-%(lit)%(formataddr %<{mail-reply-to}%?{reply-to}%?{from}%?{sender}%?{return-path}%>)\
+%(lit)%(formataddr %<(decode{mail-reply-to})%?(decode{reply-to})%?(decode{from})%?(decode{sender})%?(decode{return-path})%>)\
 %<(nonnull)%(void(width))%(putaddr To: )\n%>\
-%(lit)%(formataddr{to})%(formataddr{cc})%(formataddr(me))\
+%(lit)%(formataddr(decode{to}))%(formataddr(decode{cc}))%(formataddr(me))\
 %<(nonnull)%(void(width))%(putaddr cc: )\n%>%>\
 %;
 Fcc: %<{fcc}%{fcc}%|+outbox%>
-Subject: %<{subject}Re: %{subject}%>
+Subject: %<{subject}Re: %(decode{subject})%>
 %;
 %; Make References: and In-reply-to: fields for threading.
 %; Use (void), (trim) and (putstr) to eat trailing whitespace.
@@ -38,8 +40,4 @@
 %<{message-id}References: \
 %<{references}%(void{references})%(trim)%(putstr) %>\
 %(void{message-id})%(trim)%(putstr)\n%>\
-Comments: In-reply-to \
-%<{from}%(void{from})%?(void{apparently-from})%|%(void{sender})%>\
-%(trim)%(putstr)\n\
-   message dated "%<(nodate{date})%{date}%|%(tws{date})%>."
 --------

Here's what breaks, in replgroupcomps:

%(lit)%(formataddr %<(decode{mail-reply-to})%?(decode{reply-to})%?(decode{from})%?(decode{sender})%?(decode{return-path})%>)\

The FT_LS_DECODE case in sbr/fmt_scan.c:fmt_scan runs for each of
those fields: if the field is present, it calls decode_rfc2047,
and if that returns non-zero it points str at the decoded text in
buffer2.  Next, the FT_IF_V_NE case runs for each field, and it
"skips" (?) fmt ahead because value == fmt->f_un.f_u_value:

(gdb) p fmt->f_un.f_u_value
$12 = 0
(gdb) p value
$13 = 0

So, no matter what the contents of any of those fields were (if
any), I always end up with the last one (return-path) in the To:
of my draft.
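
As a rough illustration of that behaviour, here is a toy model in C.
It is not nmh code: toy_chain, its two local "registers", and the
decode_sets_value switch are all hypothetical, and it assumes (as the
gdb output suggests) that the test guarding each branch only looks at
a numeric value that starts at 0.  The switch models one possible
variant in which the decode step also marks value non-zero when the
field is present:

```c
#include <stddef.h>

/* Toy model, not nmh code: returns what the string register holds after
 * walking a chain shaped like "%<(decode{a})%?(decode{b})...%>".
 * fields[i] is NULL when header i is absent from the message. */
static const char *toy_chain(const char *const fields[], size_t n,
                             int decode_sets_value)
{
    const char *str = NULL; /* string register */
    int value = 0;          /* numeric register, starts at 0 */

    for (size_t i = 0; i < n; i++) {
        str = fields[i];                 /* load + decode: only str changes */
        if (str != NULL && decode_sets_value)
            value = 1;                   /* hypothetical variant */
        if (value != 0)                  /* numeric test guarding this branch */
            break;                       /* branch taken: chain stops here */
    }
    return str;
}
```

With fields { NULL, "reply-to", "from", NULL, "return-path" } the
model returns "return-path" when decode_sets_value is 0 and
"reply-to" when it is 1.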

So, I tried setting value in FT_LS_DECODE if the field was
present.  With that change the chain stops at the first field
found, just as it did before I added the decode calls.  Is this
the right fix, though?  This stuff is rather obscure...

--- ./sbr/fmt_scan.c.~1~        2008-04-05 18:41:37.000000000 +0000
+++ ./sbr/fmt_scan.c    2008-07-16 00:25:37.000000000 +0000
@@ -489,8 +489,11 @@ fmt_scan (struct format *format, char *s
            break;
 
        case FT_LS_DECODE:
-           if (str && decode_rfc2047(str, buffer2, sizeof(buffer2)))
-               str = buffer2;
+           if (str) {
+               if (decode_rfc2047(str, buffer2, sizeof(buffer2)))
+                   str = buffer2;
+               value = 1;
+           }
            break;
 
        case FT_LS_TRIM:

