ietf-822

Re: Getting RFC 2047 encoding right

2003-12-11 11:09:06

My decode2047() was liberal, and my encode2047() was conservative, and I found myself in a position where the code would nevertheless generate monstrosities like "Re: =?latin_1?q?=80?=" (which should be "Re: =?iso-8859-15?q?=A4?=" but can be decoded).

I don't think I understand what you are saying. Would you really use "latin_1" as a charset name? Given that it's nonstandard, how could that be conservative?

I assume the 0x80 is a non-break-space? Other than using a nonstandard charset, what is it that makes
=?latin_1?q?=80?= a monstrosity?
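
For reference, the text part of that encoded-word decodes to a single octet, 0x80; what that octet means depends entirely on the charset label. A minimal sketch of a liberal "Q" decoder, assuming the RFC 2047 section 4.2 rules ("=XX" is a hex-encoded octet, "_" is a space); the names hex_val and q_decode are my own for illustration, not from any existing library:

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: value of one hex digit, or -1 if it is not one. */
static int hex_val(int c)
{
    if (c >= '0' && c <= '9')
        return c - '0';
    c = toupper(c);                 /* liberal: accept lowercase hex too */
    if (c >= 'A' && c <= 'F')
        return c - 'A' + 10;
    return -1;
}

/* Decode the text part of a "Q" encoded-word into raw octets.
   Returns the number of octets written, or -1 on a malformed "=XX"
   escape or if the output buffer is too small.  No charset
   conversion is attempted here. */
static long q_decode(const char *in, unsigned char *out, size_t outlen)
{
    size_t n = 0;

    while (*in != '\0') {
        unsigned char b;

        if (*in == '=') {
            int hi = hex_val((unsigned char) in[1]);
            int lo = hi < 0 ? -1 : hex_val((unsigned char) in[2]);
            if (lo < 0)
                return -1;          /* truncated or non-hex escape */
            b = (unsigned char) (hi * 16 + lo);
            in += 3;
        } else if (*in == '_') {
            b = 0x20;               /* "_" stands for a space */
            in++;
        } else {
            b = (unsigned char) *in++;
        }
        if (n >= outlen)
            return -1;
        out[n++] = b;
    }
    return (long) n;
}
```

Calling q_decode ("=80", buf, sizeof buf) stores the one octet 0x80 in buf; interpreting it is then up to whatever the charset name maps to.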

And how does this relate to the problem of not changing encoded-words from the subject message?

After spending a few hours trying to rewrite the decoder such that I could know whether the raw text was something a conservative encoder could make, I decided that the result wasn't worth the complexity.

Why would the decoder need to care?

To me this seems fairly simple:

struct msg *newmsg, *savmsg;

/* initialize two message structures: one for the message to be edited,
   another to save decoded fields from the message being replied to */

newmsg = new_message ();
savmsg = new_message ();

/* keep a separate copy, so that editing newmsg cannot change savmsg */
newmsg->subject = prepend_Re (decode_2047 (origmsg->subject));
savmsg->subject = strdup (newmsg->subject);
...
edit_message (newmsg);
...
/* now that the message has been edited, encode any header fields
   from the edited message that contain non-ASCII chars;
   if the user didn't change the field, use the original encoding */

if (strcmp (savmsg->subject, newmsg->subject) == 0)
        newmsg->subject = prepend_Re (origmsg->subject);
else
        newmsg->subject = encode_2047 (newmsg->subject);
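
To make the final comparison concrete, here is a self-contained sketch of just that decision. Everything in it is a stand-in of my own: reply_subject, prepend_re, and toy_encode are hypothetical names, and toy_encode merely copies its input (which is only correct for pure ASCII text), so a real RFC 2047 encoder would have to be passed in its place.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoder signature: takes decoded text, returns a
   malloc'd wire-format string (or NULL on allocation failure). */
typedef char *(*encode_fn) (const char *decoded);

/* Returns a malloc'd copy of s with "Re: " in front; caller frees. */
static char *prepend_re(const char *s)
{
    size_t len = strlen(s);
    char *r = malloc(len + 5);      /* "Re: " + s + terminating NUL */

    if (r == NULL)
        return NULL;
    memcpy(r, "Re: ", 4);
    memcpy(r + 4, s, len + 1);
    return r;
}

/* Toy stand-in for a real encoder: copies its input unchanged.
   That is only valid when the text is pure ASCII. */
static char *toy_encode(const char *decoded)
{
    size_t len = strlen(decoded) + 1;
    char *r = malloc(len);

    if (r != NULL)
        memcpy(r, decoded, len);
    return r;
}

/* Choose the wire form of the reply's subject.
   orig_raw: the subject exactly as received, encoded-words intact
   saved:    the decoded "Re: ..." text as it was before editing
   edited:   the decoded text after the user has edited it
   If the user changed nothing, the sender's original encoded-words
   are reused verbatim; otherwise the edited text is re-encoded. */
static char *reply_subject(const char *orig_raw, const char *saved,
                           const char *edited, encode_fn encode)
{
    if (strcmp(saved, edited) == 0)
        return prepend_re(orig_raw);
    return encode(edited);
}
```

With an untouched subject, reply_subject ("=?latin_1?q?=80?=", saved, saved, toy_encode) hands back the sender's encoded-word with "Re: " in front, so the decoder never has to know whether a conservative encoder could have produced it.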

        
what am I missing?