procmail

Re: stripping "quoted printable"

1998-08-04 00:08:02
On Mon, 03 Aug 1998 14:14:19 -0700, Bill Houle
<bhoule(_at_)conveyanced(_dot_)com> wrote:
To the best of my recollection, Eudora has always handled quoted-printable
mail just fine. However, I (and other Eudora users on this one particular
list, so it's not just me) have recently had a high occurrence of
quoted-printable mail coming from Microsoft Exchange that the program
totally chokes on. (The error msg claims the text is "corrupted")
<...>
My real question is what action do I use in this rule once I've found a
candidate? Is there an easy way to strip these characters out? 
Exchange server, using the new name.=A0 My office has been using =
Exchange
for some time using the old company (organization) name.=A0=20
=A0=20                                                              
Not knowing too much about QP encoding, is it just a matter of looking for
'=..' triplets?

Additionally, QP wraps overlong lines and marks the wrap with a =
character. The current spec is in RFC2045; it shouldn't take long to
locate and read. 

Here's a quick and dirty Perl snippet to decode QP:

    perl -pe 's/=\n$//; s/=([0-9a-f][0-9a-f])/sprintf "%c", hex($1)/ige'
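
For comparison, roughly the same decoding can be sketched in Python with
the standard quopri module. The decode_qp name is just mine, and the
sample is lifted from the quoted message above; take it as an
illustration, not as a substitute for a real MIME parser.

```python
import quopri


def decode_qp(raw: bytes) -> bytes:
    """Decode quoted-printable bytes: =XX escapes and '=' soft line breaks."""
    return quopri.decodestring(raw)


# Sample taken from the quoted Exchange message.
sample = b"using the new name.=A0 My office has been using =\nExchange\n"
print(decode_qp(sample))
```

The result is raw bytes; what character set they are in is a separate
question, answered by the charset parameter of the Content-Type header.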

Of course, you should also take care to fix the MIME headers to
indicate the new encoding ("Content-Transfer-Encoding: 8bit", I
suppose, unless the contents are funny -- specifically, stuff with
lines longer than 998 octets should be C-T-E: binary.)
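
A minimal sketch of that header fix-up, again in Python, might look
like the following. It assumes a single-part message with LF line
endings and an unfolded Content-Transfer-Encoding header; the
qp_to_8bit name and the regex are my own, and multipart messages would
need a proper MIME parser instead.

```python
import quopri
import re

# Matches an unfolded "Content-Transfer-Encoding: quoted-printable" header line.
CTE_QP = re.compile(
    rb"^Content-Transfer-Encoding:[ \t]*quoted-printable[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)

MAX_8BIT_LINE = 998  # RFC 2045 line length limit for 7bit/8bit data


def qp_to_8bit(raw: bytes) -> bytes:
    """Decode a QP body and relabel the message as 8bit (or binary)."""
    # Headers end at the first blank line (assumes LF line endings).
    head, sep, body = raw.partition(b"\n\n")
    if not sep or not CTE_QP.search(head):
        return raw  # not labelled QP: leave the message alone
    decoded = quopri.decodestring(body)
    # Decoded lines that are too long for 8bit have to be labelled binary.
    too_long = any(len(line) > MAX_8BIT_LINE for line in decoded.split(b"\n"))
    new_cte = b"binary" if too_long else b"8bit"
    head = CTE_QP.sub(b"Content-Transfer-Encoding: " + new_cte, head, count=1)
    return head + sep + decoded
```

Messages that are not labelled quoted-printable pass through untouched,
which seems the safer default in a mail filter.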

The proper way to do it is to install the Perl MIME module, or the
mmencode program from the metamail distribution. (It's a good one to
have anyway; it handles base64, too.)

Metamail is available from <ftp://thumper.bellcore.com/pub/nsb/mm2.7.tar.Z>

However, if the incoming QP is broken (as one might rightfully guess,
if the sender is Microsoft Exchange), it might not be possible to just
decode it yourself, or at least, you should expect to see imperfect
results. Perhaps you should post a copy of a broken message in
comp.mail.mime and ask for comments there.
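
Before decoding, it may help to check whether the body is well-formed
QP at all. The sketch below flags every "=" that is followed neither
by two hex digits nor by a line break. The malformed_escapes name and
the regex are my own assumptions, and accepting lower-case hex is a
leniency that RFC 2045 itself does not grant to encoders.

```python
import re

# A '=' must be followed by two hex digits or by a line break (soft wrap).
BAD_ESCAPE = re.compile(rb"=(?![0-9A-Fa-f]{2})(?!\r?\n)")


def malformed_escapes(body: bytes) -> list:
    """Return the byte offsets of '=' characters that are not valid QP."""
    return [m.start() for m in BAD_ESCAPE.finditer(body)]
```

An empty list does not prove the sender encoded things sensibly, but a
non-empty one is a good hint that the decoded result will be imperfect.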

FWIW, the =A0 sequences are hard (non-breaking) spaces in ISO-8859-1
(&nbsp; in HTML) -- don't ask me why all these moronic WYSIWYG mail
editors insist on inserting them everywhere.
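
If the goal is simply to get rid of the hard spaces once the body has
been decoded, something like this should do. It assumes the text
really is ISO-8859-1, and soften_hard_spaces is a made-up name.

```python
def soften_hard_spaces(decoded: bytes, charset: str = "iso-8859-1") -> str:
    """Turn decoded body bytes into text, replacing hard spaces with plain ones."""
    text = decoded.decode(charset)
    # U+00A0 is the non-breaking space that =A0 stands for in ISO-8859-1.
    return text.replace("\u00a0", " ")
```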

Does Eudora specifically choke on stuff that has originally been
written in QP, then quoted with leading wedge characters? I think I've
seen some bozotic mail programs which would allow you to enter 8-bit
characters but then deduce from (quoted) =20:s in the contents that a
message was QP-encoded after all (or something like that) and thus
produce 8-bit messages with Content-Transfer-Encoding saying it's QP.
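
That kind of mislabelling is at least easy to spot mechanically, since
properly encoded QP is pure 7-bit ASCII. A rough check, with a made-up
function name and assuming the headers are available as an unfolded
string, could be:

```python
import re

CLAIMS_QP = re.compile(
    r"^Content-Transfer-Encoding:[ \t]*quoted-printable[ \t]*$",
    re.IGNORECASE | re.MULTILINE,
)


def looks_mislabeled_qp(headers: str, body: bytes) -> bool:
    """True if the headers claim QP but the body contains raw 8-bit bytes."""
    claims_qp = CLAIMS_QP.search(headers) is not None
    has_8bit = any(byte > 0x7F for byte in body)
    return claims_qp and has_8bit
```

This only detects the symptom; what to do with such a message (pass it
through unchanged, or try to decode it anyway) is a policy decision.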

/* era */

-- 
 Paparazzi of the Net: No matter what you do to protect your privacy,
  they'll hunt you down and spam you. <http://www.iki.fi/~era/spam/>
