procmail

Re: Special characters

2001-05-30 20:11:27
* Cobrazul <cobrazul(_at_)uol(_dot_)com(_dot_)br> [010530 17:36] wrote:

> Hi,
>
> How can I avoid this? (subject line)
>
> From someone(_at_)somewhere(_dot_)net  Tue May 29 22:17:43 2001
>  Subject: Re: Ol=?ISO-8859-1?B?4Q==?=, John
>   Folder: /home/user/messages.pl
>
> Or, how can I convert the ISO string back to original form?

This requires either an external program that understands the encoding,
or a recursive INCLUDERC and some more-or-less standard utilities
(mimencode, formail, print or echo, and perl) to iterate through the
encoded headers. Before moving to a mail client (mutt) that does the
decoding and displays the result properly, I used a homegrown INCLUDERC,
shown below. The step that disarms control characters protects against
low-order characters which might do odd things to your screen, or
worse; they are replaced by '?' or any other character of your choice.
I use 'print -n', but you may use 'echo' together with whatever option
suppresses the trailing newline on your system.
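
For anyone who would rather hand the job to an external program, here is
a minimal Python sketch of the two steps the recipe performs with
mimencode and perl: decode the encoded word, then replace control
characters (all but tab) with '?'. The function names are made up for
illustration and are not part of procmail or of the recipe. Unlike the
recipe, the sketch also converts the decoded bytes from the named
charset to text.

```python
import base64
import quopri
import re

# One RFC 2047 encoded word: =?charset?encoding?text?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?([bBqQ])\?([^?]+)\?=")


def decode_word(match):
    """Decode a single encoded word (the job mimencode -u does in the recipe)."""
    charset = match.group(1)
    encoding = match.group(2).lower()
    code = match.group(3)
    if encoding == "b":
        raw = base64.b64decode(code)
    else:
        # header=True makes "_" decode to a space, as the Q encoding requires
        raw = quopri.decodestring(code.encode("ascii"), header=True)
    return raw.decode(charset, errors="replace")


def disarm(text, replacement="?"):
    """Replace control characters other than tab, like the perl filter."""
    return re.sub(r"[\x00-\x08\x0a-\x1f]", replacement, text)


def decode_header_value(value):
    """Decode every encoded word in a header value, then disarm the result."""
    return disarm(ENCODED_WORD.sub(decode_word, value))
```

Called as decode_header_value("Re: Ol=?ISO-8859-1?B?4Q==?=, John"), this
returns "Re: Olá, John".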

Why does it need a recursive INCLUDERC? It needs one because there can
be more than one encoded header (even more than one Subject: header). If
you are happy just to handle the first Subject: header you can do it
without the recursion.

This adds an 'X-Munged:' header describing each conversion performed.
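
To make the iteration concrete, here is a rough Python sketch of the
same idea: decode one encoded word per pass, repeat until none is left,
and record a note for each conversion. It uses a loop where the recipe
recurses. The names decode_once and decode_all are hypothetical, the
note text is only modeled on the X-Munged: header, and the charset
conversion is an extra the recipe itself does not perform.

```python
import base64
import quopri
import re

# Groups: header name, leading text, charset, encoding, code, trailing text
ENCODED_HEADER = re.compile(
    r"^([^\s:]+:[ \t]+)(.*?)=\?([^?]+)\?([bq])\?([^?]+)\?=(.*)$",
    re.IGNORECASE,
)


def decode_once(line):
    """Decode the first encoded word in one header line.

    Returns (new_line, note), or None when there is nothing to decode.
    """
    match = ENCODED_HEADER.match(line)
    if match is None:
        return None
    name, before, charset, encoding, code, after = match.groups()
    encoding = encoding.lower()
    if encoding == "b":
        raw = base64.b64decode(code)
    else:
        raw = quopri.decodestring(code.encode("ascii"), header=True)
    text = raw.decode(charset, errors="replace")
    note = f"X-Munged: {name}converted from {encoding}"
    return name + before + text + after, note


def decode_all(header_lines):
    """Decode every encoded word in every header; append one note per conversion."""
    decoded = []
    notes = []
    for line in header_lines:
        result = decode_once(line)
        while result is not None:  # a header may hold several encoded words
            line, note = result
            notes.append(note)
            result = decode_once(line)
        decoded.append(line)
    return decoded + notes
```

Handling only the first Subject: header would amount to calling
decode_once a single time and skipping the loop.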


There may well be more efficient ways to do this. This was in regular
use until I moved from elm to mutt on 2000/1/1. I don't remember why I
added the guard against un-encoded =?; probably I was just being careful.

As usual, [    ] is a space and a tab in brackets.


In the main rc file, you need the following five lines:

  mimehdr_=[^o][^l].*:.*=\\?(iso-8859-1|utf-8)\\?[bq]\\?[^?]+\\?.*
  bq=BbB?bQqQ?q               ## upper case to lower case converter string
  :0                          ## if there are any ISO-encoded headers
  * $ ^\/$mimehdr_
  { INCLUDERC=$pmrc/convertISO.rc }
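
To see what the mimehdr_ pattern lets through, here is an approximate
Python translation. It is a sketch that assumes procmail's default
case-insensitive matching, with ^ anchored at the start of each header
line. The leading [^o][^l] presumably skips the Old- copies that
formail -i leaves behind; as a side effect it also skips any header
whose first letter is o or whose second letter is l. The helper name
first_encoded_header is hypothetical.

```python
import re

# Approximation of the procmail pattern, not an exact port of its regex engine.
MIMEHDR = re.compile(
    r"^[^o][^l].*:.*=\?(iso-8859-1|utf-8)\?[bq]\?[^?]+\?.*",
    re.IGNORECASE | re.MULTILINE,
)


def first_encoded_header(headers):
    """Return the first header line the pattern accepts, or None."""
    match = MIMEHDR.search(headers)
    return match.group(0) if match else None


sample = (
    "X-Mailer: example\n"
    "Old-Subject: Re: Ol=?ISO-8859-1?B?4Q==?=, John\n"
    "Subject: Re: Ol=?ISO-8859-1?B?4Q==?=, John\n"
)
# The Old-Subject: line is skipped; the Subject: line is what gets matched.
matched = first_encoded_header(sample)
```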

And in file $pmrc/convertISO.rc, the following:

## save MATCH, reset temporary variables
  hdrtxt=$MATCH
  hdr txtb txte code
## extract header name (with trailing space) for regeneration and reporting
  :0
  * MATCH ?? ^^\/[^:]+:[        ]
  { hdr=$MATCH }
## guard against non-coded =? in header by dropping the trailing
## three characters preceding the encoded part
  :0
  * hdrtxt ?? ^^[^      ]+:[    ]+\/.*=\?[iu]
  * MATCH ?? ()\/.+[^iu]
  * MATCH ?? ()\/.+[^?]
  * MATCH ?? ()\/.+[^=]
  { txtb=$MATCH }
## grab trailing plain text (which may be empty)
  :0
  * $ hdrtxt ?? $\txtb=\?.+\?.\?.+\?=\/.*
  { txte=$MATCH }
## grab encoded text, never empty
  :0
  * $ hdrtxt ?? $\txtb=\?.+\?.\?\/[^?]+
  { code=$MATCH }
## grab encoding type and translate it to lower case and rewrite
## header with decoded text, disarm control characters, and report
  :0 f h w
  * hdrtxt ?? \?\/[bq]\?
  * $ bq ?? $\MATCH\/.
  | formail -i "$hdr$txtb$(print -n $code | mimencode \
    -u -p -$MATCH | perl -pe 's/[\00-\010\012-\037]/?/g')$txte" \
          -A "X-Munged: ${hdr}converted from $MATCH"
## look for more, if found, recurse
  :0 a
  * $ ^\/$mimehdr_
  { INCLUDERC=$_ }





-- 
Rik Kabel      old enough to be an adult      rik(_at_)panix(_dot_)com
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
