
Solved for: mime-encoded iso-8859-1, utf-8, and windows-1251 was: Re: Create a rule for utf8 subject

2015-10-08 09:34:30
On Tue, 6 Oct 2015, Jostein Berntsen wrote:

You can convert the Subject with "perl -MEncode". Check the approved answer
here:

http://stackoverflow.com/questions/29715013/decode-the-utf8-to-iso-8859-1-mail-subject-to-text-in-procmailrc-file

Thank you for a wonderful answer and trail-head.  I declare 
this to be the best post of the month ;)

I played with the sample, which handled only one format, and 
extended it to do some logging into the headers of the 
message. It is working fine.

The code extract below handles iso-8859-1, utf-8, and windows-1251.

Email is coming into a plain-Jane CentOS 6 host with no 
third-party add-on archive packages.

The result:

Subject: Thank you for your Gander Mountain 
        =?ISO-8859-1?B?TWFzdGVyQ2Fy?=
         =?ISO-8859-1?B?ZK4=?= or Gander Mountain World
         =?ISO-8859-1?B?TWFzdGVyQ2FyZK4=?= account payment
MIME-Version: 1.0
Content-Type: multipart/alternative;
  boundary="ABCD-1cbe41d68layfovcia3nhliiaaaaacft4ipodilt5siyaaaaa-EFGH"
X-mime-decode: iso-8859-1 old Subject:  Thank you for your Gander Mountain
    =?ISO-8859-1?B?TWFzdGVyQ2Fy?=  =?ISO-8859-1?B?ZK4=?= or Gander Mountain
    World  =?ISO-8859-1?B?TWFzdGVyQ2FyZK4=?= account payment


I assume the encoded words hold some non-translatable graphic 
or typographic elements.  My goal was to get 'flat text' for 
easier procmail parsing in later recipes, and that is met.
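
The encoded words in the sample Subject can be inspected by hand to see 
what is in them. This is only a minimal sketch: it assumes a `base64` 
that accepts `-d` (GNU coreutils does) and an `iconv` that knows 
ISO-8859-1, and `decode_b64_latin1` is a made-up helper name, not 
something procmail or formail provides.

```shell
# Hypothetical helper: decode the base64 payload of an ISO-8859-1
# encoded-word (the part between "?B?" and "?=") and print it as UTF-8.
decode_b64_latin1() {
    printf '%s' "$1" | base64 -d | iconv -f ISO-8859-1 -t UTF-8
}

# Payload taken from the sample Subject above.
decode_b64_latin1 'TWFzdGVyQ2FyZK4='; echo
# prints: MasterCard®
```

The last byte of that payload is 0xAE, which is the registered sign in 
ISO-8859-1, so it has no plain ASCII equivalent.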

I've been using procmail for probably fifteen years, and have 
a rather complex set of rules by this time:

[herrold@charles ~]$ wc -l .procmailrc ./.procmail/*rc | tail -n 1
 20701 total


from the recipe:

#       Decode the utf8 to ISO-8859-1 mail subject to text
#       
# http://stackoverflow.com/questions/29715013/decode-the-utf8-to-iso-8859-1-mail-subject-to-text-in-procmailrc-file
# Store a "potentially encoded" Subject: into SUBJECT
#       RPH: split to two phases; scan Subject and on hit mark
#               and save old; then amend
:0 f
* ^Subject:.*=\?iso-8859-1
* ^Subject:.*\/.*
        | formail -A "X-mime-decode: iso-8859-1 old Subject: $MATCH "

:0 h
* ^X-mime-decode: iso-8859-1
SUBJECT=| formail -cXSubject: | perl -MEncode -pe '$_=encode("iso-8859-1",decode("MIME-Header",$_))'
##      the SUBJECT= action must sit on one line; undo any line wrap
##      that mail transit adds here or in the two cases below

##      RPH try to extend
:0 f
* ^Subject:.*=\?utf-8
* ^Subject:.*\/.*
        | formail -A "X-mime-decode: utf-8 old Subject: $MATCH "

:0 h
* ^X-mime-decode: utf-8
SUBJECT=| formail -cXSubject: | perl -MEncode -pe '$_=encode("utf-8",decode("MIME-Header",$_))'

:0 f
* ^Subject:.*=\?windows-1251
* ^Subject:.*\/.*
        | formail -A "X-mime-decode: windows-1251 old Subject: $MATCH "

:0 h
* ^X-mime-decode: windows-1251
SUBJECT=| formail -cXSubject: | perl -MEncode -pe '$_=encode("windows-1251",decode("MIME-Header",$_))'

# Store all remaining cases of Subject: into SUBJECT
:0 hE
SUBJECT=| formail -cXSubject:
#
##################################################
#

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)de
http://mailman.rwth-aachen.de/mailman/listinfo/procmail
