procmail

Re: Filter on Scandinavian characters in subject

2018-09-19 15:10:46


On 14 Sep 2018, at 09:57, Jostein Berntsen <jbernts(_at_)broadpark(_dot_)no> 
wrote:

On 14.09.18,17:34, Andreas Schamanek wrote:

On Fri, 14 Sep 2018, at 15:40, Jostein Berntsen wrote:

When I use recipes like this one to filter messages with Scandinavian
characters (æ, ø, å) in the Subject, they fail to match. My locale is
nb_NO.UTF-8. Is there a recipe that can be used to match these cases?

:0
* ^Subject:.*lån
innboks/IN-spam/

A proper message must have such characters encoded. Look at the source of
the messages. You will see something like this (for "lån"):

 =?UTF-8?B?bMOlbg==?=

When you match against this (mind the ? characters and escape them as \?),
it should work.
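
To find the encoded form of a word without digging through message sources, something like the following Python sketch may help. It assumes the sender used UTF-8 with base64 ("B") encoding, as in the example above; the function names `encoded_word` and `procmail_condition` are made up for illustration.

```python
import base64


def encoded_word(text, charset="UTF-8"):
    """Return text as a single base64 ("B") encoded-word."""
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?{}?B?{}?=".format(charset, payload)


def procmail_condition(text, charset="UTF-8"):
    """Build a procmail condition line matching the start of the encoded text.

    Assumption: the subject starts with exactly this text. The "=" padding
    is dropped and "+" is escaped because it is a regexp metacharacter.
    """
    payload = base64.b64encode(text.encode(charset)).decode("ascii")
    payload = payload.rstrip("=").replace("+", r"\+")
    return r"* ^Subject:.*=\?" + charset + r"\?B\?" + payload


if __name__ == "__main__":
    # "l\u00e5n" is "lån", written with an escape so the file stays ASCII.
    print(encoded_word("l\u00e5n"))
    print(procmail_condition("l\u00e5n"))
```

One caveat, as far as I can tell: base64 works on groups of three bytes, so the encoded form of a word changes with the text around it. A subject such as "Nytt lån" does not contain the string bMOlbg at all, so a pattern built this way is only reliable when the subject starts with exactly that word.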


Thanks. I solved it doing this:

:0 h
* ^Subject:.*=\?
SUBJECT=| formail -cXSubject: | perl -MEncode -ne 'print encode("UTF8",decode("MIME-Header",$_))'
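
For comparison, the decoding step that the Perl one-liner performs can be sketched in Python with the standard library's `email.header` module. This is only an illustration, not a drop-in replacement; the function name `decode_subject` is hypothetical, and it expects the header value without the leading "Subject:".

```python
from email.header import decode_header, make_header


def decode_subject(value):
    """Decode MIME encoded-words in a header value into a Python str.

    Handles both base64 ("B") and quoted-printable ("Q") encoded-words and
    whatever charset each one declares.
    """
    return str(make_header(decode_header(value)))


if __name__ == "__main__":
    # ascii() keeps the output independent of the terminal's encoding.
    print(ascii(decode_subject("=?UTF-8?B?bMOlbg==?=")))
    print(ascii(decode_subject("=?ISO-8859-1?Q?l=E5n?=")))
```

Both inputs decode to the same text, which is the appeal of matching on a decoded copy: one pattern covers every encoding a sender might pick. The sketch only returns the decoded string and does not touch the message itself.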

By rewriting the message to include UTF-8 characters in the headers, you have 
just made your message invalid: mail headers can only contain 7-bit ASCII, 
and anything else must be encoded.

However, it's your mail, do as you will. You *will* have issues if you try to 
do something else with those messages, ever. Like, for example, import them 
into a different client. Or put them on an IMAP server.

Something for the manual maybe? 

No. Andreas gave you the right solution: match against the encoded text in the 
subject.

:0
* ^Subject:.*=\?UTF-8\?B\?bMOlbg

{ do stuff }

Or, save your UTF-8 decoded subject into a variable like UTFSUB=| formail…



-- 
Space Directive 723: Terraformers are expressly forbidden from
recreating Swindon.
____________________________________________________________
procmail mailing list -- procmail(_at_)lists(_dot_)rwth-aachen(_dot_)de   
Procmail homepage: http://www.procmail.org/
To unsubscribe send an email to 
procmail-leave(_at_)lists(_dot_)rwth-aachen(_dot_)de
https://lists.rwth-aachen.de/postorius/lists/procmail.lists.rwth-aachen.de