procmail

Re: Which characters to quote ?

2005-04-04 08:21:10
On Apr 4, 2005 7:40 AM, Ruud H.G. van Tol <rvtol(_at_)isolution(_dot_)nl> wrote:

As David already mentioned, procmail is 8-bit clean.

Whether that includes zero bytes too, I have never tested.

Procmail is 8-bit clean, but it doesn't know about wide (multibyte)
characters.  If the subject is in a multibyte encoding of Unicode such
as UTF-8, then the "." regex metacharacter will match only the first
byte of any multibyte character that is present.
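
To make that concrete, here is a rough sketch in Python rather than
procmail. It assumes that Python's "re" module applied to bytes is a
fair stand-in for a byte-oriented matcher; it is not procmail's own
regex engine, and the sample subject is made up.

```python
import re

# Hypothetical subject text; "\u00e9" (e with acute) is two bytes in UTF-8.
subject = "caf\u00e9!".encode("utf-8")      # b'caf\xc3\xa9!'

# In a bytes pattern, "." consumes exactly one byte.
one_dot = re.compile(rb"caf.!")
two_dots = re.compile(rb"caf..!")

# One dot covers only 0xC3, so the following "!" fails against 0xA9.
one_dot_matches = one_dot.search(subject) is not None
# Two dots cover both bytes of the character.
two_dots_matches = two_dots.search(subject) is not None

print(one_dot_matches, two_dots_matches)
```

Under that assumption, a pattern written with one "." per visible
character falls short by one byte for every two-byte character.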

This also means that if the .procmailrc file is edited in an ISO
single-byte character set but the subject is in a Unicode encoding
such as UTF-8, or vice versa, then characters that look the same on
the screen may not match in a regex comparison.  Procmail does raw
byte matching, not locale-aware character equivalence.
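
A small illustration of the mismatch, again in Python and not in
procmail itself. The word and the two encodings are assumptions chosen
for the example: the pattern stands for text typed into an rc file
saved as ISO-8859-1, the subject for the same text arriving as UTF-8.

```python
word = "Caf\u00e9"  # looks identical on screen in both cases

rc_pattern = word.encode("iso-8859-1")   # b'Caf\xe9'      (one byte for the accent)
subject = word.encode("utf-8")           # b'Caf\xc3\xa9'  (two bytes for the accent)

# Raw byte comparison: the pattern's bytes never occur in the subject.
raw_match = rc_pattern in subject

# Only after decoding each side with its own charset do they compare equal.
decoded_match = rc_pattern.decode("iso-8859-1") == subject.decode("utf-8")

print(raw_match, decoded_match)
```

The decoding step is what a byte matcher does not do, so the rc file
and the incoming mail have to agree on the encoding for such a
comparison to succeed.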

Character classes have a further problem: in [xyz], if y were a wide
character, the class would consist of x, z, and each of the 2 to 4
bytes that make up y, treated as separate alternatives rather than as
a unit.
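
The same effect can be sketched with a byte-oriented class in Python,
under the same assumption that "re" on bytes behaves like a byte
matcher (this is not procmail's engine). The class and the test
characters are invented for the example.

```python
import re

# The class as typed in a UTF-8 editor: x, e-acute, z.
# As bytes it is [x\xc3\xa9z], i.e. four separate one-byte alternatives.
cls = re.compile("[x\u00e9z]".encode("utf-8"))

e_acute = "\u00e9".encode("utf-8")   # b'\xc3\xa9'
a_tilde = "\u00e3".encode("utf-8")   # b'\xc3\xa3', a different character

# The class cannot match the two-byte character as a whole...
whole_char = cls.fullmatch(e_acute)
# ...it matches only its first byte.
first_byte = cls.match(e_acute)
# A different character that shares the 0xC3 lead byte also hits the class.
false_positive = cls.search(a_tilde)

print(whole_char, first_byte.group(), false_positive.group())
```

So besides failing to match the character as a unit, such a class can
also match unrelated characters that happen to share one of its bytes.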

____________________________________________________________
procmail mailing list   Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail