ietf-mta-filters

Re: "body" extension

2002-06-14 14:10:51

Sorry about that--I'm just getting used to the fact that ctrl-return
sends messages in this mailer...

On Fri, 2002-06-14 at 11:39, Simon Josefsson wrote:

Jutta Degener <jutta(_at_)sendmail(_dot_)com> writes:

Below is a strawman Internet draft that describes the easiest,
most general form of "body" I could imagine -- a simple match
against the text of an e-mail message that is not the header,
without content-decoding of any sort.

A future extension of this could be to even MIME decode data before
matching it as well -- international users need this to match
non-ASCII in bodies.

So anyway, speaking only for myself:

I've implemented that in a different filtering system, and it turns out
to be kind of useful.  But the gotchas are severe.

(1) people will try to use it to parse MIME.
(2) people will complain that it doesn't do i18n.
(3) it's useless outside of ASCII, because the messages are invariably
not in Unicode, but the scripts are.

Doing this right (charset decoding with comparators) is hard, but is the
right thing.  The quick and dirty body extension--especially if it uses
the "body" identifier for labeling--will eventually, if not immediately,
have to be obsoleted by a more general and complete mechanism.

I've seen several mailers recently that send perfectly reasonable 7bit
parts as base64 as well.
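
To make the base64 point concrete, here is a minimal sketch in Python using the standard email package. The message, the addresses and the helper names raw_body and decoded_body are made up for illustration and are not from the draft; it only shows that an undecoded match misses text that a decoded match finds.

```python
import base64
from email import message_from_string

# Hypothetical message: plain ASCII text sent as base64, as some mailers do.
payload = base64.b64encode(b"Please review the attached invoice.\r\n").decode("ascii")
raw_message = (
    "From: sender@example.com\r\n"
    "Subject: test\r\n"
    "MIME-Version: 1.0\r\n"
    "Content-Type: text/plain; charset=us-ascii\r\n"
    "Content-Transfer-Encoding: base64\r\n"
    "\r\n"
    + payload + "\r\n"
)

def raw_body(message_text):
    # Everything after the first empty line, with no decoding of any sort.
    _, _, body = message_text.partition("\r\n\r\n")
    return body

def decoded_body(message_text):
    # Undo the transfer encoding and the charset (single-part messages only).
    msg = message_from_string(message_text)
    charset = msg.get_content_charset() or "us-ascii"
    return msg.get_payload(decode=True).decode(charset, errors="replace")

raw_hit = "invoice" in raw_body(raw_message)          # searches the base64 text
decoded_hit = "invoice" in decoded_body(raw_message)  # searches the real text
print(raw_hit, decoded_hit)
```

The sketch handles a single text part only; multipart messages and non-ASCII charsets are where the hard part starts.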

I believe that if we're going to go down this path, we might as well do
it right and specify MIME handling.

Finally, body searching really is expensive.  An implementation of
regex doesn't have to be very naive to make body matching slow; a
recursive :matches is actually quite sufficient.
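
As a rough illustration of the cost, here is a sketch of a naive recursive wildcard matcher in Python. The function names and the call counter are hypothetical, and this is not how any particular implementation does :matches; it just counts how much work a failing match takes as stars are added to the pattern.

```python
def glob_match(pattern, text, counter):
    # Naive recursive matcher: '*' matches any run, '?' matches one character.
    counter[0] += 1
    if not pattern:
        return not text
    if pattern[0] == "*":
        # Try every possible length for the run consumed by '*'.
        return any(glob_match(pattern[1:], text[i:], counter)
                   for i in range(len(text) + 1))
    if text and (pattern[0] == "?" or pattern[0] == text[0]):
        return glob_match(pattern[1:], text[1:], counter)
    return False

def calls_needed(pattern, text):
    # Returns (matched, number of recursive calls made).
    counter = [0]
    matched = glob_match(pattern, text, counter)
    return matched, counter[0]

body = "a" * 30  # tiny stand-in for a message body
small = calls_needed("*a*a*b", body)
large = calls_needed("*a*a*a*a*b", body)
print(small, large)
```

Neither pattern matches, but the second needs far more calls than the first on the same 30 characters, and a real body is a lot longer than that.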


Specific comments on the draft:

I do not believe this is the right form for "body", unless there's a
required tagged argument ("body :raw").  If someone asks for "body",
this probably isn't what they want.

\n is not a valid sequence in quoted strings (nor is it a line delimiter
in an RFC822 message; lines there end in CRLF).
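
A small sketch of what goes wrong, in Python. It assumes the base Sieve rule that only backslash-backslash and backslash-quote are defined escapes and that any other escaped character is read as if the backslash were not there; unescape_quoted and the sample body are made up for illustration.

```python
def unescape_quoted(inner):
    # Interpret the inside of a quoted string.
    # Assumption: a backslash simply quotes the next character, so
    # backslash + 'n' yields the letter 'n', not a line break.
    out = []
    chars = iter(inner)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")  # drop the backslash, keep what follows
        out.append(ch)
    return "".join(out)

key = unescape_quoted(r"\nMIME")                # script author hoped for newline + "MIME"
message_body = "first line\r\nMIME is fun\r\n"  # message lines end in CRLF

found_as_written = key in message_body
found_with_crlf = "\r\nMIME" in message_body
print(repr(key), found_as_written, found_with_crlf)
```

Under that assumption the key turns into the plain letters "nMIME" and does not match, while the line boundary in the message is really the two-character CRLF.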

Tim


