
Re: "body" extension

2002-06-15 09:46:21

On Fri, Jun 14, 2002 at 03:09:18PM -0600, Tim Showalter wrote:
> Doing this right (charset decoding with comparators) is hard, but is the
> right thing.

I think that's the right thing for some applications,
but not for others.

I'm thinking about the application you describe in the
form of another extension I'm calling "text".  (You're not
the only person who wants this to be part of "body" and
triggered with a flag or by its absence; I'm resisting this
mostly because I want to get "body" out to closely resemble
existing practice, and I haven't yet heard that this kind
of behavior is existing practice.  Please set me straight if
you know differently!)

The "text" extension decodes transfer-encodings and knows
about charsets.  It translates as much as the server can
(quality of service issue here) into plain text, then searches
that plain text for a given search string.  If you want to
prevent your employees from receiving e-mail that contains
the words "willing teenage girls", that's where you go.
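
As a rough illustration of the "text" behavior described above, here is a
minimal Python sketch. It is not any existing Sieve implementation; the
function name text_contains and the fallback choices for missing or unknown
charsets are assumptions made for the example. It undoes the
transfer-encoding of each text part, decodes the bytes using the declared
charset, and then does a case-insensitive substring search.

```python
import email


def text_contains(raw_message: bytes, needle: str) -> bool:
    """Return True if any decoded text/* part contains needle (case-insensitive)."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        # Multipart containers and non-text parts are skipped in this sketch.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 / quoted-printable transfer-encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumption: treat a missing charset label as us-ascii.
        charset = part.get_content_charset() or "us-ascii"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Assumption: an unknown charset label falls back to latin-1
            # instead of failing; how well this works is a quality-of-service issue.
            text = payload.decode("latin-1")
        if needle.casefold() in text.casefold():
            return True
    return False
```

A search over the undecoded message would miss a phrase that is split by a
quoted-printable soft line break or hidden inside base64; the sketch finds
it because it searches after decoding.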

(There could be a third in the canon, working-title "content",
which decodes transfer-encodings, but doesn't go any further.  That's
probably what you want if you are trying to implement a virus
scanner that looks for signature strings in binary files.)
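
For contrast, a minimal Python sketch of the "content" idea: undo the
transfer-encoding and search the resulting bytes, with no charset handling
at all. The function name content_contains is hypothetical, and this is an
illustration of the semantics rather than a proposed implementation.

```python
import email


def content_contains(raw_message: bytes, signature: bytes) -> bool:
    """Return True if any leaf part, after transfer-decoding, contains signature."""
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        # Multipart containers have no payload of their own to search.
        if part.is_multipart():
            continue
        # decode=True undoes base64 / quoted-printable; the bytes are
        # deliberately left uninterpreted (no charset conversion).
        payload = part.get_payload(decode=True)
        if payload is not None and signature in payload:
            return True
    return False
```

Because the comparison is on raw bytes, it works for binary attachments
where a charset-aware search would have nothing sensible to decode.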

> Finally, body searching really is expensive.  It doesn't take a very
> naive implementation of regex to make body matching slow; :matches is
> actually quite sufficient if it's recursive.

Good point; I'll add "matches" to the list of warnings.
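
To illustrate why a recursive matcher is enough to be slow, here is a small
Python sketch of a naive glob matcher for the "*" and "?" wildcards that
:matches uses. It is an assumed worst-case implementation written for this
example, not code from any real Sieve server; the counter argument exists
only to count calls.

```python
def naive_matches(pattern: str, text: str, counter: list) -> bool:
    """Naive recursive glob match; counter[0] accumulates the number of calls."""
    counter[0] += 1
    if not pattern:
        return not text
    head, rest = pattern[0], pattern[1:]
    if head == "*":
        # Try every possible length for the star, recursing for each one.
        return any(
            naive_matches(rest, text[i:], counter)
            for i in range(len(text) + 1)
        )
    if text and (head == "?" or head == text[0]):
        return naive_matches(rest, text[1:], counter)
    return False


def count_calls(pattern: str, text: str) -> int:
    """Run the matcher once and report how many recursive calls it made."""
    counter = [0]
    naive_matches(pattern, text, counter)
    return counter[0]
```

A pattern such as "*a*a*a*a*b" run against a long run of "a" characters
never matches, but the matcher tries every way of distributing the text
among the stars before giving up, so the call count grows much faster than
the input length. A message body gives it a lot of input to do that on.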

> \n is not a valid sequence in quoted strings (nor is it a line delimiter
> in an RFC822 message)

Ouch.  Thanks.  That whole example is a bit too specific for a
real draft -- I doubt that people will still understand it in a 
few years.

> I do not believe this is the right form for "body", unless there's a
> required tagged argument ("body :raw").  If someone asks for "body",
> this probably isn't what they want.

What I would like to know - and I think you have that information,
but you haven't put it into your reply - is what's _actually implemented
right now_.  What are the syntax and semantics of your existing
"body" command?

