ietf-smtp

Re: sequential processing (was Re: Abort data transfer?)

2009-11-20 23:48:05


Dave CROCKER wrote:
> Murray S. Kucherawy wrote:
>>> I don't understand how running the filters in series requires that
>>> the entire message be in a buffer.
>>
>> MTA buffers the entire message, sends it to filter #1.  Filter #1
>> changes the body.  MTA sends the modified message to #2, including
>> the new body.  This can only happen if they're in series, and I can't
>> see how it would be possible if there's not a buffer involved.
>>
>> Here's an even better example:  MTA buffers the entire message, sends
>> it to filter #1.  Filter #1 orders the message to be rejected (or
>> discarded).  Filter #2 is told "nevermind", and never has to go
>> through the processing of the body.  For a very large message, this
>> can be a big performance win, and again can only happen if they're in
>> series.
>
> Strictly speaking either fully-buffered or partial buffering allows
> processes to be staged in sequence just fine.  The only issue is
> whether the filters are staged in sequence with early filters feeding
> later ones.
>
> What full buffering does is to allow the current filter to change an
> earlier part of the message, based on a later part.  You can't do that
> in a partial buffering (hot potato) model where the processing is done
> only on the current chunk and is then passed back.
>
> Simple example would be wanting to add a header field to the message,
> based on something in the body.

Seems like we have two issues: Series/Parallel and Full/Partial
Buffering.  Running filters in series is clearly the better strategy
when the last filter is very time consuming, and earlier filters can
provide a quick reject.

Not necessarily. A strategy we use is to simply discontinue processing of
additional milters when one returns a blanket rejection. As long as processing
is discontinued before the time-consuming step (which very often is at the end
of body), the benefits of overlapped processing can far outweigh the costs.
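
A rough sketch of that idea in Python, for illustration only. The Filter
class and run_overlapped function are hypothetical names, not the milter API
or any real MTA's code, and the sketch assumes each filter sees the message
one chunk at a time and that the expensive work would happen at end of body.

```python
from typing import Iterable, List, Optional


class Filter:
    """Hypothetical stand-in for one filtering agent (not the milter API)."""

    def __init__(self, name: str, reject_on: Optional[bytes] = None) -> None:
        self.name = name
        self.reject_on = reject_on  # byte string that triggers a rejection
        self.chunks_seen = 0

    def feed(self, chunk: bytes) -> bool:
        """Process one chunk; return True for a blanket rejection."""
        self.chunks_seen += 1
        return self.reject_on is not None and self.reject_on in chunk


def run_overlapped(filters: List[Filter], chunks: Iterable[bytes]) -> Optional[str]:
    """Feed each chunk to every filter; stop everything on the first rejection.

    Returns the name of the rejecting filter, or None if all chunks passed.
    """
    for chunk in chunks:
        for f in filters:
            if f.feed(chunk):
                return f.name  # remaining filters and chunks are skipped
    return None


quick = Filter("quick", reject_on=b"SPAM")
slow = Filter("slow")  # imagine its expensive step runs at end of body
verdict = run_overlapped([quick, slow], [b"hello", b"SPAM here", b"rest of body"])
```

Here the slow filter only ever sees the first chunk, so its end-of-body step
never runs once the quick filter has rejected the message.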

But an additional comment is in order here. Optimizing things is all well and
good, but there's a point past which the costs of very elaborate (and often
very fragile) configurations far outweigh the tiny gains in performance.

I've previously stated that we support running multiple external filtering
agents (milter is only one of the types we support) in both series and
parallel. But I can count the number of times a parallel setup has actually
ended up being appropriate on the fingers of one hand.

                                Ned
