Maybe, but HEADER analysis might be helpful.
I don't think I quite understand your point here.
A mail transaction typically goes like this:
<tcp connection> from high port to port 25 of the server
The server looks up the reverse DNS of the client.
S: 220 <server.hostname> [ESMTP]
C: HELO <client.hostname> (or EHLO for ESMTP)
S: 250 Ok
C: MAIL FROM:<sender(_at_)example(_dot_)com>
S: 250 Ok
C: RCPT TO:<recipient(_at_)example(_dot_)net>
S: 250 Ok
... (client adds more RCPT TOs here)
C: DATA
S: 354 Send CRLF.CRLF to end
Message headers
Message Body
.
S: 250 Ok
C: QUIT
S: 221 Ok
<tcp connection teardown>
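For anyone who wants to play with the dialogue above without a real server, here is a rough sketch of the server side as a tiny state machine. There are no sockets, no reverse DNS and no dot-stuffing; the class name ToySmtpSession, the hostname server.example and the 503/500 reply texts are my own placeholders, not taken from any real MTA.

```python
class ToySmtpSession:
    """Toy server side of the dialogue above: one client line in, one reply out."""

    def __init__(self):
        self.sender = None
        self.recipients = []
        self.in_data = False
        self.body_lines = []

    def greeting(self):
        # Sent by the server right after the TCP connection is accepted.
        return "220 server.example ESMTP"

    def handle(self, line):
        """Return the reply to one client line, or None while inside DATA."""
        if self.in_data:
            if line == ".":
                # A lone dot ends the message; only now is its size known.
                self.in_data = False
                return "250 Ok"
            self.body_lines.append(line)
            return None
        upper = line.upper()
        if upper.startswith(("HELO ", "EHLO ")):
            return "250 Ok"
        if upper.startswith("MAIL FROM:"):
            self.sender = line[len("MAIL FROM:"):].strip()
            return "250 Ok"
        if upper.startswith("RCPT TO:"):
            if self.sender is None:
                return "503 Need MAIL first"
            self.recipients.append(line[len("RCPT TO:"):].strip())
            return "250 Ok"
        if upper == "DATA":
            if not self.recipients:
                return "503 Need RCPT first"
            self.in_data = True
            return "354 Send CRLF.CRLF to end"
        if upper == "QUIT":
            return "221 Ok"
        return "500 Unrecognized command"
```

Feeding it the client lines from the transcript in order gives back the same sequence of replies, and headers and body are both just lines between DATA and the lone dot.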
Of course.
Now, you know the actual message size only after the data was sent.
Well, you know the maximum OFFERED size by then.
IF during the transfer, the recipient's inbox overflows (and, if they are the
ONLY addressee there for the message!) you no longer have to be concerned about
how much more there might be in the wings... once you've decided that the
message is "too big" and is going to be bounced. You don't HAVE to keep
receiving the rest of it, at that point.
If the client sends an EHLO, it may send a size value, but that cannot be
known to be correct. The message headers and body are part of the
message itself, as far as SMTP is concerned.
Sure, but that doesn't mean that the connection can't be dropped DURING the
body transfer.
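To make the two positions concrete, here is a small sketch. It assumes the size value arrives as the SIZE= parameter on MAIL FROM (the RFC 1870 extension) and treats it purely as a hint, while the body is counted as it arrives so the receiver can stop once a limit is passed. The function names and the +2 per line for CRLF are my own choices for illustration.

```python
import re


def declared_size(mail_from_line):
    """Return the SIZE= value from a MAIL FROM line, or None if absent.

    The value is whatever the client claims; nothing guarantees it is true.
    """
    match = re.search(r"\bSIZE=(\d+)\b", mail_from_line, re.IGNORECASE)
    return int(match.group(1)) if match else None


def receive_body(lines, max_bytes):
    """Consume DATA lines up to the lone '.', counting bytes as they arrive.

    Returns (accepted, bytes_seen). Stops reading as soon as the running
    total exceeds max_bytes, which is the point where a server could give
    up on the transfer instead of receiving the rest.
    """
    seen = 0
    for line in lines:
        if line == ".":
            return True, seen
        seen += len(line.encode("utf-8")) + 2  # +2 for the CRLF on the wire
        if seen > max_bytes:
            return False, seen
    # Input ended without the terminating dot: treat as an aborted transfer.
    return False, seen
```

Whether to drop the connection at that point or keep reading and answer 552 after the final dot is a policy choice that this sketch leaves open.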
After the receiving mail server has gotten the complete full header, then it
knows not only who the intended recipient is but also who the stated sender
is, and that could permit it to establish what that recipient's permissions
are for that sender... and SOME of those, at least, could perhaps be
enforced during the time the incoming mail arrival is being done.
This is more complex, but doable. A restrictions lookup table keys off
the recipient address, and
Right.
The easiest one to enforce is maximum allowed message size for that
sender (or default 'unknown' sender) since that one already has to be
done anyhow if the user's maximum inbox size is exceeded (and that isn't
known, either, until too much data comes in).
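A minimal sketch of such a lookup, assuming the table is keyed first by recipient and then by sender, with "*" standing in for the default 'unknown' sender. The addresses, the byte limits and the names RESTRICTIONS, GLOBAL_DEFAULT and max_size_for are all made up for illustration.

```python
DEFAULT_SENDER = "*"        # stands in for the default 'unknown' sender
GLOBAL_DEFAULT = 1_000_000  # used when a recipient has no entry at all

# Hypothetical table: recipient -> {sender or "*": maximum message bytes}
RESTRICTIONS = {
    "recipient@example.net": {
        "sender@example.com": 5_000_000,
        DEFAULT_SENDER: 100_000,
    },
}


def max_size_for(recipient, sender, table=RESTRICTIONS):
    """Return the size limit this recipient has set for this sender."""
    per_recipient = table.get(recipient.lower(), {})
    if sender.lower() in per_recipient:
        return per_recipient[sender.lower()]
    return per_recipient.get(DEFAULT_SENDER, GLOBAL_DEFAULT)
```

Both addresses are known by the time the client sends DATA, so the limit can be looked up before any of the body has arrived.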
Actually, this isn't known until delivery is attempted.
Right, at the final mail server.
...No one needs to have the mail storage system on the same server as the
receiving MTA. The complexity and fragility of trying to push quota overflow
conditions in realtime to external MTAs is bad enough that I would just allow
for bounces to be generated later.
Certainly a reasonable enough position.
It's a little more complicated (but not HUGELY difficult) to have the
receiving server recognize message divisions and enforce (at least on a
preliminary, tentative basis) recipient permissions for attachments,
HTML-burdened attachments, and such.
This requires a large amount of code to be pushed into MTAs.
Oh, not really. I wouldn't try to do 'full' content analysis there, but it
MIGHT make sense to do some levels of stuff (binary attachments and their
extensions, maybe).
Parsing text takes up CPU. Gobs of CPU.
A lot of that depends on what you're trying to do, and what kind of tools
you're trying to do it with.
Just to show how much pre-DATA filtering can help (I won't quote names here,
since I don't have permission to do that yet):
A Canadian ISP had 4 boxes handling 200K mails/day each on average, with
SpamAssassin filtering the content for suspicious mail. Note that this was
post recipient validation, with a few extensions filtered out
(exe/cpl/pif/.., the common viral vectors). SA was running on 9 boxes,
and those were barely able to keep up with the load (SA was daemonised).
By adding a single DNSBL (the cbl-sbl.spamhaus.org list), they reduced
the inflow to 80K mails/day per host on average, and now need 6 SA boxes
to filter comfortably.
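For readers who have not seen how a DNSBL check works: the receiving server reverses the octets of the connecting IPv4 address, appends the list's zone, and does an ordinary DNS A lookup on the result. A sketch of the name construction follows; dnsbl.example is a placeholder zone, not a real list, and the function name is mine.

```python
import ipaddress


def dnsbl_query_name(client_ip, zone):
    """Build the DNS name to look up for an IPv4 client in a DNSBL zone."""
    # IPv4Address() rejects anything that is not a valid dotted quad.
    octets = str(ipaddress.IPv4Address(client_ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


# Example: a client connecting from 192.0.2.99
name = dnsbl_query_name("192.0.2.99", "dnsbl.example")
```

If the name resolves (typically to an address in 127.0.0.0/8) the client is listed. If the lookup returns NXDOMAIN it is not. The check needs only the client's IP address, so it can run before HELO, long before any message data is transferred.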
Okay, but I wasn't suggesting anything nearly as complicated as SpamAssassin at
the ISP end (I've always said that this ought to be at the recipient end, where
processing power is cheaper and more plentiful).
SpamAssassin is probably written in Perl or something, and that language is NOT
particularly efficient (especially compared to something that's more powerful
and more efficient for textual analysis and pattern recognition, such as
SPITBOL).
Let's recall that SpamAssassin was DESIGNED to be run at the end-user machine,
and it didn't HAVE to be fast or efficient.
I'll let you figure out the bandwidth and server savings, and the
administrative time saved on that (the admins had been trying for three
months at least to tune the SA boxes so that they would keep up).
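Taking only the figures quoted above (4 hosts, 200K then 80K mails/day per host, 9 then 6 SA boxes), the arithmetic can be worked through as below. It says nothing about message sizes or bandwidth, since those numbers were not given, and the function name is just for illustration.

```python
def dnsbl_savings(hosts=4, before_per_host=200_000, after_per_host=80_000,
                  sa_before=9, sa_after=6):
    """Work out the daily totals implied by the figures quoted above."""
    before_total = hosts * before_per_host
    after_total = hosts * after_per_host
    return {
        "mails_per_day_before": before_total,
        "mails_per_day_after": after_total,
        # Integer percentage of mail that no longer reaches content filtering.
        "reduction_percent": 100 * (before_total - after_total) // before_total,
        "sa_boxes_freed": sa_before - sa_after,
        # Average daily load per SpamAssassin box, rounded down.
        "per_sa_box_before": before_total // sa_before,
        "per_sa_box_after": after_total // sa_after,
    }


print(dnsbl_savings())
```

That is 800,000 mails/day before and 320,000 after, a 60% drop in what reaches content filtering, with three SA boxes freed and the per-box load going from roughly 89K to roughly 53K mails/day.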
I am not convinced that the effort is worthwhile, since it's so
implementation-dependent in a moving-target situation.
But (although there is a potential cost savings from truncating such mails
earlier in the process, before they have been fully transferred) there is a
downside which may override the potential savings... and that is the
When do you plan to show actual savings? There are no savings post DATA,
except the cost of putting that message in the temporary mailstore on
the server.
Depends on how far up the chain you are, and how many forwards will be done
between the sender and the recipient, and how much data transfer can be
avoided.
But like I said, any such savings may not be worth the extra complexity.
recipient's desire to be able to change their mind upon considering the
"spam" filtering decision... it's harder to do that, and go ahead and
approve the message for delivery (and possibly revising the filtering
Have you considered that message may have more than one recipient, and
that enforcing different message sizes per recipient in the MTA is simply
not feasible (there is exactly one return value after data)?
I hadn't considered that, but it's certainly a valid point... and yet another
reason why (based on complexity cost) it simply might make the most sense to do
the filtering at the recipient end.
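The single-return-value problem can be sketched like this. Given one message size and a per-recipient limit for everyone accepted at RCPT TO, the server still has to pick one reply. This sketch rejects with 552 only when every recipient's limit is exceeded, and otherwise accepts and leaves the over-limit recipients to be bounced later. That policy and the function name are assumptions of mine, one option among several.

```python
def reply_after_data(message_bytes, recipient_limits):
    """Pick the single post-DATA reply for a message with several recipients.

    recipient_limits maps each accepted recipient to its maximum size in bytes.
    Returns (reply, recipients_to_bounce_later).
    """
    over = sorted(r for r, limit in recipient_limits.items()
                  if message_bytes > limit)
    if not over:
        return "250 Ok", []
    if len(over) == len(recipient_limits):
        # Nobody can take the message, so a plain rejection is accurate.
        return "552 Message size exceeds limit", []
    # Mixed outcome: one reply cannot say "yes for A, no for B", so accept
    # the message and generate bounces for the over-limit recipients later.
    return "250 Ok", over
```

With limits of 100,000 bytes for one recipient and 5,000,000 for another, a 200,000-byte message gets "250 Ok" plus a later bounce for the first recipient, which is exactly the deferred-bounce situation discussed earlier in the thread.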
rules for that sender), if in fact it was blocked or the server connection
was dropped during transfer. It may be that the bandwidth cost (and
that's ultimately getting cheaper and cheaper) is simply less costly than
It is? Not as fast as the spam volume is going up.
Obviously we want to reduce spam volume. My approach is all about that.
the added complexity of trying to be more clever, "too" early in the
process.
Gordon Peterson http://personal.terabites.com/
1977-2002 Twenty-fifth anniversary year of Local Area Networking!
Support free and fair US elections! http://stickers.defend-democracy.org
12/19/98: Partisan Republicans scornfully ignore the voters they "represent".
12/09/00: the date the Republican Party took down democracy in America.