ietf-smtp

Re: Best practices to avoid virus and spam

2004-02-12 05:56:26


On Feb 12, 2004, at 12:53 AM, Keld Jørn Simonsen wrote:

On Wed, Feb 11, 2004 at 10:47:00PM -0500, Keith Moore wrote:

1. Always check for virus/spam before checking for valid recipient, or
whether the mailbox is full or some such.

Why?  Even if SMTP made it easy to do this (it doesn't), offhand it
seems better to perform the easiest and/or more reliable checks first.
It's very difficult to be sure that a message is spam - the same
message can be spam for one recipient and valid for another.  However
the authoritative MX for a domain ought to be able to tell whether a
recipient address from that domain is valid.  And to me it makes more
sense to reject a message for a reliable reason (like, this recipient
address is not valid) than for an unreliable reason (like, we think
this is spam).
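
As a rough illustration of the ordering argued for above, here is a
minimal Python sketch that runs the cheap, authoritative recipient check
before any content scan. The names KNOWN_RECIPIENTS, recipient_is_valid,
looks_like_spam and handle_message are hypothetical, and the spam test is
a placeholder rather than a real filter.

```python
# Hypothetical sketch: run the cheapest and most reliable check first.
KNOWN_RECIPIENTS = {"alice@example.org", "bob@example.org"}  # assumed local user table


def recipient_is_valid(rcpt: str) -> bool:
    """Cheap and authoritative: the MX for a domain knows its own users."""
    return rcpt.lower() in KNOWN_RECIPIENTS


def looks_like_spam(body: str) -> bool:
    """Placeholder heuristic; real content scanning is costly and uncertain."""
    return "buy now" in body.lower()


def handle_message(rcpt: str, body: str) -> str:
    # Step 1: reject for the reliable reason before looking at any content.
    if not recipient_is_valid(rcpt):
        return "reject: unknown recipient"
    # Step 2: only then spend effort on the uncertain content check.
    if looks_like_spam(body):
        return "flag: suspected spam"
    return "accept"
```

With this ordering, a message to an unknown address is refused without
its body ever being scanned.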

Yes, many things are easier to check up front than whether a message is
a virus or spam, since the latter takes a full scan of the body. I think
these easy checks should be done as long as they can also reliably
classify the mail as forged, for example a bogus MX or a missing PTR.

Neither of these checks can reliably classify the email as forged.

My aim is, still, to get rid of bogus error mail.
And as far as I see it in my own mailbox, this is mail of the
following kinds, *only*:

1. reports that the body contained virus
2. reports that the user is unknown
3. reports that the user's account has been closed.
4. reports that the user's mailbox is full.

I think we'd agree that #1 should not be sent to the "sender" of the virus since it is so often forged. But reports #2-4 are often useful, and they _should_ be detected before the server has a chance to look at the content of the message.

I see your point about the uncertainty of spam detection, and I would be happy
enough if we only addressed the problem of viruses. A virus can be detected
with certainty (although some may not be found).

I am pretty happy with the current technology for spam detection, with
spamassassin doing a good job for me.

I'm pretty unhappy with spamassassin myself, because many of its criteria are bogus. It's a useful tool when tuned properly, but I see it as promoting bad practices.

2. Generate a specific error message, maybe we should introduce a
standard error code for this, like 551 - mail rejected as virus or
spam.

3. If the mail is virus or spam, then do not send it back to the
sender - as this is most likely a forged address anyway. Discard it
instead.  But if you must, then use the standard error message as
described above.

I see no need to define a new SMTP reply code - an enhanced status code
should suffice.  Especially since, as you point out, when the SMTP
server does believe that the message is spam, bouncing the message is
usually the wrong thing to do anyway.

I would accept not specifying a standardized best practice on error reply
for virus and spam, and only recommend discarding, but don't you think
that some may disagree with discarding such mail, and then it would be
good advice that they at least use some standardized error reply in their
rejects?

Yes, I do think that having uniform reporting would be useful.

I would be happy not to make a new SMTP error reply code if this can be
done better with some other reply means. What would you suggest?

DSNs and ENHANCEDSTATUSCODES.
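
To make that suggestion concrete, here is a small Python sketch of how
the four report types listed earlier might be expressed with existing
reply codes plus enhanced status codes (the class.subject.detail form of
RFC 3463, advertised through the ENHANCEDSTATUSCODES extension). The
particular code choices and reply texts are assumptions for illustration,
not an agreed mapping, and REPLIES and format_reply are hypothetical
names.

```python
# Hypothetical mapping: condition -> (reply code, enhanced status code, text).
# The pairings are one plausible reading of RFC 3463, not a standard table.
REPLIES = {
    "virus_or_spam":  (550, "5.7.1", "Message refused by content policy"),
    "unknown_user":   (550, "5.1.1", "No such mailbox"),
    "account_closed": (550, "5.2.1", "Mailbox disabled"),
    "mailbox_full":   (552, "5.2.2", "Mailbox full"),
}


def format_reply(condition: str) -> str:
    """Build a single-line SMTP reply carrying an enhanced status code."""
    code, enhanced, text = REPLIES[condition]
    return f"{code} {enhanced} {text}"
```

For example, format_reply("unknown_user") gives "550 5.1.1 No such
mailbox", which a sending MTA can classify without parsing free text.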

I think with this scheme, we would have avoided almost all of the
virus/spam and also annoying error traffic.

No need for new protocols, closed networks etc. Maybe a need for some
RBL listing virus/spam infected machines, I don't know.

Third-party RBLs are a really, really, really bad idea. They should be
categorized as Worst Practices.

I would appreciate if you could explain this in detail to me.

I have yet to see a third-party RBL that didn't misrepresent what it was reporting, either because the information being reported was often stale or because there was an attempt to mislead people about the validity of the criteria being used. For instance, some RBLs blocked mail from machines that appeared to relay mail; this was a dubious criterion because there were valid reasons to relay mail, and not all or even most mail from such machines was spam. (Now, with SMTP authentication, there are fewer valid reasons.) RBLs that accept reports from other parties spread misinformation. For instance, if a site uses spamassassin's criteria to decide whether to report to an RBL that a particular IP is sending spam, that site is spreading unreliable information ("propagation of error").

There might be some cases where the _authoritative_ DNS zone for a MAIL
FROM domain or the _authoritative_ DNS zone for the source IP address
of an SMTP connection says "messages that look like this are not
authorized".  And in some of those cases the SMTP server might be able
to tell that a message "looks like this" from just MAIL FROM and/or
RCPT TO.  In those cases it _might_ make sense for the SMTP server to
return 2xx in response to those commands (if they are otherwise valid)
just so that it can black-hole the message.  Though I think it might
make even more sense to return 4xx in response to those commands - or
just stop responding to that connection - and then refuse to accept any
new connections from that source IP for some period of time.  The
rationale here is that the SMTP server doesn't want to waste bandwidth
or cpu cycles by sucking down a message that's going to be discarded
anyway, and it does want to avoid being interrupted by a compromised
machine or spammer's machine if it tries again later.
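
A rough Python sketch of the second option (answer 4xx, then refuse new
connections from that source IP for some period of time) follows. The
class TempBlockList, the function greeting_for, the one-hour default, the
host name and the choice of "421 4.7.0" as the refusal are all
assumptions made for illustration.

```python
import time


class TempBlockList:
    """Remembers source IPs that should be refused until a deadline passes."""

    def __init__(self, block_seconds: float = 3600.0, clock=time.monotonic):
        self.block_seconds = block_seconds
        self.clock = clock          # injectable so the logic can be tested
        self._blocked_until = {}    # ip -> deadline on the clock's timeline

    def block(self, ip: str) -> None:
        """Start (or restart) the refusal period for this source IP."""
        self._blocked_until[ip] = self.clock() + self.block_seconds

    def is_blocked(self, ip: str) -> bool:
        deadline = self._blocked_until.get(ip)
        if deadline is None:
            return False
        if self.clock() >= deadline:
            del self._blocked_until[ip]  # expired: forget the entry
            return False
        return True


def greeting_for(ip: str, blocklist: TempBlockList) -> str:
    """Return the first reply line for a new connection from this IP."""
    if blocklist.is_blocked(ip):
        # Temporary failure: nothing is accepted, so nothing is downloaded.
        return "421 4.7.0 Try again later"
    return "220 mail.example.org ESMTP"
```

The server would call block() when it decides a source is sending
unauthorized mail; later connections from that IP are turned away at the
greeting until the period expires.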

My analysis is that for a number of servers the CPU load stemming from
this kind of bogus mail is not a problem.

I'm not sure what you mean by "analysis" here. But my limited experience is that many mail servers spend much more time and bandwidth processing spam than processing useful mail. (especially if you count the effort spent trying to bounce spam)