ietf-smtp

Re: Best practices to avoid virus and spam

2004-02-11 04:12:31

On Wed, Feb 11, 2004 at 04:57:19AM -0500, Bruce Lilly wrote:
Keld Jørn Simonsen wrote:

I would advise that we recommend some best practice procedures,
hopefully to be implemented in the MTA software products of the world.
Maybe we should write an RFC on this.

I have got three pieces of advice:

1. Always check for virus/spam before checking for a valid recipient, or
whether the mailbox is full or some such.

Checking recipient can be done at the time of RCPT TO.  Checking for
a virus can only be done after the body has been collected (end of
DATA).  RFC 2821 discusses difficulties that arise if rejection due
to invalid recipients is postponed until after DATA.  Checking body
content is only appropriate at initial (submission or first SMTP server)
or final stages of delivery -- an intermediate SMTP transfer is supposed
to be transparent to the data, and intermediaries have no business
making policy decisions for distant senders or recipients [of course,
there shouldn't be any intermediate relays except in unusual situations
(N.B. there may be intermediate servers as noted below which are not
relays, they are responsible for the sender's or recipient's mail domain)].

Your first suggestion conflicts with RFC 2821.

I think it would be OK to check it with the original receipt, or at the
final receipt. So "always" was not directed towards any intermediate MTAs.
Would it then still be in conflict with RFC 2821? Which section?

My idea is to check for whether the mail is a genuine post, or whether
it is a bogus mail. And only treat it as a genuine mail if we could not
classify it as a bogus mail. Then we can test it for all the normal
error conditions.
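
The ordering proposed here can be sketched in Python. This is only an
illustration under stated assumptions, not how any existing MTA works:
the names (Message, looks_bogus, final_delivery_outcome), the mailbox
sets and the marker string are all hypothetical, and the classifier is
a placeholder for a real virus/spam scanner.

```python
from dataclasses import dataclass


@dataclass
class Message:
    recipient: str
    body: str


# Hypothetical local state, for illustration only.
KNOWN_MAILBOXES = {"user@example.org", "full@example.org"}
FULL_MAILBOXES = {"full@example.org"}


def looks_bogus(msg: Message) -> bool:
    """Placeholder classifier; a real one would call a virus/spam scanner."""
    return "X-TEST-BOGUS" in msg.body


def final_delivery_outcome(msg: Message) -> str:
    # Step 1: classify bogus mail first, before any other error condition.
    if looks_bogus(msg):
        return "reject-bogus"
    # Step 2: only mail not classified as bogus reaches the normal checks.
    if msg.recipient not in KNOWN_MAILBOXES:
        return "reject-unknown-recipient"
    if msg.recipient in FULL_MAILBOXES:
        return "reject-mailbox-full"
    return "accept"
```

With this ordering, a bogus mail to a nonexistent mailbox is reported as
bogus rather than as an unknown recipient.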

2. Generate a specific error message, maybe we should introduce a
standard error code for this, like 551 - mail rejected as virus or spam.

551 is reserved for "user not local". 550 is the error code for rejection
for policy reasons, but is not listed as valid after DATA.  554 (transaction
failed) is valid after DATA.

So here's another conflict (though returning 554 would be acceptable).
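
As a rough illustration of the reply codes just discussed, here is a
small Python sketch. The table holds only the three codes named above,
with the meanings given above. The function names and the reply text
are assumptions made for this example, and the set of codes usable
after DATA is deliberately limited to the one code this message names,
so it is not a complete list.

```python
# Reply codes and meanings as stated in the message above.
REPLY_MEANINGS = {
    550: "rejection for policy reasons",
    551: "user not local",
    554: "transaction failed",
}

# Only the code this message names as valid after DATA.
# Assumption: not exhaustive; the RFC lists further codes.
NAMED_VALID_AFTER_DATA = {554}


def named_valid_after_data(code: int) -> bool:
    """True if the message above names this code as valid after DATA."""
    return code in NAMED_VALID_AFTER_DATA


def content_rejection_reply(reason: str) -> str:
    """Build a hypothetical end-of-DATA rejection line using 554."""
    return f"554 Transaction failed: {reason}"
```

For example, content_rejection_reply("virus or spam") gives a 554 line,
while named_valid_after_data(550) is False.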

I did not mean to especially use 551, but a new code such as 555.
My idea was to get rid of all these different error messages
that I have tried to list in my filters. I have collected more than 300
different messages essentially for the same thing. I think there should be a new
response error code just for this:
555 - we do not accept virus or spam here.
554 is too vague for the reason.

3. If the mail is virus or spam, then do not send it back to the
sender - as this is most likely a forged address anyway. Discard it
instead.  But if you must, then use the standard error message as
described above. 

If SMTP were still universally used as intended (i.e. sender's MUA
connects to recipient's SMTP server), that might work.  However, in
practice there is usually some intermediate SMTP or submission
protocol server.  If rejection takes place during final delivery,
an intermediate server (now acting as SMTP client to final delivery
SMTP server) has no choice but to send some sort of non-delivery
message (RFC 2821 section 4.4).

Well, I wanted the final receiving MTA to discard the message; the sender
address is bogus anyway.

So in many cases, there will still be a non-delivery message, and
in the absence of some means to verify authorization to use the
specified sender envelope address, such messages will continue to
cause collateral damage.  [E]SMTP AUTH provides authentication, but
as it is an extension, it can do nothing for plain SMTP transactions.

Plain SMTP transactions are rare; almost all transactions that
I see on my mail server are ESMTP. Plain SMTP is under 1 per mille.

While there's no conflict, in practice your third suggestion will
have limited effect.

I understand that almost the same amount of error mail will be
generated. The idea, however, is that the receiver of the error
mail will have a much better understanding of which kind of error mail
it is, and for those receivers that are confident that they do not
send out virus or spam, they can just ignore the bogus error mail.

The bogus error mail is - at least for me personally - the origin of
almost all unwanted email today.

I think with this scheme, we would have avoided almost all of the
virus/spam and also the annoying error traffic.

Maybe.  But that's already history.  There are plenty of open relays,
and so long as there is at least one, any future virus/worm could
use that relay, and the payload would be delivered (either to the
intended recipient or to the forged return address).  Likewise for
spam. Closing open relays is therefore a prerequisite to successful
implementation of any scheme such as you have proposed.

Yes, the virus and spam mail would be delivered to the MTAs.
But not to the MUAs. And that is the main problem. I don't care too much
about what my machine does, as long as the load is below 0.1 and I don't
have to waste my personal time on looking at the mails. And my mail
server (which is also an ftp and http-server and other things)
happily chugs away on 100,000 emails a day, of which 99,800
are bogus mail, which get rejected or discarded. No sweat.

The open relays do not really matter; there will always be
some, and of course these should be reported, and appropriate
RBLs be employed. Still, what I am after is limiting the emails that
end up in the user email inboxes (like my own), and I think what I
suggest would go a long way toward reaching that goal.

No need for new protocols, closed networks etc. Maybe a need for some
RBL listing virus/spam infected machines, I don't know.

Aside from the need to close open relays and the need for verification
that the specified sender envelope address is appropriate, there are
other issues that make your scheme impractical.  Scanning body content
can only take place at the end of DATA during acceptance (or rejection)
for delivery.  The minimum timeout for a response at end of
DATA is 10 minutes. Assuming all clients waited for at least that
long (some do not), that timeout imposes an upper limit on the
amount of time available for processing to determine whether or
not there is a basis for rejection on policy grounds.  If a large
number of simultaneous connections are being handled by an SMTP
receiver, that may present an unacceptable processing burden.
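
The processing burden can be put into rough numbers. The Python sketch
below assumes a deliberately simple model that is not from the thread: a
fixed scan time per message and a fixed number of scanner workers, with
every connection reaching end of DATA at the same moment. The function
names and the example figures are hypothetical.

```python
import math

# Minimum timeout for a response at end of DATA, as stated above.
DATA_END_TIMEOUT_S = 10 * 60


def worst_case_wait(connections: int, scan_seconds: float, workers: int) -> float:
    """Seconds until the last message is scanned, if all arrive at once."""
    batches = math.ceil(connections / workers)
    return batches * scan_seconds


def fits_timeout(connections: int, scan_seconds: float, workers: int) -> bool:
    """True if even the last message gets its reply within the timeout."""
    return worst_case_wait(connections, scan_seconds, workers) < DATA_END_TIMEOUT_S
```

Under these assumptions, 200 simultaneous connections at 5 seconds per
scan on 4 workers finish within 250 seconds, while 2000 connections
would need 2500 seconds, well past the 10 minute limit.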

That is true, it will take more CPU time to process.
But the idea is that the resulting amount of mail will take less human
time to process. This is a tradeoff, and as long as I have the CPU
cycles, I know myself how I want to have it done. CPU is rather cheap
these days.

Another way to handle some of this is to have RBLs of infected
machines, and then reject/discard the mail upfront.

From a practical point of view, anti-virus signatures generally only
come out after a virus has propagated fairly widely, so for some
time after a virus is unleashed it will be delivered undetected even
by those who keep their AV software up-to-date.  The same considerations
(must take place after DATA is complete at final delivery SMTP
receiver, requires processing overhead, arms-race considerations)
apply to spam.

I use general rejection of .exe .pif .scr .com etc. extensions;
this really catches most of the viruses. And I really cannot see
a valid reason for sending these attachment types by mail
that could not be met in another, more secure way.
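
A minimal Python sketch of such an extension check, using the standard
library email package, might look like this. The blocked set holds only
the four extensions named above; the function name and the test message
are assumptions for illustration, and the check trusts the declared
filename without inspecting the content.

```python
import os
from email import message_from_string

# Extensions named above; "etc." in the text suggests a longer real list.
BLOCKED_EXTENSIONS = {".exe", ".pif", ".scr", ".com"}


def blocked_attachments(raw_message: str) -> list:
    """Return filenames of MIME parts whose extension is blocked."""
    msg = message_from_string(raw_message)
    hits = []
    for part in msg.walk():
        filename = part.get_filename()  # None if the part has no filename
        if not filename:
            continue
        extension = os.path.splitext(filename.lower())[1]
        if extension in BLOCKED_EXTENSIONS:
            hits.append(filename)
    return hits
```

A receiver could reject or discard any message for which this list is
not empty.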

I think that the updated antivirus info generally gets widely
distributed in about a day and a half. So yes, a virus has about that
time to disseminate. Still we see MyDoom and other viruses floating around
a long time after that outbreak period. This could be stopped.

Another way to limit the outbreak (or incubation?) period is to
apply RBLs for the infected machines. This would give a much shorter
time, as antivirus info is something that is distributed and fetched by
the individual users on a daily basis, while RBLs are dynamic and effective
almost immediately (save zone copies) via the DNS.

RBLs are also cheaper to employ as the mail can be discarded upfront,
while antivirus/antispam requires scanning of the data, which costs more
CPU, and network bandwidth.
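
For illustration, an RBL lookup over the DNS can be sketched in Python
as below. It follows the usual DNS blocklist convention of reversing the
IPv4 octets and appending the list's zone. The zone rbl.example.org is a
placeholder, not a real list, and treating every lookup failure as "not
listed" is a simplifying assumption.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the name resolves, i.e. the list has an entry for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except OSError:
        # Assumption: any resolver failure is treated as "not listed".
        return False
```

Because the check needs only the connecting IP address, it can run
before any message data is transferred.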

In the case of virus content, it's difficult to justify requiring a
site with no Microsoft platforms to install, maintain, and operate
anti-virus software due to poor design of Microsoft products and/or
inadequately trained or incompetent users of same.

Yes, here RBLs would also be a bonus.
Anyway, just discarding mails with attachments of said types would
get you almost all of the viruses.

One generally
doesn't allow toddlers to operate plastic tricycles on busy highways;
likewise those not competent to practice minimal security and those
whose systems fail to meet some minimum security standards shouldn't
be allowed on the "information superhighway" (apologies for
resurrecting that tired cliché).  An RBL is unlikely to work; the
existence of ISPs that use a pool of IP addresses for dialup and
broadband subscribers via DHCP means that IP addresses cannot be
used to identify infected machines -- one minute a particular IP
address might be assigned to a virus-infected machine running an
insecure operating system and operated by a clueless user, and the
next minute it might be assigned to a firewalled system operated
by a security expert.  If there's an answer to this aspect of the
spam/virus problem short of universal IPv6 deployment and static IP
addresses everywhere, it will probably have to be a policy/licensing
(analogous to a driver's license) matter rather than a purely
technical solution.

Yes, you may have a point there, I have not looked into that specific issue
of how stable the ip addresses of infected machines are.
Anyway that would then reflect on the ISPs that they have spammers
or virus infected machines among their customers, and then it is the
ISP's problem to avoid that. 

If you have a machine with a fixed ip-address, however, then there
would not be any problem for that MTA. And MTAs on dynamic IP addresses
are questionable anyhow.

best regards
keld