Re: [ietf-smtp] [Shutup] Proposed Charter for the "SMTP Headers Unhealthy To User Privacy" WG (fwd)

2015-12-01 15:11:58
In message 
Arnt Gulbrandsen <arnt(_at_)gulbrandsen(_dot_)priv(_dot_)no> writes

I think it might be helpful if someone would name or describe a largish 
site or service provider that uses Received, and describe how in useful 
detail. "A spam/virus filter company that handles mail for about x million 
mailboxes does ...", that kind of thing.

All of the big mailbox providers ("MAGY", and of course many more
providers as well) use large and complex machine learning systems for
spam filtering. Consult the companies' annual reports for the value of X
in "x million", because it rather depends on how you count.

These ML systems use any and all features of incoming email in order to
determine whether email should be rejected, placed into a spam folder or
placed in an inbox.

All email, spam or not, contains a rich set of information within the
body (which the spammers can control) and the header fields (which they
can of course forge parts of, but by no means all). The ML doesn't
really care how much of the information is forged (and many spammers do
not bother because it doesn't help much). The ML is looking for
patterns, or the lack of patterns.
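
To make "features of incoming email" a little more concrete, here is a
toy sketch in Python. It is emphatically not how any real provider does
it (those systems are proprietary); the function name header_features
and the particular features chosen are assumptions made up purely for
illustration.

```python
from email import message_from_string

def header_features(raw_message: str) -> dict:
    """Return a few simple, illustrative features from the header fields."""
    msg = message_from_string(raw_message)
    received = msg.get_all("Received", [])
    return {
        # Number of hops recorded in trace fields.
        "received_count": len(received),
        # Presence or absence of common fields can itself be a pattern.
        "has_message_id": msg["Message-ID"] is not None,
        "has_date": msg["Date"] is not None,
        # Total number of header fields, duplicates included.
        "header_field_count": len(msg.keys()),
    }

# Hypothetical sample message, using reserved example domains.
SAMPLE = (
    "Received: from a.example by b.example; Tue, 1 Dec 2015 15:11:58 +0000\n"
    "Received: from c.example by a.example; Tue, 1 Dec 2015 15:11:50 +0000\n"
    "From: sender@example.org\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
features = header_features(SAMPLE)
```

A real classifier would combine a very large number of such signals;
the point is only that each header field adds information to work with.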

The richness of the information in the header fields is an important
reason why the ML systems operate at up to "4 nines" (99.99%) of
accuracy.

I doubt you're going to get told much more than that because these
systems are complex and proprietary, and there is an understandable
concern (and indeed a great deal of experience) that spammers will
exploit nuggets of information to improve their delivery scores.

Received headers also assist in loop detection and in helping
troubleshooters to understand the cause of delivery problems and the
location of compromises -- and they are valuable for that alone.
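
As a rough illustration of the loop detection point: a simple approach
is to count the Received fields and treat a very large count as a
probable loop. The sketch below assumes a threshold of 100 (RFC 5321
section 6.3 suggests a large threshold, normally at least 100); the
names looks_like_loop and fake_message are hypothetical.

```python
from email import message_from_string

# Assumed threshold: RFC 5321 section 6.3 suggests at least 100.
MAX_RECEIVED = 100

def looks_like_loop(raw_message: str, limit: int = MAX_RECEIVED) -> bool:
    """Return True when the message carries more Received fields than limit."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("Received", [])) > limit

def fake_message(hops: int) -> str:
    """Build a toy message with the given number of Received fields."""
    received = "".join(
        f"Received: from h{i}.example by h{i + 1}.example\n" for i in range(hops)
    )
    return received + "Subject: test\n\nbody\n"
```

For example, looks_like_loop(fake_message(101)) is true, whereas a
message with only a handful of hops passes. Without Received fields
this check has nothing to count.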

People are used to email "just working" ... there are a lot of folks
peering at the entrails to make that happen and they need all the help
they can get!

If you want to have very limited data in your email header fields then
you should look at the systems that you operate yourself and clean up
the information at that point. You'll probably get a poorer delivery
experience when sending to MAGY and others -- but that's your tradeoff.
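
A minimal sketch of that kind of clean-up, assuming you run it on mail
you originate before it leaves your own system. The list of fields in
FIELDS_TO_DROP is a hypothetical policy chosen for illustration, and
scrub_headers is an invented name; which fields you remove is your own
decision, with the delivery tradeoff described above.

```python
from email import message_from_string

# Hypothetical policy: field names chosen for illustration only.
FIELDS_TO_DROP = ("X-Originating-IP", "User-Agent", "X-Mailer")

def scrub_headers(raw_message: str, fields=FIELDS_TO_DROP) -> str:
    """Return the message with the listed header fields removed."""
    msg = message_from_string(raw_message)
    for name in fields:
        # Deletes every occurrence; no error is raised if the field is absent.
        del msg[name]
    return msg.as_string()

# Hypothetical outgoing message, using reserved example values.
OUTGOING = (
    "From: sender@example.org\n"
    "X-Mailer: ExampleMailer 1.0\n"
    "X-Originating-IP: [192.0.2.1]\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)
cleaned = scrub_headers(OUTGOING)
```

Field names are matched case-insensitively, and the body is left
untouched.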

richard                                                   Richard Clayton

Those who would give up essential Liberty, to purchase a little temporary 
Safety, deserve neither Liberty nor Safety. Benjamin Franklin 11 Nov 1755

