ietf-dkim

Re: [ietf-dkim] New canonicalizations

2011-05-16 21:38:05
Alessandro Vesely wrote:
On 16/May/11 19:00, Michael Thomas wrote:
My guess is that admins just don't understand any of the subtleties,
have heard lore that "relaxed" is "better" and just click "relaxed"
wherever they find it. It may also be the case that some implementations
don't even have separate nerd knobs for headers and body canonicalization.

However, Murray's stats show some difference in the choice of relaxed:

...

For the body count, we have 74% relaxed vs 26% simple, while it is 86%
relaxed vs 14% simple for the header.  There is a 12% difference
toward relaxing the header, which implies some thought or testing.

It's hard to imagine any DKIM explorer doing so without some
technical forethought and eventual testing, both internal and
external.  Our auto-responder log shows that testing is active (though
mostly by the same people).

But I think it's important to get a domain analysis to see the
isolations, if any. I just did a quick C14N analysis of 9000+
DKIM-signed messages coming to my system, and it appears clear to me
that many are using whatever default settings come with their DKIM
package or API, and/or straight from the specs.
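
For anyone wanting to repeat this kind of tally, here is a minimal
sketch of pulling the c=, d= and s= tags out of a DKIM-Signature
header value. It is not the tool I used; the function name
parse_dkim_tags and the sample header (including its a=, h=, bh= and
b= values) are made up for illustration.

```python
import re


def parse_dkim_tags(header_value: str) -> dict:
    """Split a DKIM-Signature header value into a {tag: value} dict."""
    # Unfold the header: a line break followed by whitespace is a continuation.
    unfolded = re.sub(r"\r?\n[ \t]+", " ", header_value)
    tags = {}
    for part in unfolded.split(";"):
        if "=" not in part:
            continue  # skip empty trailing pieces
        # Split on the first "=" only; base64 values may contain "=" padding.
        name, _, value = part.partition("=")
        tags[name.strip()] = value.strip()
    return tags


# Illustrative header value; only c=, d= and s= come from the list above.
sample = (
    "v=1; a=rsa-sha256; c=relaxed/simple; d=news.redlobster.com;\r\n"
    "\ts=yesmail1; h=from:to:subject; bh=...; b=..."
)
tags = parse_dkim_tags(sample)
print(tags["c"], tags["d"], tags["s"])
```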

When the volume was reduced to unique domains, there were 208 unique 
domains with a c= breakdown:

    31   c=simple/simple
    10   c=simple
    23   c=relaxed/simple
     8   c=relaxed
   134   c=relaxed/relaxed
     2   c=simple/relaxed

When you fold c=simple into simple/simple and c=relaxed into
relaxed/simple (a single value applies to the header, and the body
then defaults to simple):

    41   c=simple/simple    (DKIM default)
    31   c=relaxed/simple
   134   c=relaxed/relaxed
     2   c=simple/relaxed
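
The folding can be sketched in a few lines, assuming the rule that a
single c= value names the header algorithm and the body falls back to
simple. The helper name fold_c14n is hypothetical; the counts are the
per-domain numbers from the first table.

```python
from collections import Counter


def fold_c14n(c_value: str) -> str:
    """Expand a c= value to its explicit header/body form."""
    value = c_value.strip().lower()
    if not value:
        return "simple/simple"  # no c= tag at all: DKIM default
    if "/" in value:
        return value  # already explicit
    return f"{value}/simple"  # body algorithm defaults to simple


# Unique-domain counts per raw c= value, as listed above.
raw_counts = {
    "simple/simple": 31,
    "simple": 10,
    "relaxed/simple": 23,
    "relaxed": 8,
    "relaxed/relaxed": 134,
    "simple/relaxed": 2,
}

folded = Counter()
for c_value, count in raw_counts.items():
    folded[fold_c14n(c_value)] += count

for c_value, count in folded.items():
    print(f"{count:5d}   c={c_value}")
```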

Based on this:

   79.3% of domains use relaxed for headers
   65.4% of domains use relaxed for body
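
The arithmetic behind these two figures, as a short sketch working
from the folded counts (variable names are only illustrative): 165 of
the 208 domains relax the header and 136 relax the body, i.e. roughly
79% and 65%.

```python
# Folded unique-domain counts, keyed by header/body canonicalization.
folded_counts = {
    "simple/simple": 41,
    "relaxed/simple": 31,
    "relaxed/relaxed": 134,
    "simple/relaxed": 2,
}

total = sum(folded_counts.values())

# The part before "/" is the header algorithm, the part after is the body.
relaxed_header = sum(
    n for c, n in folded_counts.items() if c.split("/")[0] == "relaxed"
)
relaxed_body = sum(
    n for c, n in folded_counts.items() if c.split("/")[1] == "relaxed"
)

print(f"header relaxed: {relaxed_header}/{total} = {100 * relaxed_header / total:.1f}%")
print(f"body relaxed:   {relaxed_body}/{total} = {100 * relaxed_body / total:.1f}%")
```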

Since the default is simple/simple, this clearly shows that the
majority of domains (for whatever reason) are consciously using relaxed.

Now, looking at the actual list of 208 domains, I can see a pattern
where the C14N breakdown points to a common DKIM package.

For example:

Among the 10 c=simple domains, 6 are our wcDKIM field-testing site
domains.  That's because of our default setting, DKIM_SIGN_SIMPLE.

Among the 31 c=relaxed/simple domains, many groups of domains are from
the same organization.  For example, after my wife signed up for food
coupons at Red Lobster, we now get spam from the corporation for its
other franchises:

   c=relaxed/simple; d=news.longhornsteakhouse.com; s=yesmail1
   c=relaxed/simple; d=news.olivegarden.com; s=yesmail1;
   c=relaxed/simple; d=news.redlobster.com; s=yesmail1;

Among the 2 c=simple/relaxed, this appears to be the same organization 
based on the same selector:

     c=simple/relaxed;  d=bothan.net; s=2011.01.24;
     c=simple/relaxed;  d=drewhess.com; s=2011.01.24;

Finally, among the largest group, the 134 c=relaxed/relaxed domains,
you can clearly see distinct groups from the patterns in the signing
domains and their identical or near-identical selectors.  Many look
like phishing domains or variations of one signing domain, including
groups where the only difference is .com, .net or .org.

Here are some selected examples:

c=relaxed/relaxed;  d=couponba.com;       s=mail;
c=relaxed/relaxed;  d=couponble.com;      s=mail;
c=relaxed/relaxed;  d=couponsystems.net;  s=mail;
c=relaxed/relaxed;  d=couponsystems.org;  s=mail;

c=relaxed/relaxed;  d=mcsv178.net;        s=k1
c=relaxed/relaxed;  d=mcsv78.net;         s=k1
c=relaxed/relaxed;  d=mcsv83.net;         s=k1

c=relaxed/relaxed;  d=smartsavingclub.com; s=mail
c=relaxed/relaxed;  d=smartsavingnow.com;  s=mail

c=relaxed/relaxed;  d=smtpninja.com;       s=mail
c=relaxed/relaxed;  d=smtpninja.net;       s=mail

c=relaxed/relaxed;  d=smtpresults.com;     s=mail
c=relaxed/relaxed;  d=smtpresults.net;     s=mail

c=relaxed/relaxed;  d=selfsaver.net;       s=mail
c=relaxed/relaxed;  d=selfsaver.org;       s=mail

You can probably assume these 15 domains are really just 5
organizations or software packages, and if we presume that those with
the same selector belong together, it's 1 organization using the same
public key.
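
A rough sketch of how the "only the TLD differs" groups can be picked
out mechanically from the examples above. It is deliberately naive: it
just strips the last label, so it would mishandle suffixes such as
co.uk, and the function name tld_variant_groups is my own invention.

```python
from collections import defaultdict

# The 15 relaxed/relaxed signing domains listed above.
domains = [
    "couponba.com", "couponble.com", "couponsystems.net", "couponsystems.org",
    "mcsv178.net", "mcsv78.net", "mcsv83.net",
    "smartsavingclub.com", "smartsavingnow.com",
    "smtpninja.com", "smtpninja.net",
    "smtpresults.com", "smtpresults.net",
    "selfsaver.net", "selfsaver.org",
]


def tld_variant_groups(names):
    """Return {stem: [domains]} for stems seen under more than one TLD."""
    groups = defaultdict(list)
    for name in names:
        # Naive split: everything before the last dot is treated as the stem.
        stem, _, _tld = name.rpartition(".")
        groups[stem].append(name)
    return {stem: members for stem, members in groups.items() if len(members) > 1}


variants = tld_variant_groups(domains)
for stem, members in variants.items():
    print(stem, members)
```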

I would suggest that relaxed/relaxed is the largest group because
spammers and eMarketers want the highest possible chance of surviving
a header/body integrity check.  They don't want any increased
complexity or chance of errors when it comes to C14N.

-- 
Hector Santos, CTO
http://www.santronics.com
http://santronics.blogspot.com

