ietf-dkim

Re: [ietf-dkim] Some very early implementation report details from OpenDKIM

2010-08-04 13:43:31
Yes, we could do graphing.  As I said, this is very preliminary and I just 
wanted to get it into a simple form to show what we can do so far. I have yet 
to avail myself of any tools beyond the basic SQL client.  Anyone with 
experience making SQL data extra-pretty who's also willing to put the time into 
it is welcome to contact me for access off-list.

We can look at breaking out the "l=" value and tracking signature syntax errors 
in the next release.  The current data only includes key syntax errors.

From: Tony Hansen [mailto:tony(_at_)att(_dot_)com]
Sent: Wednesday, August 04, 2010 11:32 AM
To: Murray S. Kucherawy
Cc: ietf-dkim(_at_)mipassoc(_dot_)org; opendkim-users(_at_)lists(_dot_)opendkim(_dot_)org
Subject: Re: [ietf-dkim] Some very early implementation report details from OpenDKIM

Very interesting data.

Too bad the domains with a 100% failure rate are *all* hashed.

I'm surprised that failed(body) is zero. I would have expected that to fail 
more often due to mailing list modifications.

Some possible enhancements:
 *) It would be interesting to see some of this data graphed against time.
 *) Of the l= uses, how many were l=0 vs. l=some-other-value?
 *) Can differentiation be made between syntax errors in the DNS entry and 
syntax errors in the signature?

    Tony Hansen

On 8/4/2010 2:00 PM, Murray S. Kucherawy wrote:
We've started gathering data from a few of our installations that have chosen 
to submit it to us.  With only four sources reporting, we can already see some 
interesting pieces of information.

A report is generated based on our accumulated data every half hour at 
http://www.opendkim.org/stats/report.html.

First, some explanation, as the reports are currently somewhat crude:


Each record in the database represents a single received message.

In the signature algorithm table, "0" is rsa-sha1, "1" is rsa-sha256.

In the two canonicalization tables, "0" is simple, "1" is relaxed.
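
For anyone reading the raw tables, here is a minimal sketch of decoding those numeric codes. The dictionary and function names are made up for illustration and are not part of OpenDKIM; only the code-to-name mappings come from the explanation above.

```python
# Code-to-name mappings as described in the message above.
SIGNATURE_ALGORITHMS = {0: "rsa-sha1", 1: "rsa-sha256"}
CANONICALIZATIONS = {0: "simple", 1: "relaxed"}


def describe_signature(alg_code, header_canon_code, body_canon_code):
    """Translate numeric report codes into DKIM-style a= and c= values.

    Hypothetical helper: the argument names are assumptions, not the
    column names used by the actual report database.
    """
    alg = SIGNATURE_ALGORITHMS[alg_code]
    header_canon = CANONICALIZATIONS[header_canon_code]
    body_canon = CANONICALIZATIONS[body_canon_code]
    # The c= tag is written as header/body.
    return alg, f"{header_canon}/{body_canon}"


alg, canon = describe_signature(1, 1, 0)
print(alg, canon)
```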

In the pass/fail rate tables, "failed(body)" indicates a message where "bh" 
changed between the signer and the verifier.
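
To make "bh changed" concrete, below is a rough sketch of how a body hash could be recomputed on the verifying side. It assumes "simple" body canonicalization and SHA-256, and the function names are hypothetical. It is not taken from the OpenDKIM source, and it skips "relaxed" canonicalization entirely.

```python
import base64
import hashlib


def simple_body_canon(body: bytes) -> bytes:
    """Apply "simple" body canonicalization (sketch).

    Trailing empty lines are dropped, and the body is made to end
    with a single CRLF (an empty body becomes just CRLF).
    """
    while body.endswith(b"\r\n\r\n"):
        body = body[:-2]
    if not body.endswith(b"\r\n"):
        body += b"\r\n"
    return body


def body_hash(body: bytes, length=None) -> str:
    """Return a base64 SHA-256 hash of the canonicalized body.

    `length` models the l= tag: only that many octets of the
    canonicalized body are hashed when it is given.
    """
    canon = simple_body_canon(body)
    if length is not None:
        canon = canon[:length]
    return base64.b64encode(hashlib.sha256(canon).digest()).decode("ascii")


signed = b"Hello\r\n"
relayed = b"Hello\r\n-- \r\nlist footer\r\n"

# A footer appended in transit changes the hash: a failed(body) case.
print(body_hash(signed) == body_hash(relayed))
# With l= covering only the original body, the appended footer is ignored.
print(body_hash(signed, length=7) == body_hash(relayed, length=7))
```

Under these assumptions, a signature with l=0 hashes no body content at all, so any body whatsoever would produce the same bh.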

Data submitters are given the option to anonymize their data.  This is done by 
MD5-ing the From: domain and the submitting IP address, allowing aggregation of 
data on common sources but only limited reverse-engineering of it.  This is why 
the domain names in some cases are hashes and not real data.
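
A minimal sketch of that kind of anonymization follows. Lowercasing the input, using no salt, and emitting a hex digest are all assumptions on my part; the message only says the From: domain and the submitting IP address are MD5-ed. The function name is hypothetical.

```python
import hashlib


def anonymize(value: str) -> str:
    """Return a hex MD5 digest of a domain name or IP address.

    Normalizing case and whitespace first is an assumption; it keeps
    "Example.COM" and "example.com" in the same aggregation bucket.
    """
    normalized = value.strip().lower()
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()


# The same source always maps to the same token, so records from a
# common domain or IP address can still be grouped together.
print(anonymize("Example.COM") == anonymize("example.com"))
print(anonymize("192.0.2.1"))
```

The "limited reverse-engineering" caveat shows up here too: anyone who guesses a domain can hash the guess and compare it against a reported token.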

Mailing list traffic is detected by identifying List-* header fields or a 
"Precedence: list" header field.  If people have additional ways to suggest 
identifying list traffic, please let me know.
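
The detection rule described above could be sketched roughly like this, using Python's standard email parser. The function name is hypothetical and this is not the OpenDKIM implementation, only an illustration of the two checks mentioned.

```python
from email import message_from_string


def is_list_traffic(raw_message: str) -> bool:
    """Guess whether a message passed through a mailing list.

    Returns True if any header field name starts with "List-" or if a
    "Precedence: list" header field is present (case-insensitive).
    """
    msg = message_from_string(raw_message)
    for name, value in msg.items():
        lname = name.lower()
        if lname.startswith("list-"):
            return True
        if lname == "precedence" and value.strip().lower() == "list":
            return True
    return False


sample = (
    "From: someone@example.com\n"
    "List-Id: <ietf-dkim.example.org>\n"
    "Subject: test\n"
    "\n"
    "body text\n"
)
print(is_list_traffic(sample))
```

Only header fields are inspected, so a line that looks like "List-Id:" inside the body does not count.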

ADSP "passed" currently includes things with valid author domain signatures, 
for which ADSP is actually not checked.  This will be broken out in our next 
release.

The very interesting things to note so far:


"relaxed" is the most popular header canonicalization, but I think we expected 
that.  "relaxed" is also the most popular body canonicalization, which is not 
the general advice we give, though I suspect this is skewed by the fact that 
that's what gmail.com uses.

Almost 90% of DKIM signatures survive, unless they go through lists, in which 
case the success rate plunges to 32%.

Just under half of all signed mail passes through five hops total (some of 
which may be pre-signature).

Most DKIM signatures pass as long as they go through three or fewer hops.  
After that, survivability drops dramatically.

Not a single signature has failed as a result of body changes (apart from what 
the canonicalizations tolerate).

Third-party signatures appear to have a much higher failure rate than author 
signatures.

Upcoming revisions to our collection mechanisms include:


Tracking use of "g=" in keys.

More detailed analysis of ADSP.

Tracking of DNSSEC use with respect to DKIM keys.

Ability to produce reports for each reporting site rather than only 
aggregation.  (We can do that now but because of our current schema, it's 
expensive.)

Ability to exclude anonymized data from certain reports.

When "z=" tags are used, identification of which fields are being changed in 
transit.

We need more data!  OpenDKIM users are encouraged to enable the statistics code 
and participate in the program (though, of course, you are under no obligation 
to do so).  Instructions were sent to the opendkim-users list on July 30th and 
are also available in the stats/README file in the source distribution.

Feedback from both groups is welcome.

-MSK






_______________________________________________

NOTE WELL: This list operates according to

http://mipassoc.org/dkim/ietf-list-rules.html

