ietf-asrg

[Asrg] 2. Analysis - real-time DNSBL accuracy figures

2003-11-07 11:02:33

Hi folks.

A few months back, there was a discussion of DNSBL accuracy; I posted some
SpamAssassin figures, based on our "mass-check" tests, but noted that they
were computed using *current* DNSBL contents against a corpus of *saved*
mail, so because of the time delta they were not 100% representative.

These figures are a lot better.   Since August, I've been collecting
real-time DNSBL hit data on my mail, as it is delivered at my SpamAssassin
installation.   In other words, it's *live* accuracy data -- it's using
just what the DNSBLs had listed at scan time.
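
For anyone unfamiliar with the mechanics: an IPv4 DNSBL lookup is a DNS
query for the relay's address with its octets reversed, under the list's
zone. Below is a minimal sketch of how that query name is built. The zone
"dnsbl.example.org" and the function name are made up for illustration,
and this is not SpamAssassin's own code.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to query for an IPv4 address in a DNSBL zone."""
    # Validate the input; raises ValueError if it is not an IPv4 address.
    addr = ipaddress.IPv4Address(ip)
    # Reverse the octets: 192.0.2.99 becomes 99.2.0.192
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


# Hypothetical zone and a documentation-range address, for illustration only.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

A "hit" at scan time means that this name resolved when the message was
delivered, which is why live data differs from re-checking saved mail later.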

Note, however, that it's still incomplete:

  - some DNSBLs were not measured; these are just the default DNSBL list
    in SpamAssassin 2.60, excluding RCVD_IN_NJABL_DIALUP (which I had
    to remove because I can't parse out accurate data).

  - it's only one person's hand-classified mail.

  - SpamAssassin tests more than just the "delivering" SMTP relay; it'll
    also look backwards through the headers, at earlier relays, to catch
    spam sent via mailing lists.

But the results should still be quite useful.

The time period covered:

  - from: Thu, 21 Aug 2003 17:11:30 -0700 (PDT)
  - to:   Sat, 25 Oct 2003 23:11:52 -0700 (PDT)

Recap of the fields:

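  - OVERALL% = percentage of all messages that were hit by the test
  - SPAM% = percentage of spam messages that were hit by the test
  - HAM% = percentage of ham messages that were hit by the test
  - S/O = Spam/Overall = SPAM% / (SPAM% + HAM%), roughly the Bayesian
    probability that a hit message is spam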
  - RANK = artificial ranking figure, ignore this!
  - SCORE = default SpamAssassin 2.60 score
  - NAME = name of test.  Figuring out the exact DNSBL should be pretty
    obvious ;)
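
As a rough illustration of the S/O field, here is a small sketch that
recomputes it from the SPAM% and HAM% columns of three rows in the table
below. The formula SPAM% / (SPAM% + HAM%) is an assumption that matches the
published figures; the function name is made up for this example.

```python
def spam_over_overall(spam_pct: float, ham_pct: float) -> float:
    """S/O = SPAM% / (SPAM% + HAM%); assumed formula, consistent with the table."""
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("test hit no messages, so S/O is undefined")
    return spam_pct / total


# (SPAM%, HAM%) pairs copied from the table.
rows = {
    "RCVD_IN_BL_SPAMCOP_NET": (59.0567, 0.6601),
    "RCVD_IN_DYNABLOCK": (24.8369, 1.4613),
    "RCVD_IN_BSP_OTHER": (0.0000, 0.6752),
}

for name, (spam_pct, ham_pct) in rows.items():
    print(f"{spam_over_overall(spam_pct, ham_pct):.3f}  {name}")
```

Because both inputs are per-class percentages, S/O is independent of how
much spam versus ham happens to be in the corpus.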

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
  21839     1993    19846    0.091   0.00    0.00  (all messages)
100.000   9.1259  90.8741    0.091   0.00    0.00  (all messages as %)
  5.989  59.0567   0.6601    0.989   1.00    2.25  RCVD_IN_BL_SPAMCOP_NET
  3.869  37.7822   0.4636    0.988   0.96    1.10  RCVD_IN_DSBL
  0.751   8.2288   0.0000    1.000   0.95    4.30  RCVD_IN_OPM_HTTP
  1.964  20.2709   0.1260    0.994   0.95    1.10  RCVD_IN_NJABL_PROXY
  0.659   7.1751   0.0050    0.999   0.95    0.64  RCVD_IN_NJABL_SPAM
  0.614   0.0000   0.6752    0.000   0.94   -0.10  RCVD_IN_BSP_OTHER
  0.050   0.5519   0.0000    1.000   0.94    4.30  RCVD_IN_OPM_SOCKS
  0.027   0.3011   0.0000    1.000   0.94    4.30  RCVD_IN_OPM_WINGATE
  0.119   0.0000   0.1310    0.000   0.94   -4.30  RCVD_IN_BSP_TRUSTED
  0.939   9.7341   0.0554    0.994   0.94    4.30  RCVD_IN_OPM
  1.081  10.9383   0.0907    0.992   0.93    1.52  RCVD_IN_SORBS_SOCKS
  1.062  10.7376   0.0907    0.992   0.93    1.27  RCVD_IN_SBL
  0.229   2.4084   0.0101    0.996   0.93    1.10  RCVD_IN_SORBS_MISC
  0.618   6.3221   0.0453    0.993   0.93    1.10  RCVD_IN_SORBS_HTTP
  0.595   5.9709   0.0554    0.991   0.92    4.30  RCVD_IN_OPM_HTTP_POST
  0.078   0.7526   0.0101    0.987   0.90    2.60  RCVD_IN_SORBS_ZOMBIE
  0.815   7.5263   0.1411    0.982   0.89    1.39  DNS_FROM_RFCI_DSN
  3.594  24.8369   1.4613    0.944   0.81    2.55  RCVD_IN_DYNABLOCK
  1.685  11.4400   0.7054    0.942   0.78    0.10  RCVD_IN_RFCI
  0.380   2.4586   0.1713    0.935   0.75    1.31  RCVD_IN_NJABL_RELAY
  6.182  33.9689   3.3911    0.909   0.73    0.10  RCVD_IN_NJABL
 10.422  44.4054   7.0090    0.864   0.63    0.10  RCVD_IN_SORBS
  0.037   0.1505   0.0252    0.857   0.54    2.80  RCVD_IN_SORBS_WEB
  2.344   4.1144   2.1667    0.655   0.17    0.00  RCVD_IN_SORBS_SPAM
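
If you want raw hit counts rather than percentages, they can be recovered
from the corpus sizes in the "(all messages)" row (1993 spam, 19846 ham).
This is a sketch under the assumption that SPAM% and HAM% are per-class hit
rates; the function names are made up for this example.

```python
SPAM_TOTAL = 1993   # spam messages in the corpus, from the "(all messages)" row
HAM_TOTAL = 19846   # ham messages in the corpus, from the same row


def hit_counts(spam_pct: float, ham_pct: float) -> tuple[int, int]:
    """Convert per-class hit percentages back into message counts."""
    spam_hits = round(spam_pct / 100 * SPAM_TOTAL)
    ham_hits = round(ham_pct / 100 * HAM_TOTAL)
    return spam_hits, ham_hits


def overall_pct(spam_pct: float, ham_pct: float) -> float:
    """Percentage of all messages hit, which should match the OVERALL% column."""
    spam_hits, ham_hits = hit_counts(spam_pct, ham_pct)
    return 100 * (spam_hits + ham_hits) / (SPAM_TOTAL + HAM_TOTAL)


# RCVD_IN_BL_SPAMCOP_NET: SPAM% = 59.0567, HAM% = 0.6601
print(hit_counts(59.0567, 0.6601))
print(f"{overall_pct(59.0567, 0.6601):.3f}")
```

The recomputed overall figure can be compared against the OVERALL% column
as a sanity check on any row.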

--j.

_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg



  • [Asrg] 2. Analysis - real-time DNSBL accuracy figures, Justin Mason <=