[ietf-clear] HELO length statistics

2004-12-02 05:27:40
If we augment CSV to allow recipients to search for CSA SRV records, how
many DNS queries would be required? I've analysed the last month of logs
from my MX hosts: I extracted the HELO names that were used and removed
the duplicates.
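
As a rough sketch of the kind of search being costed here (not a
specification): a recipient could strip one leading label at a time from
the HELO name and ask for an SRV record at each step. The function name,
the `_client._smtp` prefix, the choice to stop before the bare top-level
domain, and the `max_lookups` cap are all assumptions made for
illustration.

```python
def csa_query_names(helo, max_lookups=12):
    """List candidate SRV owner names for a HELO name, most specific first.

    Hypothetical helper: the prefix and the stopping rule are assumptions,
    not taken from the CSV/CSA drafts.
    """
    # Normalise: drop a trailing root dot, lowercase, ignore empty labels.
    labels = [label for label in helo.rstrip(".").lower().split(".") if label]
    names = []
    # Strip one leading label per step; never query the bare top-level domain.
    for i in range(len(labels) - 1):
        names.append("_client._smtp." + ".".join(labels[i:]))
    # Cap the amount of DNS work done for any single name.
    return names[:max_lookups]


if __name__ == "__main__":
    for name in csa_query_names("mail.example.com"):
        print(name)
```

Under these assumptions a name with n dots costs at most n queries, so
the number of dots in a HELO name is a direct measure of the worst case.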

Counting all offered messages (rejected or not), we saw 1 447 252
different HELO names. If I count the number of dots in each name, the
resulting histogram is as follows. (The first row is names with no dots;
the numbers on the right mark every fifth row.) The small end (0-2 dots)
is inflated by incompetence and forgery. The big end (>10 dots) is 99.99%
abuse.

 25765
450511 .
218188 ..
432343 ...
197647 ....
 33647 ..... 5
 28485 ......
 19790 .......
  4582 ........
  2040 .........
  3069 .......... 10
  7005 ...........
  9483 ............
  7722 .............
  4390 ..............
  1840 ............... 15
   568 ................
   150 .................
    23 ..................
     3 ...................
     1 .................... 20
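
For anyone who wants to repeat the count on their own logs, a minimal
sketch of the tally might look like this. The function names are
hypothetical, the input is assumed to be an iterable of HELO strings
already extracted from the logs, and duplicates are removed
case-sensitively.

```python
from collections import Counter


def dot_histogram(helo_names):
    """Map number of dots -> number of distinct HELO names with that many."""
    unique = set(helo_names)  # case-sensitive de-duplication (an assumption)
    return Counter(name.count(".") for name in unique)


def format_histogram(hist):
    """Render one row per dot count: right-aligned count, then the dots."""
    if not hist:
        return ""
    width = len(str(max(hist.values())))
    rows = []
    # Include empty buckets so the row position always equals the dot count.
    for dots in range(max(hist) + 1):
        rows.append(f"{hist.get(dots, 0):>{width}} {'.' * dots}".rstrip())
    return "\n".join(rows)


if __name__ == "__main__":
    sample = ["localhost", "mail.example.com", "mail.example.com", "example.org"]
    print(format_histogram(dot_histogram(sample)))
```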

Of the messages we accept, 274 902 different HELO names were used (19% of
the total). If I count the number of dots in each name, the resulting
histogram looks like this:

 5723
69182 .
84906 ..
75131 ...
26182 ....
 4723 ..... 5
 4436 ......
 2686 .......
  279 ........
  123 .........
  123 .......... 10
  317 ...........
  447 ............
  320 .............
  211 ..............
   87 ............... 15
   21 ................
    4 .................
    1 ..................

A lot of these are clearly bogus, for example 80 characters of random
words concatenated with an IP address, like

Antigone.meter.ernet.ne.jpsouthparkmail.comnetlane.comlouiskoo.comjpopmail.comtw60.186.213.104

or a random collection of concatenated domain names, like

cave.ngs.ouse.hello.nlsammail.compcmail.com.twsouthparkmail.com

(These should obviously be added to my HELO heuristics!) After removing
them, there are 272 890 HELO names. If I count the number of dots in each
name, the resulting histogram looks like this:

 5723
69182 .
84905 ..
75130 ...
26176 ....
 4688 ..... 5
 4334 ......
 2521 .......
  179 ........
   47 .........
    0 .......... 10
    2 ...........
    3 ............
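
One way such names might be caught mechanically is sketched below. This
is not the filter used to produce the figures above; it is a hypothetical
heuristic with two tests chosen to match the two examples quoted earlier:
letters running straight into a trailing dotted quad, and com, net or org
appearing as a whole label more than once. Both tests are assumptions and
will misfire on some legitimate names.

```python
import re

# Letters fused to a trailing IPv4 address, as in "...comtw60.186.213.104".
FUSED_IPV4 = re.compile(r"[A-Za-z]\d{1,3}(?:\.\d{1,3}){3}$")

GENERIC_TLDS = {"com", "net", "org"}


def looks_concatenated(helo):
    """Guess whether a HELO name is several domain names glued together."""
    if FUSED_IPV4.search(helo):
        return True
    labels = helo.lower().rstrip(".").split(".")
    # A generic TLD showing up as a whole label twice suggests two names
    # joined end to end, e.g. "...compcmail.com.twsouthparkmail.com".
    tld_labels = sum(1 for label in labels if label in GENERIC_TLDS)
    return tld_labels >= 2


if __name__ == "__main__":
    print(looks_concatenated("mail.example.com"))
```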

This still includes various stupidities. 26 631 of the 37 272 single-dot
domains ending in com|net|org have no name servers, so they are invalid.
Of the unfiltered list, 208 323 of the 288 884 com|net|org domains are
invalid.

So to conclude, it looks like real names never need more than 12 lookups.
(Many of the longer ones are DSL hosts, so another useful HELO rejection
heuristic might be to reject names with 8 or more dots.) For those
interested, the longest 52 of the remaining names (those with 9 or more
dots) are as follows.

0.1.5.2.8.6.9.5.1.4.1.tpc.int
0.8.8.5.3.4.5.2.2.5.8.tpc.int
1.1.3.3.3.7.5.2.2.5.8.tpc.int
200-127-113-221.dsl.prima.net.ar.113.127.200.in-addr.arpa
200-127-115-63.dsl.prima.net.ar.115.127.200.in-addr.arpa
200-127-116-156.dsl.prima.net.ar.116.127.200.in-addr.arpa
200-127-116-215.dsl.prima.net.ar.116.127.200.in-addr.arpa
200-127-116-224.dsl.prima.net.ar.116.127.200.in-addr.arpa
200-127-116-40.dsl.prima.net.ar.116.127.200.in-addr.arpa
200-127-116-6.dsl.prima.net.ar.116.127.200.in-addr.arpa
200-127-117-163.dsl.prima.net.ar.117.127.200.in-addr.arpa
200-127-117-23.dsl.prima.net.ar.117.127.200.in-addr.arpa
200-127-117-251.dsl.prima.net.ar.117.127.200.in-addr.arpa
200-127-120-122.dsl.prima.net.ar.120.127.200.in-addr.arpa
200-127-122-187.dsl.prima.net.ar.122.127.200.in-addr.arpa
200-127-122-83.dsl.prima.net.ar.122.127.200.in-addr.arpa
200-127-123-103.dsl.prima.net.ar.123.127.200.in-addr.arpa
200-127-123-106.dsl.prima.net.ar.123.127.200.in-addr.arpa
200-127-123-49.dsl.prima.net.ar.123.127.200.in-addr.arpa
200-127-124-1.dsl.prima.net.ar.124.127.200.in-addr.arpa
200-127-124-198.dsl.prima.net.ar.124.127.200.in-addr.arpa
200-127-125-247.dsl.prima.net.ar.125.127.200.in-addr.arpa
200-127-126-27.dsl.prima.net.ar.126.127.200.in-addr.arpa
200-127-127-239.dsl.prima.net.ar.127.127.200.in-addr.arpa
200-127-127-73.dsl.prima.net.ar.127.127.200.in-addr.arpa
228-121.dothan.cable.graceba.net.227.203.66.in-addr.arpa
228-42.dothan.cable.graceba.net.227.203.66.in-addr.arpa
2315.bsb.virtua.com.br.231.167.200.in-addr.arpa
66.169.100.122.ts46v-01.mhe1.ftwrth.tx.charter.com
66.169.120.66.ts46v-05.otnb1.ftwrth.tx.charter.com
66.169.192.38.ts46v-19.pkcty.ftwrth.tx.charter.com
66.169.198.29.ts46v-19.pkcty.ftwrth.tx.charter.com
66.190.65.164.ts46v-01.mhe1.ftwrth.tx.charter.com
68.113.217.52.ts46v-11.otne1.ftwrth.tx.charter.com
68.184.184.90.ts46v-20.wthfrd.ftwrth.tx.charter.com
68.187.44.59.ts46v-02.mhe2.ftwrth.tx.charter.com
68.187.45.30.ts46v-03.otna1.ftwrth.tx.charter.com
68.187.45.54.ts46v-03.otna1.ftwrth.tx.charter.com
Alameda.net.has.not.owned.this.ip.for.more.then.four.years
alameda.net.has.not.owned.this.ip.for.more.then.four.years
functions.65.s65.apx1.trpr.pa.dialup.rcn.comcocolee.net
ip.82.144.200.247.dyn.pool-2.broadband.voliacable.com
ip.82.144.201.10.dyn.pool-2.broadband.voliacable.com
ip.82.144.202.1.dyn.pool-2.broadband.voliacable.com
ip.82.144.204.102.dyn.pool-2.broadband.voliacable.com
ip.82.144.204.175.dyn.pool-2.broadband.voliacable.com
ip.82.144.204.6.dyn.pool-2.broadband.voliacable.com
ip.82.144.206.216.dyn.pool-2.broadband.voliacable.com
ip.82.144.207.238.dyn.pool-2.broadband.voliacable.com
ip.82.144.213.29.dyn.pool-1.broadband.voliacable.com
server3.dtr.com.br.0-63.15.245.200.in-addr.arpa
wbar16.dal1-4.14.167.168.dal1.elnk.dsl.genuity.net

Tony.
-- 
f.a.n.finch  <dot(_at_)dotat(_dot_)at>  http://dotat.at/
MALIN HEBRIDES: NORTHEAST 4 OR 5 INCREASING 6. RAIN LATER. GOOD BECOMING
MODERATE.