Brad Knowles wrote:
At 11:56 AM -0400 2003/08/13, Paul Judge wrote:
2. As you mentioned, with blacklists you need the list of IP addresses. The
problem is that the list of IP addresses in the headers will often include
IPs of internal mail servers that organizations do not wish to reveal. So,
you often have to reduce this to the set of IP addresses that come before
the recipient's organization in order to make this data public.
For larger organizations, you may pass through multiple different
network blocks. I submit that it won't be programmatically possible to
detect and eliminate all of them. IMO, the best you can hope to do is
to avoid the last hop in the "Received:" headers, and anything else on
that same network.
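For illustration, a minimal sketch of that fallback heuristic, assuming the
hop IPs have already been pulled from the "Received:" headers in top-to-bottom
order (most recent hop first). The function name and the /24 definition of
"same network" are assumptions made for this example, not anything stated in
the thread.

```python
import ipaddress

def drop_last_hop_network(hop_ips, prefix_len=24):
    """Drop the most recent hop and every hop on the same network.

    hop_ips: IPv4 strings, most recent "Received:" hop first.
    prefix_len: assumed size of "the same network" (hypothetical default).
    """
    if not hop_ips:
        return []
    # Network containing the last hop (the topmost Received line).
    local_net = ipaddress.ip_network(f"{hop_ips[0]}/{prefix_len}", strict=False)
    # Keep only hops that fall outside that network.
    return [ip for ip in hop_ips if ipaddress.ip_address(ip) not in local_net]
```

Given hops 10.1.2.3, 10.1.2.9, 172.16.0.4 and 192.0.2.10, this keeps
172.16.0.4 and 192.0.2.10. If 172.16.0.4 is also an internal relay on a
different block, it still leaks, which is the limitation described above.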
A person in a position to deliver a large HAM corpus should be able to
do MUCH better than that. For example, have a listing of your inward
email gateways, and programmatically pull out just the IPs of the
machines talking to your inward gateways (source IPs in the Received
lines inserted by the gateways).
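A rough sketch of that approach, assuming the gateways write a conventional
"from ... [IP] ... by gateway-name" Received line. The gateway hostnames, the
function name, and the regular expressions are placeholders for illustration.
Real Received lines vary by MTA, so the patterns would need adjusting.

```python
import re
from email import message_from_string

# Hypothetical names of the inward gateways; substitute your own listing.
INWARD_GATEWAYS = {"mx1.example.com", "mx2.example.com"}

# Host named in the "by" clause of a Received line.
BY_HOST = re.compile(r"\bby\s+([^\s;()]+)", re.IGNORECASE)
# Bracketed IPv4 literal, as many MTAs record the connecting peer.
PEER_IP = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

def gateway_peer_ips(raw_message, gateways=INWARD_GATEWAYS):
    """Return source IPs from Received lines inserted by our own gateways."""
    msg = message_from_string(raw_message)
    peers = []
    for received in msg.get_all("Received", []):
        line = " ".join(received.split())  # unfold continuation lines
        by = BY_HOST.search(line)
        if not by or by.group(1).lower() not in gateways:
            continue  # this line was not written by one of our gateways
        # Only the part before "by" describes the connecting peer.
        ip = PEER_IP.search(line[:by.start()])
        if ip:
            peers.append(ip.group(1))
    return peers
```

Every other Received line is ignored, so internal relays behind the gateways
and hops upstream of the connecting peer never reach the published data.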
We have it really easy. Our inward gateways insert an X-SMTP-PEER-INFO
header with the peer's IP (and rDNS if available).
[In case it's not obvious, I'm not a fan of having inward filters block
on previous Received lines. Just on the address of the machine talking
to us.]
As a much bigger issue, the mangling that MOST client software does on
forwards makes any sort of analysis on email that's been through a
client very difficult. We do virtually no analysis of human-forwarded
email (other than looking for the X-SMTP-PEER-INFO header if they've
managed to include it). We tried more. Gave up.
_______________________________________________
Asrg mailing list
Asrg(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/asrg