Re: DKIM: Canonicalization


----- Original Message ----

Ok, I'm just kind of throwing this idea out.  It has been well over a
year since I last looked into the hashing systems used by DCC/Razor
and I have not seen any of the design work that went into the nowsp
method, so maybe this has already been done.


Though I don't know for sure, I'd guess that both of these schemes try to 
ignore message body text that is not human-readable.  If they don't, then they 
probably lost their effectiveness in 2002 or 2003.  Spammers have spent a lot 
of time thinking up clever ways to insert hashbusting goop into a message 
without alerting the user:
 - text in zero point font
 - white font on white background
 - $color font on $color background
 - $color font on $similar_color background
 - text in invalid html tags
 - text in html comments
 - text in the alt attribute of a 1x1 invisible gif
 etc.  I'd also guess that these algorithms do some sort of 
sub-document-level processing, a la shingling or fuzzy hashing, where only 
parts of a document are chosen for matching.
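
To make the shingling idea concrete, here is a rough Python sketch.  It is 
only my illustration and not how DCC, Razor or nowsp actually work.  The 
function names, the 3-word shingle size and the regex-based stripping are all 
assumptions.  It handles only the comment and tag tricks from the list above; 
the font and color tricks would need real HTML/CSS rendering to detect.

```python
import re


def visible_words(body: str) -> list[str]:
    """Drop HTML comments and tags, then split into lowercase words.

    Hypothetical helper: a crude stand-in for "ignore text the reader
    never sees". It does not detect zero-point or same-color text.
    """
    # Remove hidden text placed inside HTML comments.
    body = re.sub(r"<!--.*?-->", " ", body, flags=re.DOTALL)
    # Remove tags, which also discards attribute values such as alt text.
    body = re.sub(r"<[^>]*>", " ", body)
    return re.findall(r"[a-z0-9]+", body.lower())


def shingles(words: list[str], k: int = 3) -> set[tuple[str, ...]]:
    """Return the set of overlapping k-word shingles (k=3 is an assumption)."""
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def resemblance(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two bodies' shingle sets, from 0.0 to 1.0."""
    sa = shingles(visible_words(a), k)
    sb = shingles(visible_words(b), k)
    if not sa and not sb:
        return 1.0  # two empty bodies are treated as identical
    return len(sa & sb) / len(sa | sb)
```

Two copies of a message that differ only in random text hidden in an HTML 
comment come out with the same shingle set, so a match on shingles survives 
that kind of hashbusting where a hash over the raw body would not.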
 
 I do think it is clever to consider re-using proven technology here, though, 
unless there is going to be a real test of different canonicalization schemes.
 
 miles

Re: DKIM: Canonicalization, Miles Libbey <=