procmail

Re: formail -D & using hashcodes instead?

1996-05-19 11:36:33

-----BEGIN PGP SIGNED MESSAGE-----

From: turnbull(_at_)turnbull(_dot_)sk(_dot_)tsukuba(_dot_)ac(_dot_)jp (Stephen J. Turnbull)
Date: Sun  May 19, 10:35pm

"Karl" == Karl E Vogel <vogelke(_at_)c17mis(_dot_)wpafb(_dot_)af(_dot_)mil> writes:

    >>> Robert <dummy(_at_)c2(_dot_)org> writes:
    R> Has anyone considered the thought of using a hash function on
    R> the body of a message instead of merely using the Message-ID
    R> field for "formail -D"?

    Karl>    A few years ago, the same folks who wrote "agrep" and
    Karl> "glimpse" presented a Usenix paper on a program called "sif"
    Karl> (similar files).

Robert did say "identical", so maybe if he can be more specific about
what "identical" means we could come up with a suggestion more
efficient than this general-purpose sif.  (A better name would be
"sift", "SImilar File Tagger" ;-)

I suggest that Robert arrange that his spam-hunting recipe
automatically generate a complaint to the correspondents who are
consistently sending repeats, and get them to arrange a mailing list
instead of sending dupes ... that would make the computation worth it!

By "identical" I mean "exactly the same, character for character in the
body of the message".  That is, a "cmp" of files containing the bodies
would return status of 0.
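
That definition maps directly onto a digest comparison: two bodies that
"cmp" equal always produce the same digest, and with a cryptographic
hash two different bodies practically never do.  A minimal Python
sketch of the idea follows, assuming the message arrives as raw bytes
with bare-LF line endings and the body starts after the first blank
line.  The names body_of and is_duplicate and the cache file are
hypothetical, invented for illustration; none of this is part of
formail, and the cache here grows without bound.

```python
import hashlib
import os


def body_of(raw: bytes) -> bytes:
    """Return everything after the first blank line (header/body separator).

    Assumption: bare-LF line endings, so the separator is b"\\n\\n".
    """
    _headers, sep, body = raw.partition(b"\n\n")
    return body if sep else b""


def is_duplicate(raw: bytes, cache_path: str) -> bool:
    """True if a message with a byte-identical body was seen before.

    The cache is a plain text file with one hex digest per line
    (a hypothetical format, not the one formail uses).
    """
    digest = hashlib.sha256(body_of(raw)).hexdigest()

    seen = set()
    if os.path.exists(cache_path):
        with open(cache_path) as fh:
            seen = {line.strip() for line in fh}

    if digest in seen:
        return True

    # First sighting: remember the digest for later messages.
    with open(cache_path, "a") as fh:
        fh.write(digest + "\n")
    return False
```

Two copies of a posting that differ only in their headers hash to the
same digest, so the second one is reported as a duplicate; changing a
single character of the body makes it count as a new message.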

And it isn't spam specifically that I'm trying to filter out.  It
happens that whoever maintains the CPSR (Computer Professionals for
Social Responsibility) mailing lists sends out the same message to two
different mailing lists.  It's just a waste having to wade through messages
that are precisely the same, except for the headers.

Regardless, I'll try to look for this "sif" thing.

- -- 
Internet: dummy(_at_)c2(_dot_)org                   In real life: Robert Brown,
URL: http://www.c2.org/~dummy                          in sunny Berkeley, CA
            >> Embargo China, not Cuba. <<             waiting for The Big One
  >> Ignorance isn't bliss -- it's good business. <<   (510) 464-4604


-----BEGIN PGP SIGNATURE-----
Version: 2.6.2

iQB1AwUBMZ9TADZMMbnCKCB9AQHjIAMAh3pXlyAZihUd8PLSKR02x7FAtlPwFlqx
EpTJRdQ8jerg7j2pBJe4h7q9n4G5L2qCt49FCNzzxGzlbgvfmi9nWyNvs4rUbkxD
2GpWVj6kYrkxf5kgJ6TuChdRI4A/hc5y
=TtuG
-----END PGP SIGNATURE-----
