ietf-dkim

Re: [ietf-dkim] Bayesian filters are the pits

2006-08-25 00:49:11

----- Original Message -----
From: "Dave Crocker" <dhc(_at_)dcrocker(_dot_)net>
To: "J.D. Falk" <jdfalk(_at_)yahoo-inc(_dot_)com>

J.D. Falk wrote:
If anyone can turn this into a short (but still accurate) slogan, I'll
arrange for a small run of t-shirts.



      DKIM:

      Signature == Good/Bad
      Signature + Assessment == Good


Dave, I had to be anal on this, but unless you are saying DKIM IS NOT the
above, I think you might want to rethink this one :-)

If you want the t-shirt to say the above, it implies that "assessment"
separates the bad from the total population of good and bad, which is what
I've been advocating with SSP.

Proof:

    signature = {good, bad}
    signature + assessment = {good}

Resolving for assessment:

    assessment = {good} - signature
    assessment = {good} - {good,bad}
    assessment = {good} - {good} - {bad}
    assessment = -{bad}

which implies that your idea for Assessment focuses on subtracting the bad
from the total population of good and bad, and what remains is the good. <g>

So unless you have a database of the bad, you won't know what is bad in
order to filter it.  You are left with a deterministic method of detecting
the bad which is what SSP is about.
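
As a rough sketch of the subtraction model above, here is the same idea with
Python sets. The sender names and the assess_by_subtraction helper are
hypothetical, made up for illustration only; nothing here is taken from the
DKIM or SSP drafts.

```python
# Hypothetical sender labels; illustration only, not from DKIM or SSP.
signed = {"good.example", "bad.example"}   # signature = {good, bad}
known_bad = {"bad.example"}                # the "database of the bad"


def assess_by_subtraction(population, bad_db):
    """Return what remains after removing every known-bad sender."""
    return population - bad_db


# With a database of the bad, only the good remains.
accepted = assess_by_subtraction(signed, known_bad)

# Without one, nothing is filtered and the whole population passes.
unfiltered = assess_by_subtraction(signed, set())
```

The second call is the point: with an empty bad database the subtraction
removes nothing, so the filter is only as good as what it knows to be bad.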

If you want your assessment to focus on looking for the good, then your
model implies that "Bad is the anti-good" or "Good is the opposite of bad" or
"Good is the negation of the bad", modeled as:

    {good} = -{bad}   <--- Good is the anti-bad!

Proof:

    signature = {good, bad}
    signature + assessment = {good}

Assume:

    Assessment = {good}   <--- Database of the good!

Substitute:

    {good, bad} + {good}  = {good}

Therefore, in order for this to be correct, you must have:

    {good} = -{bad}   <--- Good is the anti-bad!

Which says that if you are not in the GOOD database, then you must be a BAD
person!
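
A minimal sketch of what that reading amounts to in practice, assuming the
assessment is nothing more than a lookup in a database of the good. The
classify function and the sender names are hypothetical.

```python
# Hypothetical "database of the good"; illustration only.
good_db = {"good.example"}


def classify(sender, good_senders):
    """Good-list-only assessment: whoever is not listed is treated as bad."""
    return "good" if sender in good_senders else "bad"


listed = classify("good.example", good_db)
unknown = classify("unknown.example", good_db)
```

Here an unknown sender gets the same verdict as a known-bad one, simply
because it is absent from the good list.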

Of course, I don't believe that is really what you seek. So
alternatively, to make any sense of it, it could be modeled as such:

    {good, bad} + {good}  = {good*2, bad}

which says that Assessment is using a database of the good in order to add
weight to the population of what is acceptable.

The problem?

You still are passing on the {bad} population.  Your idea for assessment has
not done anything to address the bad.
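
The additive model can be sketched with a multiset, here Python's
collections.Counter. The labels "good" and "bad" stand in for the two
populations; the counts are illustrative, not measured.

```python
from collections import Counter

signature = Counter({"good": 1, "bad": 1})   # signature = {good, bad}
good_db = Counter({"good": 1})               # Assessment = {good}

# Adding the good database doubles the weight on good...
combined = signature + good_db               # {good*2, bad}

# ...but the bad population is still present with its original weight.
bad_weight = combined["bad"]
```

The good gets extra weight, yet bad_weight is unchanged, which is the
problem described above.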

Hence, you can view this as a simple neural net:

With a weight of 1.5 you have an AND gate:

    {good, bad} AND {good}  = {good}

With a weight of .5 you have an OR gate:

    {good, bad} OR {good}  = {good*2, bad}

Which, of course, implies that the more information you have on the good, the
more you can use an AND condition to eliminate the noise (bad).   But with less
information, you would need to use an OR condition in order to help build up
the weights.
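
One way to make the two gates concrete, assuming the 1.5 and .5 figures act
as the firing threshold of a single two-input unit whose inputs each carry a
weight of 1. That reading, and the names threshold_unit and messages, are
assumptions for illustration.

```python
def threshold_unit(is_signed, in_good_db, threshold):
    """Fire when the sum of two unit-weight inputs exceeds the threshold."""
    return (int(is_signed) + int(in_good_db)) > threshold


AND_THRESHOLD = 1.5   # both inputs must be on to fire
OR_THRESHOLD = 0.5    # either input is enough to fire

# Each message: (carries a signature, sender is in the good database)
messages = {"good": (True, True), "bad": (True, False)}

and_result = {name for name, (s, g) in messages.items()
              if threshold_unit(s, g, AND_THRESHOLD)}
or_result = {name for name, (s, g) in messages.items()
             if threshold_unit(s, g, OR_THRESHOLD)}
```

Under these assumptions the AND setting passes only the good, while the OR
setting passes both, since a signature alone is enough to fire.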


--
Hector Santos, Santronics Software, Inc.
http://www.santronics.com








_______________________________________________
NOTE WELL: This list operates according to 
http://mipassoc.org/dkim/ietf-list-rules.html