procmail

Re: Score and _AND_

2002-10-08 15:18:08
From: Udi Mottelo <uuddii(_at_)eng(_dot_)tau(_dot_)ac(_dot_)il>

      I'm just wondering:  Suppose one wants to score three words and
      also wants to be sure that all three words exist in the text.
      The only way that I can see is:

:0 B
* word1
* word2
* word3
{
      :0 Bfb
      * 1^1 word1
      * 1^1 word2
      * 1^1 word3
      | /do/something/with $=
}

      Now we can be sure that  $= >= 3  and that every word appears
      at least once, i.e. if  $= == 5  then word1 cannot have
      appeared more than 3 times.

      There is a big deficiency in this recipe - procmail passes over
      the data twice.  Any ideas?

Let me get this straight: she wants the exact count of each word and
to ensure that each is there at least once.  Hmm.  Okay, here's an idea:

        :0 B
        * word1
        * word2
        * word3
        * 1^1 (word1|word2|word3)
        { total = $= }

But I know what you're saying: procmail still looks twice, even
though it's now only one recipe.  Okay, how about assigning to
each of the words different incremental spreads?  Is there
an anticipated maximum number of times the words will appear?
Say, under 10 times each?  If under ten, you could assign increments
of 1 to word1, 10 to word2, and 100 to word3.  A final score of
564 then means word1 appeared four times, word2 six times, and word3
five.  Or, to leave room for larger counts, increment word1 by 1,
word2 by 1000, and word3 by 100000.
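
A rough, untested sketch of the spread idea with increments of 1, 10,
and 100.  It assumes each word appears fewer than ten times in the
body, so no digit carries into the next one.  The variable name
`total` and the nested digit check are illustrative only; the action
line is borrowed from the original recipe.

```
:0 B
* 1^1 word1
* 10^1 word2
* 100^1 word3
{
        # Save the score now; the next recipe's conditions reset $=.
        total=$=

        # Three digits, none of them zero: every word matched
        # between one and nine times.  Hundreds digit = word3,
        # tens digit = word2, units digit = word1.
        :0
        * total ?? ^^[1-9][1-9][1-9]^^
        | /do/something/with $total
}
```

This scans the body only once.  If any word is missing, its digit is
zero (or the score has fewer than three digits) and the inner
condition fails.  A tenth occurrence of any word would break the digit
check, so widen the spread if higher counts are possible.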

-- 
dman

