procmail

count of words in big letters?

1997-12-17 00:44:03

    Hi,

    I'm trying to count the words written in capital letters in the
    message body, but I can't get the scoring recipe right. Say I
    tolerate 3 capitalized words; if there are more, I consider the
    message UBE. The regexp should ignore some words, like
    SMTP, AM, IP, and base64-encoded lines.

    I started with a simple word count, but it doesn't work.
    The regexp is supposed to
    - start at a word boundary
    - have at least 3 capital letters
    - end with a trailing space


    max = 3

    #   Count capitalized words
    :0 D
    *$      -$max^0
    *$ B ?? 1^0 ()\<[A-Z][A-Z][A-Z]+[ ]
    {
        count       = $=
        dummy       = "$count capitalized words"
    }
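
    A possible reason the count fails, sketched from the scoring rules
    described in procmailsc(5) and not tested against the example body:
    the weight normally comes before "B ??", and an exponent of 0 scores
    only the first match, so 1^1 is needed to add 1 for every match.
    The variable name "excess" is made up for illustration.

```
max = 3

#   Count all-uppercase words of three or more letters in the body.
#   D keeps the match case-sensitive.
#   -$max^0 subtracts the tolerated number once.
#   1^1 adds 1 for each match (1^0 would add 1 for the first match only).
:0 D
*$ -$max^0
*  1^1 B ?? ()\<[A-Z][A-Z][A-Z]+[ ]
{
    #   Entered only when the total score is positive.
    #   $= is the final score: matches minus $max, not the raw count.
    excess = $=
    dummy  = "$excess capitalized words over the limit"
}
```

    Since \< matches the character before the word and [ ] the one
    after it, consecutive capitalized words that share a single space
    may be undercounted; this is an assumption about how successive
    matches are located, so check the log before relying on the number.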


body example:


------ =_NextPart_000_01BD0A3F.2D0B88F0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

-----Alkuper=E4inen viesti-----
L=E4hett=E4j=E4:    Jari Aalto 
[SMTP:jari(_dot_)aalto(_at_)ntc(_dot_)nokia(_dot_)com]
L=E4hetetty:    Tuesday, December 16, 1997 11:04 AM
Vastaanottaja:  xx xx
Aihe:   xxx

TAMAN PAIVAN OSALTA ALKAA PROJEKTITEHTAILU OLLA VAIHTEEKSI KASASSA. =
txt txt txt ...

------ =_NextPart_000_01BD0A3F.2D0B88F0
Content-Type: application/ms-word
Content-Transfer-Encoding: base64

eJ8+IgkOAQaQCAAEAAAAAAABAAEAAQeQBgAIAAAA5AQAAAAAAADoAAEIgAcAGAAAAElQTS5NaWNy
b3NvZnQgTWFpbC5Ob3RlADEIAQ2ABAACAAAAAgACAAEEkAYAyAEAAAEAAAAQAAAAAwAAMAIAAAAL
