procmail

Undesirable characters in subject text

2003-05-27 11:58:41

        I've been getting several spam messages a day that seem
to have some kind of embedded linefeed in the subject line.  This
has become a problem because I now record this data not only in
a detailed log file but also in a text file for transfer to a
database for analysis.

        The line breaks *sometimes* but not always mess up the
log format, but they definitely mess up the single-line CSV
records in my text file.  The subject lines tend to look innocent
enough, like:   Subject: * * You're Approved. * *
but the first two asterisks in this case apparently become
something else when they are written out to a file.

        I prefer to stay within procmail rather than shell out,
and so have been trying recipes set up like these:

:0
* $ ^Subject: ${WHITESPACE}\/.*[-a-z0-9$!?%.]+
{ SUBJECT=$MATCH }

:0
* SUBJECT ?? ()\/[ !#-_a-~]+
{ DBASE_SUBJECT = $MATCH }

        I tried other approaches with little success, and now I
suspect the #-_ and a-~ ranges in the character class are not
working the way I thought they would.
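
        To check which bytes those two ranges cover, independent of
procmail, the same set can be applied to a test string from the
shell.  This is only a sketch: it assumes a POSIX shell and tr in
the C locale, keep_class is a hypothetical helper name, and tr set
syntax is not procmail regex syntax, so it illustrates the ASCII
ranges rather than procmail's matching.

```shell
#!/bin/sh
# Hypothetical helper: keep only space, '!', '#'..'_' and 'a'..'~',
# the same byte ranges as the [ !#-_a-~] class in the recipe above.
# Everything else (control characters, '"', '`', 8-bit bytes) is deleted.
keep_class() {
    LC_ALL=C tr -cd ' !#-_a-~'
}

# Sample subject with an embedded carriage return, linefeed and double quotes.
printf 'You\rre "Approved"\n now' | keep_class
```

        If the ranges behave as intended, the carriage return, the
linefeed and both double quotes are dropped and the remaining text
comes out on a single line.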

        I use this to write out an assembled dbase-style record:
QMARK = '"'
TEMP = "${QMARK},${QMARK}"
DBASE_REC = ${QMARK}${UNIQUE_ID}\
        <snip snip snip>
${TEMP}${DBASE_SUBJECT}\
${QMARK}
:0 hic:
| echo  ${DBASE_REC} >> ${DBASE_FILE}

        The assembly and writing work well with the exception of
the strange linefeed.  Does anyone have a reasonable way within
procmail to strip out these characters or, alternatively, to
capture only up to the "bad" characters for my DBASE_SUBJECT?

        A haunting thought: could the asterisks be perfectly
legal but trigger some strange behavior when I echo the record
to the file?
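
        One way to test that suspicion outside procmail is the
sketch below.  It assumes the action line is run by /bin/sh
(because it contains >>) and that the shell, not procmail, expands
the variable; both points are assumptions to verify against the
procmailrc man page.  The scratch directory, the file names and the
sample SUBJECT value are made up for the test.

```shell
#!/bin/sh
# Work in a scratch directory so the glob has known files to match.
dir=$(mktemp -d) && cd "$dir" || exit 1
touch file_a file_b

SUBJECT='* * Approved. * *'

# Unquoted: the shell splits the value into words and expands each '*'
# to the names of the files in the current directory.
unquoted=$(echo ${SUBJECT})

# Quoted: the value is passed to echo as one word, asterisks intact.
quoted=$(echo "${SUBJECT}")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

        If the unquoted line shows file names where the asterisks
were, quoting the variable in the echo action would be the first
thing to try.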

        TIA,

        - Don

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail
