procmail

Re: Undesirable characters in subject text

2003-05-29 10:52:27
On Tue, 27 May 2003 18:43:59 -0700, PSE-L(_at_)mail(_dot_)professional(_dot_)org
(Professional Software Engineering) wrote:
=> The implication was that if your problem was unix wildcards in the string, 
=> doublequoting it would resolve the specific problem you were reporting:

        I think I missed the "implied" part the first time I read
it.  It was, in fact, the solution to my problem.
 
=> Perhaps offering up that tidbit misled you to believe I 
=> was saying removal of the echo-provided linefeed was my proffered solution 
=> to your problem, which isn't the case - the doublequotes are.

        Indeed.

=> If this were something big, in a production environment, the wise 
=> programmer might consider simply writing a daemon which uses a named 
=> pipe.

        Well, here I would get lost with this "named pipe" stuff.
I'd probably name it Erik the Pipe, but then perhaps it should be
Erik the Pike, which would then of course probably be an alias
for Erik the Fish, and then I'd have to program it in Python ...
<sigh> I told you I'd get lost.

=>  This would resolve several things: procmail wouldn't invoke anything 
=> - it'd simply output to a file (which, because it is a named pipe, would 
=> never actually be written to disk, though your lockfile would be); no 
=> commandline invocation of anything, so no issues with wildcard, hibit, or 
=> control characters; no shell invocations; no additional process being fired 
=> up each time a message comes through (beyond all the processes already 
=> invoked as a matter of course in your system) - the daemon monitoring the 
=> named pipe is simply sitting there always running.  Since you'd have a 
=> lockfile on its input, only one process at a time would be writing to it, 
=> so it handily keeps the input serialized.

        You know, I think I actually understood that. However,
daemons are way beyond me at this point. My devil is much closer
in the details.
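
        For later reference, the reading side of the named-pipe idea
quoted above might look roughly like the sketch below. It is only an
illustration under assumptions: the function name drain_fifo and the
paths in the comments are hypothetical, and a real daemon would also
need startup, logging, and error handling.

```shell
#!/bin/sh
# Hypothetical reader for the named-pipe approach: append every line
# that arrives on a FIFO to a spool file.
drain_fifo() {
    fifo=$1
    spool=$2
    # Opening the FIFO for reading blocks until a writer opens it.
    # The loop ends when the last writer closes its end (EOF).
    while IFS= read -r line; do
        printf '%s\n' "$line" >> "$spool"
    done < "$fifo"
}

# A long-running reader would create the FIFO once and then reopen it
# forever (paths below are placeholders, not taken from the thread):
#   mkfifo /path/to/subjects.fifo
#   while :; do drain_fifo /path/to/subjects.fifo /path/to/dbase; done
```

        On the procmail side the recipe would then just deliver to the
FIFO with a local lockfile, so that writers stay serialized as
described in the quoted text.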
 
=> A certain beauty exists with the named pipe:
        <snip>
=> The only real risk with a named pipe is
        <snip>
=> Just as a regular file would, the named pipe actually has 
        <snip>

        Thanks, I've saved all that for later.
 
=> Well, then you're going to need to meet us halfway and at a minimum, 
=> present the character class which you either want to retain, or which you 
=> want to discard or translate.

        I thought I had, sorry.

=> To delete, say, all control chars, and all hibit chars:
=> | echo "$SUBJECT" | tr -d "[\000-\037][\200-\377]" >> somefile
=> Or, to translate the characters to some other symbol:
=> | echo "$SUBJECT" | tr "[\000-\037][\200-\377]" "%" >> somefile

        Thank you. Being *nix impaired is a problem sometimes and
code suggestions really help point the way to continue looking.
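
        A note on the quoted tr commands for anyone adapting them: in a
POSIX tr the square brackets are literal members of the set, so "[" and
"]" would be deleted or translated as well, and the range \000-\037
includes the linefeed (\012) that echo appends, so records would run
together in the output file. A hedged variant, with the hypothetical
function names sanitize_subject and mask_subject, might be:

```shell
#!/bin/sh
# Hypothetical helpers; the byte ranges are the ones from the quoted
# suggestion (control characters and high-bit characters).

# Delete control and high-bit bytes, then emit exactly one newline.
sanitize_subject() {
    # printf '%s' avoids echo's trailing newline and its varying
    # treatment of backslashes; LC_ALL=C makes tr operate on bytes.
    printf '%s' "$1" | LC_ALL=C tr -d '\000-\037\200-\377'
    printf '\n'
}

# Replace the same bytes with "%" instead of deleting them.
mask_subject() {
    # '[%*]' is the POSIX way to repeat "%" to the length of the
    # first set, rather than relying on implicit padding.
    printf '%s' "$1" | LC_ALL=C tr '\000-\037\200-\377' '[%*]'
    printf '\n'
}
```

        In a procmail action line this could be called as, for example,
sanitize_subject "$SUBJECT" >> somefile, assuming the functions are
defined in a script that the recipe invokes.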

=> If this invocation actually proves to be too much of a burden on your host, 
=> a CPU upgrade may be in order.

        Not an option at this point. I'm stuck with the iron I've
already got in place.

=> Ah, the "dbase" file you refer to isn't the live database, it is merely an 
=> interim file?

        Yup, just a "mailbox" for another process which runs by
schedule to process whatever it finds there.  This reduces the
overhead of opening/closing/reading/writing to the actual MySQL
database. Once every 15 or 30 mins or whatever is just fine and
allows me to do other stuff then to help with the overall
analysis.

=> Why isn't the data submitted directly to the live SQL 
=> db?

        Seemed a better design to me to unlink the db update from
the procmail process. I think I'll stay with that design now that
the surrounding extra quotes fixed my problem. It feels good as
it's currently working. I like simple.

        I must say that once I decided to extract everything into
variables at the start and then just work with variables as
you've suggested/done, everything else fell into place much
better.  Also allowed me to write it all in sub-routine-ized code
shared by a number of domains, all of which have slightly
different "rules" set by switch type variables.

        BTW: looking at the first "rushes" from my dbase files
has given me a much better look at what that torrent of spam
actually looks like and where it comes from. Amazing patterns!

        Thanks again for all the help,

        - Don

_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail