Re: Undesirable characters in subject text

At 15:41 2003-05-27 -0700, procmail(_at_)deliberate(_dot_)net wrote:

        Actually I did, thanks. I really do thoroughly read and
try to understand most of your posts here, Sean. Despite your
gruff nature


I think of it more as MOF - Matter Of Fact.

The implication was that if your problem was unix wildcards in the string, doublequoting it would resolve the specific problem you were reporting:


| echo "$SUBJECT" >> somefile

Seems to work just fine even when there is an unpaired quote within the string (even of the same type that you're enclosing the string in).
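A quick way to see the difference outside of procmail is sketched below. The subject value and the scratch directory are made up for illustration; only the quoting matters.

```shell
#!/bin/sh
# Hypothetical subject: contains a lone wildcard and an unpaired double quote.
SUBJECT='Re: 50% off * today only "'

# Work in a scratch directory holding two files for the shell to glob.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch alpha beta

# Unquoted: the value is word-split and the lone * is replaced by file names.
unquoted=$(echo $SUBJECT)

# Doublequoted: the value reaches echo exactly as stored.
quoted=$(echo "$SUBJECT")

echo "unquoted: $unquoted"
echo "quoted:   $quoted"
```

The unquoted form picks up the names of whatever files happen to be in the current directory, which is the "odd" output being described; the quoted form leaves the wildcard and the stray quote alone.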

BTW, if your db is comma delimited, you'll likely need to contend with quoting strings which contain commas, and where there may be quotes, ensuring that they're not blasted by that same process.
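One common convention for this, sketched here as an assumption about what the importing program expects, is to wrap the field in doublequotes and double any quote inside it. The helper name csv_quote is hypothetical.

```shell
#!/bin/sh
# csv_quote (hypothetical helper): double every embedded double quote, then
# wrap the whole value in double quotes so embedded commas are harmless.
csv_quote() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

SUBJECT='Hello, "world"'
csv_quote "$SUBJECT"
echo
```

Whether doubled quotes or a backslash escape is the right choice depends on how the file is loaded into the database, so check the importer before settling on either.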

=> You may want to use:
=> | echo -n "$SUBJECT" >> somefile
=> Which would omit the trailing newline which echo would otherwise tack onto
=> the emitted text.

        It's only the "odd" ones that I wanted to suppress, the
ones that are the result of the unescaped wildcards.

Once again, if you enclose it in quotes as shown, that seems to be resolved. I still don't follow what about that solution isn't working for you, which is what led me to believe you had not invoked the examples provided.

The suggestion to use -n was if you were building a db file where you might really want each field of the record to appear on a single line. If that isn't a concern for you, don't use the -n. I did include the explanation for the variation. Perhaps offering up that tidbit misled you to believe I was saying removal of the echo-provided linefeed was my proffered solution to your problem, which isn't the case - the doublequotes are.
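Support for echo -n varies between shells, so a sketch of the same idea with printf may be useful. The subject value, the comma separator, and the temporary output file are all assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical subject; the backslash is something certain echo builtins
# would interpret, while printf '%s' passes it through untouched.
SUBJECT='Path is C:\new'
outfile=$(mktemp)

# Write the field with no trailing newline.
printf '%s' "$SUBJECT" >> "$outfile"
# A separator (a comma, as an assumption) can then follow on the same line.
printf ',' >> "$outfile"
# The last field of the record ends the line.
printf '%s\n' "second field" >> "$outfile"

cat "$outfile"
```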

        Guess I'm back to square one.  I don't know sed and I'm
not sure I want to incur the costs of the call.

The manpage for 'tr' is pretty straightforward, and tr covers what you're likely to need - a simple character deletion or translation (versus string operations, inclusive of regexp and the like, for which you'd use the much larger sed).

If this were something big, in a production environment, the wise programmer might consider simply writing a daemon which uses a named pipe. This would resolve several things: procmail wouldn't invoke anything - it'd simply output to a file (which, because it is a named pipe, would never actually be written to disk, though your lockfile would be); no commandline invocation of anything, so no issues with wildcard, hibit, or control characters; no shell invocations; no additional process being fired up each time a message comes through (beyond all the processes already invoked as a matter of course in your system) - the daemon monitoring the named pipe is simply sitting there always running. Since you'd have a lockfile on its input, only one process at a time would be writing to it, so it handily keeps the input serialized.
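The reading side of such a daemon could look roughly like the sketch below. The paths, and the handle_line and run_daemon names, are hypothetical; handle_line stands in for whatever the real daemon would do with each record (an insert into the database, say).

```shell
#!/bin/sh
# Hypothetical locations, overridable from the environment.
FIFO=${FIFO:-/tmp/subjects.fifo}
OUT=${OUT:-/tmp/subjects.log}

# Placeholder for the daemon's real per-record work.
handle_line() {
    printf '%s\n' "$1" >> "$OUT"
}

# Create the named pipe if needed, then read records from it forever.
run_daemon() {
    [ -p "$FIFO" ] || mkfifo "$FIFO" || return 1
    while :; do
        # The inner loop ends when the last writer closes the pipe;
        # the outer loop reopens it and waits for the next writer.
        while IFS= read -r line; do
            handle_line "$line"
        done < "$FIFO"
    done
}
```

It would be started once in the background (run_daemon &). Because the record arrives as data on the pipe rather than as a commandline argument, wildcards and control characters in it are never seen by a shell.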

For those who are blissfully unaware, named pipes are exactly how one creates a dynamic .sigfile - you run a program which creates a named pipe of ~/.signature, and each time that named pipe is read from, it writes a new signature (fortune, whatever) to it.
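A minimal sketch of such a signature server follows. The make_sig and serve_signature names are hypothetical, and make_sig merely stamps the date where a real setup might call fortune.

```shell
#!/bin/sh
# make_sig is a placeholder generator; a real setup might run fortune here.
make_sig() {
    printf '%s\n' "-- " "Generated $(date)"
}

# serve_signature keeps a named pipe at the given path (default ~/.signature)
# and writes one fresh signature each time something opens it for reading.
serve_signature() {
    sig=${1:-$HOME/.signature}
    [ -p "$sig" ] || mkfifo "$sig" || return 1
    while :; do
        # Opening the pipe for writing blocks until a reader opens it.
        make_sig > "$sig"
        # Brief pause so the reader sees end-of-file before the next open.
        sleep 1
    done
}
```

If a regular ~/.signature already exists, mkfifo fails and the function returns without touching it.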

A certain beauty exists with the named pipe: one of the two programs in the equation (in this case, procmail) doesn't need to know that it's a named pipe - to that program it's simply a file - absolutely nothing special is performed by the process which doesn't control the pipe.

The only real risk with a named pipe is if the monitoring process dies: either the pipe isn't properly deleted (and so processes continue to attempt to write to something which isn't being read), or the pipe goes away, and the processes are now creating and writing to an actual file. Properly written, there shouldn't be much of a risk that the daemon will spontaneously die. The daemon could catch signals and properly close the named pipe, and you could have a cron task that periodically restarts the daemon as necessary, and when restarted, before creating the named pipe, it could see if a physical file exists and process the contents of that as if they were written to the pipe (when done, remove the file and open a pipe in its place). Chances are, most people don't bother with the extra precautions.
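Those precautions might be sketched as follows. All names and paths are hypothetical. One deliberate variation from the description: the leftover file is moved aside and the pipe created before the leftovers are processed, so that writers arriving in the meantime find the pipe rather than the old file.

```shell
#!/bin/sh
# Hypothetical locations, overridable from the environment.
FIFO=${FIFO:-/tmp/subjects.fifo}
OUT=${OUT:-/tmp/subjects.log}

# Placeholder for the daemon's real per-record work.
handle_line() {
    printf '%s\n' "$1" >> "$OUT"
}

# Make sure $1 is a named pipe before the daemon starts reading it.
prepare_pipe() {
    path=$1
    stale=
    if [ -f "$path" ]; then
        # Writers created a regular file while no daemon was running.
        # Move it aside first so new writers find the pipe, not the file.
        stale="$path.stale.$$"
        mv "$path" "$stale" || return 1
    fi
    [ -p "$path" ] || mkfifo "$path" || return 1
    if [ -n "$stale" ]; then
        # Process the leftovers as if they had arrived through the pipe.
        while IFS= read -r line; do
            handle_line "$line"
        done < "$stale"
        rm -f "$stale"
    fi
}

# Remove the pipe on the usual termination signals so writers are not
# left blocking on a pipe that nobody reads.
install_cleanup() {
    trap 'rm -f "$FIFO"; exit 0' HUP INT TERM
}
```

A daemon would call install_cleanup and prepare_pipe "$FIFO" once at startup, before entering its read loop.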

Just as a regular file would, the named pipe actually has an owner and file permissions, allowing the creator to limit who can write to it - indeed, processes owned by other users on the system could be permitted to write to it.
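As a small illustration, mkfifo accepts a mode at creation time. The helper name and the choice of mode 620 (owner reads and writes, group may only write, everyone else is shut out) are assumptions for the example.

```shell
#!/bin/sh
# make_private_fifo (hypothetical helper): create a pipe that the owner can
# read and write, members of its group can write to, and nobody else can touch.
make_private_fifo() {
    mkfifo -m 620 "$1"
}

dir=$(mktemp -d)
make_private_fifo "$dir/subjects.fifo"
ls -l "$dir/subjects.fifo"
```

Changing the pipe's group afterwards with chgrp then decides which other users' processes are allowed to feed it.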

The technique is also useful for allowing listserv programs to tack on a standardized list footer, but with a variable web password (that, say, cycles every 24 or 48 hours) for limiting access to the list archives to list subscribers only.

BTW, the mkfifo mechanism can be used to manage files used with an INCLUDERC in procmail (i.e. some program which reads a db of some nature and emits procmail code). Procmail doesn't invoke any special program - it simply opens the file that is being INCLUDERC'd. As changes are made to the db which generates the RCFILE (in memory, not on disk), the file is dynamically regenerated.

Also if I must, I'd rather be lazy and try a cook-book sed example given by
another than try to stretch my brain around more *nix stuff.

Well, then you're going to need to meet us halfway and at a minimum, present the character class which you either want to retain, or which you want to discard or translate. If you're translating, you'll need to define what you want disallowed symbols translated to.


To delete, say, all control chars, and all hibit chars:

| echo "$SUBJECT" | tr -d "[\000-\037][\200-\377]" >> somefile

Or, to translate the characters to some other symbol:

| echo "$SUBJECT" | tr "[\000-\037][\200-\377]" "%" >> somefile
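Two caveats are worth checking against the local tr. The bracketed ranges are the System V spelling; a POSIX tr (GNU and BSD included) treats the brackets as ordinary characters, so they too would be deleted or translated. Also, the range \000-\037 contains the linefeed, so the newline added by echo is affected along with everything else. A sketch of the POSIX spelling, using hypothetical helper names and printf so that no newline is fed in:

```shell
#!/bin/sh
# strip_odd (hypothetical helper): delete control characters (octal 000-037)
# and hibit characters (octal 200-377), treating the input as plain bytes.
strip_odd() {
    printf '%s' "$1" | LC_ALL=C tr -d '\000-\037\200-\377'
}

# mask_odd (hypothetical helper): replace each such character with %.
# '[%*]' is the POSIX way to repeat % to the length of the first set.
mask_odd() {
    printf '%s' "$1" | LC_ALL=C tr '\000-\037\200-\377' '[%*]'
}

# Sample value containing a tab, literal brackets, and a control-A.
sample=$(printf 'a\tb[c]\001d')
strip_odd "$sample"; echo
mask_odd "$sample"; echo
```

Since the helpers emit no newline of their own, the caller adds one (the echo above) wherever a record should end.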

If this invocation actually proves to be too much of a burden on your host, a CPU upgrade may be in order. Sadly, tr doesn't support having the original text to be translated provided to it by any means other than its stdin, so you still have to pipe from echo.

If you use 'time' with a large saved mailbox and a sandbox config, you could get an idea as to the CPU overhead for the additional processes (bear in mind that you'll have caching going on).

        Of course, the easiest solution might be to make the
program which periodically reads the file and writes it to the
MySQL database a bit more intelligent when it comes to handling
each line - as long as the wildcard problem is limited to an
extraneous embedded linefeed.

Ah, the "dbase" file you refer to isn't the live database, it is merely an interim file? Why isn't the data submitted directly to the live SQL db? If you did that in realtime, the CPU cycles (as well as intermediate disk writes) you would save would probably make up for any added overhead of making the initial submitter a bit more intelligent, and you'd have just one helper app - the app which takes the provided data and outputs it to the SQL database.


---
 Sean B. Straw / Professional Software Engineering

 Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.


_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail