Re: Undesirable characters in subject text
2003-05-27 19:16:39
At 15:41 2003-05-27 -0700, procmail(_at_)deliberate(_dot_)net wrote:
Actually I did, thanks. I really do thoroughly read and
try to understand most of your posts here, Sean. Despite your
gruff nature
I think of it more as MOF - Matter Of Fact.
The implication was that if your problem was unix wildcards in the string,
doublequoting it would resolve the specific problem you were reporting:
| echo "$SUBJECT" >> somefile
Seems to work just fine even when there is an unpaired quote within the
string (even of the same type that you're enclosing the string in).
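For anyone who wants to see the difference outside of procmail, here is a minimal sketch in plain sh. The sample subject, the scratch directory and the two file names are all made up for illustration; the only point is that the unquoted expansion gets globbed against the current directory and the quoted one does not.

```sh
#!/bin/sh
# Work in a scratch directory so the "*" has something to match.
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch alpha beta

# Hypothetical subject with a wildcard and an unpaired quote in it.
SUBJECT="Sale * today, don't miss out"

# Unquoted: the shell splits the value into words and expands "*"
# to the file names in the current directory.
unquoted=$(echo $SUBJECT)

# Doublequoted: the value is passed to echo as one untouched argument.
quoted=$(echo "$SUBJECT")

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

Printing $unquoted and $quoted afterwards shows the file names substituted into the first and the literal asterisk preserved in the second.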
BTW, if your db is comma delimited, you'll likely need to contend with
quoting strings which contain commas, and where there may be quotes,
ensuring that they're not blasted by that same process.
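A rough sketch of that quoting step, assuming your importer follows the common CSV convention of wrapping a field in doublequotes and doubling any quote inside it (check what yours actually expects; some loaders want backslash escapes instead). The csv_field name is mine, and it does cost a sed invocation per field.

```sh
#!/bin/sh
# csv_field VALUE
# Hypothetical helper: wrap VALUE in doublequotes and double any embedded
# doublequote, so commas and quotes inside the value survive as one field.
csv_field() {
    printf '"%s"' "$(printf '%s' "$1" | sed 's/"/""/g')"
}

# Possible use when building a record (FROM is an assumed variable):
#   printf '%s,%s\n' "$(csv_field "$FROM")" "$(csv_field "$SUBJECT")" >> somefile
```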
=> You may want to use:
=> | echo -n "$SUBJECT" >> somefile
=> Which would omit the trailing newline which echo would otherwise tack onto
=> the emitted text.
It's only the "odd" ones that I wanted to suppress, the
ones that are the result of the unescaped wildcards.
Once again, if you enclose it in quotes as shown, that seems to be
resolved. I still don't follow what about that solution isn't working for
you, which is what led me to believe you had not invoked the examples provided.
The suggestion to use -n was if you were building a db file where you might
really want each field of the record to appear on a single line. If that
isn't a concern for you, don't use the -n. I did include the explanation
for the variation. Perhaps offering up that tidbit misled you to believe I
was saying removal of the echo-provided linefeed was my proffered solution
to your problem, which isn't the case - the doublequotes are.
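As an aside, not every echo honors -n, so if you do want the newline suppressed, printf '%s' is a more portable way to get the same effect. A small sketch (the sample subject is made up) that just counts the bytes emitted each way:

```sh
#!/bin/sh
SUBJECT='Re: status [update]?'   # hypothetical sample value

# echo appends a newline after the text ...
with_newline=$(echo "$SUBJECT" | wc -c)

# ... while printf '%s' emits the text and nothing else.
without_newline=$(printf '%s' "$SUBJECT" | wc -c)
```

The two counts differ by exactly one byte, which is the linefeed echo tacks on.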
Guess I'm back to square one. I don't know sed and I'm
not sure I want to incur the costs of the call.
The manpage for 'tr' is pretty straightforward, and tr is all you're likely
to need: a simple character deletion or translation (versus string
operations, inclusive of regexp and the like, for which you'd use the much
larger sed).
If this were something big, in a production environment, the wise
programmer might consider simply writing a daemon which uses a named
pipe. This would resolve several things: procmail wouldn't invoke anything
- it'd simply output to a file (which, because it is a named pipe, would
never actually be written to disk, though your lockfile would be); no
commandline invocation of anything, so no issues with wildcard, hibit, or
control characters; no shell invocations; no additional process being fired
up each time a message comes through (beyond all the processes already
invoked as a matter of course in your system) - the daemon monitoring the
named pipe is simply sitting there always running. Since you'd have a
lockfile on its input, only one process at a time would be writing to it,
so it handily keeps the input serialized.
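To make that concrete, here is a bare-bones sketch of the reading side in sh. The function name, the READER_PID variable and the paths are assumptions of mine, and a real daemon would do its database insert where this one merely appends to a file.

```sh
#!/bin/sh
# start_reader FIFO OUTFILE
# Hypothetical sketch: create the named pipe if needed, then leave a
# background loop reading it. Sets READER_PID to the loop's process id.
start_reader() {
    fifo=$1 out=$2
    [ -p "$fifo" ] || mkfifo -m 600 "$fifo" || return 1
    (
        while [ -p "$fifo" ]; do
            # The open blocks until some writer opens the pipe; the inner
            # loop ends when that writer closes it, and we then reopen.
            while IFS= read -r line; do
                # Real processing (e.g. the SQL insert) would go here.
                printf '%s\n' "$line" >> "$out"
            done < "$fifo"
        done
    ) > /dev/null 2>&1 &
    READER_PID=$!
}

# Example (paths are made up):
#   start_reader "$HOME/subjects.pipe" "$HOME/subjects.log"
# and on the procmail side simply write to $HOME/subjects.pipe as if it
# were an ordinary file, with a lockfile on the recipe.
```

Nothing is expanded or interpreted on the way through, since the text is only ever read as data.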
For those who are blissfully unaware, named pipes are exactly how one
creates a dynamic .sigfile - you run a program which creates a named pipe
of ~/.signature, and each time that named pipe is read from, it writes a
new signature (fortune, whatever) to it.
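A minimal sketch of such a signature server, under a few assumptions: the serve_signature name is invented, the generator command is whatever you pass in, and the text is generated just before the pipe is opened rather than at the instant of the read. The one-second pause is a crude way of letting the reader see end-of-file before the pipe is reopened.

```sh
#!/bin/sh
# serve_signature FIFO COMMAND [ARGS...]
# Hypothetical sketch: hand out the output of COMMAND each time the
# named pipe FIFO is read.
serve_signature() {
    sig_fifo=$1; shift
    [ -p "$sig_fifo" ] || mkfifo "$sig_fifo" || return 1
    while [ -p "$sig_fifo" ]; do
        sig=$("$@")                          # generate the next signature
        printf '%s\n' "$sig" > "$sig_fifo"   # blocks until a reader opens the pipe
        sleep 1                              # let the reader finish before reopening
    done
}

# Example, assuming fortune is installed:
#   serve_signature "$HOME/.signature" fortune &
```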
A certain beauty exists with the named pipe: one of the two programs in the
equation (in this case, procmail) doesn't need to know that it's a named
pipe - to that program it's simply a file - absolutely nothing special is
performed by the process which doesn't control the pipe.
The only real risk with a named pipe is if the monitoring process dies:
either the pipe isn't properly deleted (and so processes continue to
attempt to write to something which isn't being read), or the pipe goes
away, and the processes are now creating and writing to an actual
file. Properly written, there shouldn't be much of a risk that the daemon
will spontaneously die. The daemon could catch signals and properly close
the named pipe, and you could have a cron task that periodically restarts
the daemon as necessary, and when restarted, before creating the named
pipe, it could see if a physical file exists and process the contents of
that as if they were written to the pipe (when done, remove the file and
open a pipe in its place). Chances are, most people don't bother with the
extra precautions.
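The restart precaution could look roughly like this. It is only a sketch: prepare_pipe and the ".stale" suffix are names I made up, "processing" is reduced to appending to an output file, and there is still a small window between moving the stray file aside and creating the pipe, which holding the same lockfile your recipe uses would close.

```sh
#!/bin/sh
# prepare_pipe PATH OUTFILE
# Hypothetical startup step for the daemon: if writers left a regular
# file where the pipe should be, process its contents first, then put
# a named pipe in its place.
prepare_pipe() {
    pipe=$1 out=$2
    if [ -f "$pipe" ]; then
        # A plain file means data was written while no pipe existed.
        mv "$pipe" "$pipe.stale" || return 1
        cat "$pipe.stale" >> "$out" && rm -f "$pipe.stale"
    fi
    [ -p "$pipe" ] || mkfifo -m 600 "$pipe"
}

# In the daemon itself, something along the lines of
#   trap 'exit 0' INT TERM
# gives it a chance to shut down in an orderly way when signalled.
```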
Just as a regular file would, the named pipe actually has an owner and file
permissions, allowing the creator to limit who can write to it - indeed,
processes owned by other users on the system could be permitted to write to it.
The technique is also useful for allowing listserv programs to tack on a
standardized list footer, but with a variable web password (that, say,
cycles every 24 or 48 hours) for limiting access to the list archives to
list subscribers only.
BTW, the mkfifo mechanism can be used to manage files used with an
INCLUDERC in procmail (i.e. some program which reads a db of some nature
and emits procmail code). Procmail doesn't invoke any special program - it
simply opens the file that is being INCLUDERC'd. As changes are made to
the db which generates the RCFILE (in memory, not on disk), the file is
dynamically regenerated.
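As a rough illustration of the generating side, here is a sketch that turns a trivial text "db" into recipes. Everything about it is assumed for the example: the emit_rc name, the one-rule-per-line "REGEX FOLDER" format (so no spaces in the regex), and the paths in the comments. I have not checked how procmail's sanity checks on rcfiles treat a pipe on every platform, so test it in a sandbox first.

```sh
#!/bin/sh
# emit_rc DBFILE
# Hypothetical generator: each line of DBFILE is "REGEX FOLDER"; emit one
# delivering recipe (with a local lockfile) per line on stdout.
emit_rc() {
    while read -r regex folder; do
        # Skip blank or incomplete lines.
        [ -n "$regex" ] && [ -n "$folder" ] || continue
        printf ':0:\n* %s\n%s\n\n' "$regex" "$folder"
    done < "$1"
}

# Serving it through a named pipe (paths are made up):
#   mkfifo "$HOME/.procmail/dynamic.rc"
#   while :; do emit_rc "$HOME/.procmail/rules.db" > "$HOME/.procmail/dynamic.rc"; sleep 1; done &
# and in .procmailrc:
#   INCLUDERC=$HOME/.procmail/dynamic.rc
```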
Also if I must, I'd rather be lazy and try a cook-book sed example given by
another than try to stretch my brain around more *nix stuff.
Well, then you're going to need to meet us halfway and, at a minimum,
present the character class which you either want to retain, or which you
want to discard or translate. If you're translating, you'll need to define
what you want disallowed symbols translated to.
To delete, say, all control chars, and all hibit chars:
| echo "$SUBJECT" | tr -d "[\000-\037][\200-\377]" >> somefile
Or, to translate the characters to some other symbol:
| echo "$SUBJECT" | tr "[\000-\037][\200-\377]" "%" >> somefile
If this invocation actually proves to be too much of a burden on your host,
a CPU upgrade may be in order. Sadly, tr doesn't support having the
original text to be translated provided to it by any means other than its
stdin, so you still have to pipe from echo.
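Two caveats on the tr sets above, as far as I know: with a POSIX-style tr (GNU, BSD) the square brackets are themselves members of the set, so a "[" or "]" in a subject gets deleted or translated too (older System V tr wanted the brackets around ranges); and the \000-\037 range includes the linefeed, so the newline echo adds goes away as well. A bracket-free sketch, wrapped in two helper functions whose names are mine:

```sh
#!/bin/sh
# strip_odd: delete control and hibit bytes from stdin.
# LC_ALL=C makes tr work on single bytes regardless of the locale.
strip_odd() {
    LC_ALL=C tr -d '\000-\037\200-\377'
}

# mask_odd: replace each control or hibit byte on stdin with "%".
# '[%*]' is the POSIX way of repeating "%" to the length of the first set.
mask_odd() {
    LC_ALL=C tr '\000-\037\200-\377' '[%*]'
}

# In a recipe this would be used the same way as before, e.g.
#   | echo "$SUBJECT" | LC_ALL=C tr -d '\000-\037\200-\377' >> somefile
```

If you need the record terminator kept, leave \012 out of the set or add the newline back after the tr.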
If you use 'time' with a large saved mailbox and a sandbox config, you could
get an idea as to the CPU overhead for the additional processes (bear in
mind that you'll have caching going on).
Of course, the easiest solution might be to make the
program which periodically reads the file and writes it to the
MySQL database a bit more intelligent when it comes to handling
each line - as long as the wildcard problem is limited to an
extraneous embedded linefeed.
Ah, the "dbase" file you refer to isn't the live database, it is merely an
interim file? Why isn't the data submitted directly to the live SQL
db? If you did that in realtime, the CPU cycles (as well as intermediate
disk writes) you would save would probably make up for any added overhead
of making the initial submitter a bit more intelligent, and you'd have just
one helper app - the app which takes the provided data and outputs it to
the SQL database.
---
Sean B. Straw / Professional Software Engineering
Procmail disclaimer: <http://www.professional.org/procmail/disclaimer.html>
Please DO NOT carbon me on list replies. I'll get my copy from the list.
_______________________________________________
procmail mailing list
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail