-----Original Message-----
From: procmail-admin(_at_)Lists(_dot_)RWTH-Aachen(_dot_)DE
[mailto:procmail-admin(_at_)Lists(_dot_)RWTH-Aachen(_dot_)DE] On Behalf Of Professional Software Engineering
Sent: Wednesday, September 27, 2000 2:48 PM
To: procmail(_at_)Lists(_dot_)RWTH-Aachen(_dot_)DE
Subject: Re: regexp - question 1
At 11:06 2000-09-27 -0500, tomcat(_at_)visi(_dot_)com wrote:
First, could you do all of us a BIG favour and choose *ONE* subject to post
your many questions on this one topic? Posting repetitive questions under
different topics is not going to help your cause.
Second, please READ previous replies and consider applying them before
asking the same or vastly similar questions.
> * ^From:+\/.*
> { REALSENDER = $MATCH }
Wrong syntax for + -- it means one or more of the previous expression,
which here is a colon. So, your condition is looking for:
From:blah blah
From::blah blah
From:::blah blah
I doubt this is what you want. The generally accepted way of extracting
this info is:
* ^From:[ ]*\/[^ ].*
Both sets of brackets contain a space and a tab. When you see brackets
like this in other messages, esp. right after the header name, and where
they appear to contain more than one whitespace character, you can
generally assume SPACE+TAB.
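If you want to convince yourself of how + binds without touching procmail, here is a minimal sketch using grep -E as a stand-in. That is an assumption on my part: grep's extended regexps are not procmail's engine, but + and parentheses group the same way in both. The helper name matches and the sample lines are made up for illustration.

```shell
#!/bin/sh
# grep -E stands in for procmail's regexp engine (NOT the same engine,
# but + and grouping behave alike for this purpose).
# matches is a hypothetical helper: prints "yes" or "no".
matches() {
    # $1 = extended regexp, $2 = candidate header line
    if printf '%s\n' "$2" | grep -Eq -- "$1"; then
        echo yes
    else
        echo no
    fi
}

# Without parentheses, + repeats only the colon.
matches '^From:+b'   'From:::blah'      # yes
matches '^From:+b'   'From:From:blah'   # no

# With parentheses, + repeats the whole "From:" group.
matches '^(From:)+b' 'From:From:blah'   # yes
matches '^(From:)+b' 'From:::blah'      # no
```

The trailing b in each pattern is only there to force the repeated part to run right up to the text, so the two forms give visibly different answers.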
> which is " bozo(_at_)bozo(_dot_)com" If correct, do I want the
> leading space there? Aren't I getting a space between
No, since other processing you might end up doing might care about the
space. It is easier to add it where you need it than to try to remove it.
Additionally, if there is NO leading space, when you create new headers, if
you ASSUME there was a space and don't include one yourself, there won't be
one.
formail -cz -I "From: $REALSENDER"
Here, you presumably would NOT want a space in $REALSENDER, because you're
obviously adding one. Not that it would affect the validity of the message.
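The leading-space point can be seen with a rough shell sketch. It uses sed in place of procmail's \/ extraction, so treat it as an approximation only (sed is case-sensitive where procmail is not). The function names and the address are made up for illustration.

```shell
#!/bin/sh
# sed stands in for procmail's \/ extraction; this is NOT procmail.
# match_trimmed and match_raw are hypothetical helper names.

# Rough equivalent of  * ^From:[ <TAB>]*\/[^ <TAB>].*
match_trimmed() {
    sed -n 's/^From:[[:blank:]]*\([^[:blank:]].*\)$/\1/p'
}

# Rough equivalent of  * ^From:\/.*   (keeps any leading whitespace)
match_raw() {
    sed -n 's/^From:\(.*\)$/\1/p'
}

header='From: bozo@example.com'

REALSENDER=$(printf '%s\n' "$header" | match_trimmed)
printf 'From: %s\n' "$REALSENDER"    # one space after the colon

REALSENDER=$(printf '%s\n' "$header" | match_raw)
printf 'From: %s\n' "$REALSENDER"    # two spaces: ours plus the captured one
```

With the trimmed form you decide where the space goes when you build the new header, which is the reason for skipping the whitespace before \/.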
> I think + here instead of * before \/ because
> if there is no line starting out "From:" the
If there is no line starting with From:, then the whole expression won't be
matched. If you're thinking the + or * modifier applies to the whole text
"From:", you're mistaken. If you _really_ wanted that, you'd need
parentheses:
^(From:)+
(but this is patently wrong, since you're not looking for things like
"From:From:")
> email will fail anyway (for my purposes)
> In that case, should I check for "Reply-To" ??
Messages without From: should generally be considered bad. More likely
than not, they're spam, and if not, then they're being generated by a
braindead application. HOWEVER, if you're using this extracted address as
where you're going to send a reply to, you really should check for
Reply-To. There's an easier way to do this:
# for instances where you want the address it was sent from
:0
* ^From:[ ]*\/[^ ].*
{
FROM=$MATCH
}
# and what address a reply would properly be addressed to
:0 h
SENDER=|$FORMAIL -b -rtzxTo:
This assumes you've defined $FORMAIL to point to your formail executable -
or you could have formail in your path, and replace $FORMAIL with formail.
See the formail man page ('man formail') before asking questions on the
above options to formail. In fact, now might be a good time to take a break
and check the various procmail FAQs.
Once you have these (I pre-emptively fetch subject and TO as well), you
have stuff you can use in your filter(s) at will.
As defined here, $SENDER is the proper address to mail the sender of the
message - their From: or reply-to, etc, as defined by the RFC-822 ruleset.
> Would
> * ^From:.+\/.*
> { REALSENDER = $MATCH }
> match the whole "From: bozo(_at_)bozo(_dot_)com"
> and so "" would be put into REALSENDER ??
Perhaps it is time for you to experiment with manually-invoked procmail
scripts. You should really have started there anyway - experimenting with
filters on your live mailspool would be a foolish thing to do, and if you
were using a test filter, the answers to your questions would be painfully
obvious.
Say you have a message file, or a mailbox (in either case, a file into
which you have stored one or more messages, complete with headers):
formail -s procmail -m testing.rc < your_message_file
This will send your message file into formail, which will SPLIT it up into
its individual messages, handing each one in turn to procmail, which will
run them against the testing.rc ruleset. If it were just a SINGLE message,
you could skip the 'formail -s' at the beginning, but it's just as well
that you do it this way, because it simplifies things for when the message
file does contain more than one message.
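If you don't have a saved message file handy, a minimal sketch for making one follows. The file name, addresses and subjects are all made up, and the real formail/procmail run is only attempted if both tools and a testing.rc happen to be present.

```shell
#!/bin/sh
# Build a tiny two-message mbox to feed through the test ruleset.
# Everything in it (addresses, subjects, file name) is hypothetical.
msgfile=${TMPDIR:-/tmp}/your_message_file.$$

cat > "$msgfile" <<'EOF'
From bozo@example.com Wed Sep 27 11:06:00 2000
From: bozo@example.com
To: you@example.com
Subject: test one

first body

From clown@example.com Wed Sep 27 11:07:00 2000
From: clown@example.com
To: you@example.com
Subject: test two

second body
EOF

# Each mbox message starts with a "From " separator line (note: no colon);
# count them to confirm the file holds two messages.
grep -c '^From ' "$msgfile"

# Only attempt the real run if the tools and the ruleset are available.
if command -v formail >/dev/null 2>&1 && command -v procmail >/dev/null 2>&1 \
   && [ -f testing.rc ]; then
    formail -s procmail -m testing.rc < "$msgfile"
fi
```

Editing the From: and Subject: lines in the here-document is a quick way to try a condition against different inputs.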
This has *NOTHING* to do with the mail coming in your inbox, so as long as
testing.rc isn't referred to by your .procmailrc (or any INCLUDERCs in
it), and as long as you're not dumping output into a directory overwriting
your _actual_ mailboxes, you can hack it to your heart's content, and not
mess up your regular mail filtering.
Make a testing subdir, and put this stuff in there.
Now, in testing.rc, set up a nice basic .procmailrc type framework:
# -- start testing.rc example
# called from untwit script.
#
# This will take whatever messages in the twits file and re-send them into
# the mailstream for the current user to be processed again, presumably
# under modified rules.
COMSAT=no
# logging, good stuff...
LOGFILE=./testing.log
# LOTS of logging, better stuff.
VERBOSE=on
# Define paths to individual apps we use. At the shell, you can use
# 'which app' or 'type app' to locate the path to the app.
FORMAIL=/usr/bin/formail
FGREP=/usr/bin/fgrep
# default mail delivery mailbox - for my testing purposes, anything NOT
# specifically filtered, goes to the ether (remember, we're piping into
# this ruleset from a saved file). For your purposes, you might want to
# set this to ./default.mbox or something.
DEFAULT=/dev/null
# get the sender info
:0h
SENDER=|$FORMAIL -b -rtzxTo:
# may include any other common setup rules, as you'd have them in your
# .procmailrc
# include your test filter.
INCLUDERC=test_filter.rc
# -- end testing.rc example
I use something vaguely similar to post-process my spam file to extract
individual messages from people, add a spam-filtering-bypass header of
sorts, then re-inject them into my regular procmail rules, so they get
stored into the appropriate mailbox and tossed into my mail spool, for
retrieval by my client software (this is for those infrequent occasions
when a message gets mis-identified as spam, and is part of the reason I
don't simply /dev/null my spam). I can also extract various individual
messages from mailboxes as well. But I'm getting OT here..
Now, put the filter rules you want to test into test_filter.rc (or rename
and change the above as appropriate).
An example test_filter.rc - the rule you inquired about above:
# begin test_filter.rc
:0
* ^From:.+\/.*
{
REALSENDER=$MATCH
}
# end test_filter.rc
Now, run the filter, and examine the testing.log file.
Experiment. You'll answer a LOT of your own questions this way. When you
want, you can edit the test message to be precisely what you want it to be,
and feed that into the test script. Between runs, you'll probably want to
delete the testing.log file.
You might even make a shell script to run the procmail process, then show
the log, and delete the log:
#!/bin/sh
# delete the log from previous run
rm -f testing.log
# run the test filter
formail -s procmail -m testing.rc < my_message_file
# view the log
less testing.log
# edit the test filter
vi test_filter.rc
Make the script executable (chmod +x) so you can run it.
You'd run the script: the previously existing log would be deleted, the
filters would be processed, and the log would be shown so you could see how
the output worked. Once you exit the pager (less), the editor would be
invoked on the test filter so you could make tweaks, and then you'd run it
again.
[snip - a LOT of these "would this match THIS" questions that would be
answered with simple tests]
> "Subject: Re: MailWeb: Test A xxxxxxxxxxxxxxxxxx 99bozo(_at_)bozo(_dot_)com"
> * ^Subject:.[99]*\/.*
> { ORIGINATOR = $MATCH }
> would this match
> "Subject: Re: MailWeb: Test A xxxxxxxxxxxxxxxxxx 99"
NO. I get the feeling that you did NOT read my previous post about [99]
defining a character class, rather than a literal. You'll find it under
one of the OTHER subject lines you've used for this discussion.
Simple testing of this via the above described method would confirm this.
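For the character class point specifically, a quick check outside procmail looks something like the sketch below. It uses grep -E as a stand-in, which is an assumption (it is not procmail's engine, though a bracket expression means the same thing in both). The helper name which_lines is made up.

```shell
#!/bin/sh
# grep -E stands in for procmail's regexp engine here.
# which_lines is a hypothetical helper: it feeds the two lines "9" and "99"
# to grep and prints whichever ones the given regexp matches in full.
which_lines() {
    printf '9\n99\n' | grep -E -- "$1"
}

which_lines '^[99]$'    # prints 9   - [99] is a class holding one character
which_lines '^[9]$'     # prints 9   - identical to the line above
which_lines '^99$'      # prints 99  - the literal two-character string
```

Listing a character twice inside brackets changes nothing, so [99]* is simply "zero or more 9s", not "the string 99".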
> Is there a limit to how long the subject line can be and
> not be chopped off??
LINEBUF chars.
You're unlikely to run into this limit on a header. Bodies are a different
matter. OTOH, different _mail clients_ are all too likely to cut long
subjects down in size. If there is any one header you should NOT let get
too big, this would be the one.
---
Please DO NOT carbon me on list replies. I'll get my copy from the list.
Sean B. Straw / Professional Software Engineering
Post Box 2395 / San Rafael, CA 94912-2395