On Sun, Jan 23, 2005 at 04:26:09PM -0500, Glenn Sieb wrote:
Dallman Ross said the following on 1/23/2005 6:53 AM:
Grouping, and speed, and efficiency, and elegance, are all highly
useful. I didn't mean to imply otherwise. I'm glad you've thought
of some of them! What I was objecting to was extraneous effort, but
not the core concepts. Here are a couple of quick ideas for your
task:
:0
* ^TO_root@\/(Wingfoot|Database)[.]org\>
* MATCH ?? ^^\/[^.]+
{
LOG = "$NL$MATCH Root Mail$NL"
:0
.$MATCH.Root/
}
Oh! Nifty indeed! :) I see (from the manpage) that MATCH will contain
anything past the \/ token. Overall, I can certainly see how this can
make it TONS easier to do my FreeBSD list recipes, as well!
Yup. Glad you found it useful.
:0
* ^List-Post:.*\/(advocacy|announce|chat|config|doc|hackers|jobs|newbies|performance|ports|questions|stable|testing|user-groups|www)@.*freebsd\.org
* MATCH ?? ^^\/[^@]+
{
LOG = "(FreeBSD $MATCH Mailing List)$NL"
:0
.FreeBSD.$MATCH/
}
Am I getting the general gist here, I hope? :) The light is beginning to
turn on here.. :) Let me see if I can explain it in English how I'm
seeing this...
Your explanation (snipped) is pretty much it, yeah. Let's take it even further.
MYBSDLISTS = '(advocacy|announce|chat|config|doc|hackers|jobs|newbies|\
performance|ports|questions|stable|testing|user-groups|www)'
# now you can grow it or edit it at will without disturbing a working recipe
If we use single ("hard") quotes instead of double quotes, we can break the
long line within an assignment, as I did, without inadvertently saving the
whitespace, too.
HOSTCLASS = '[a-zA-Z0-9-]+'
I'll bet you see where I'm going with the rest. Note that now we need a
"$" modifier on the condition, so that procmail expands the variables in it.
:0
* $ ^List-Post:(.*\<)?\/$MYBSDLISTS@($HOSTCLASS[.])*freebsd[.]org\>
* MATCH ?? ^^\/[^@]+
{
LOG = "(FreeBSD $MATCH Mailing List)$NL"
:0
.FreeBSD.$MATCH/
}
Notes:
Q: Why "(.*\<)?"
A: Because we want to match on "ports" but not on "sports", for example.
Note that even this isn't perfect; we could still get a false match
on "starboards-and-ports", for example. But one has to decide when
enough rigor is enough. Otoh, one could code in whitespace instead
of using the procmail "word-separator" macro. Personally, though, I
think we've already reached the point where false matches aren't
worth worrying about any further.
Q: Okay, so why not just ".*\<", then?
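A: Because there's no requirement that there be any whitespace after a
header-field colon, and "\<" has to match an actual (non-word) character.
Making the group optional lets the list name follow the colon directly.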
Q: Why are we doing the "($HOSTCLASS[.])*" bit?
A: Because any of "freebsd.org", "foo.freebsd.org", "foo.bar.freebsd.org",
and so on should work, but not "carefreebsd" or "germfreebsd".
Q: Why are we putting the procmail inter-word macro on the end?
A: Because we want to match "freebsd.org", but not "freebsd.orgasm.cn".
The further "Note" up above also applies here.
Q: What about the letter-case of my match?
A: Frankly, you could get odd letter-case matching depending on the
sender, his mail program, or intermediary hosts. If standardizing
that is important to you, then you'll have to code in an extra
test for letter-case and respond accordingly.
Q: Since procmail is case-insensitive by default, why did you put "a-zA-Z"
in the $HOSTCLASS?
A: Because we might want to use the var in a recipe with the D (case-
sensitive) flag. These are generic, re-usable vars we're setting.
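
As for the letter-case question above, here is one rough way it might be
handled: fold $MATCH to lower case with tr(1) before using it in the folder
name. This is only a sketch. LISTNAME is a made-up helper variable that does
not appear in the recipes above, and it assumes MYBSDLISTS and HOSTCLASS are
already set as shown and that tr is on procmail's PATH.

```
# Sketch only: LISTNAME is a hypothetical helper variable.
# Assumes MYBSDLISTS and HOSTCLASS are assigned earlier in the rcfile.
:0
* $ ^List-Post:(.*\<)?\/$MYBSDLISTS@($HOSTCLASS[.])*freebsd[.]org\>
* MATCH ?? ^^\/[^@]+
{
  # Fold whatever case the sender used to lower case.
  LISTNAME=`echo "$MATCH" | tr 'A-Z' 'a-z'`

  LOG = "(FreeBSD $LISTNAME Mailing List)$NL"

  :0
  .FreeBSD.$LISTNAME/
}
```

Handing $MATCH to the shell should be tolerable here only because the first
condition has already limited it to one of the names in MYBSDLISTS. The cost
is an extra fork per matching message, so it is worth doing only if mixed-case
folder names are actually turning up.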
Thanks so much for your help, Dallman! This is giving me tons of ideas! :)
Good 'nuff!
--
dman
____________________________________________________________
procmail mailing list Procmail homepage: http://www.procmail.org/
procmail(_at_)lists(_dot_)RWTH-Aachen(_dot_)DE
http://MailMan.RWTH-Aachen.DE/mailman/listinfo/procmail