procmail

Re: Outbound mailing lists

1996-07-04 17:23:35
When I wrote,

| > The proper form is
| >
| >   * ^TOAllArchitects
| >
| > or in more recent versions, ^TO_ might be better for your purpose than ^TO:
| >
| >   * ^TO_AllArchitects

Stan Ryckman brought up a question he has had for a while:

| Just *why* have these macros been defined without requiring some
| sort of separator from what they follow?  This bizarre syntax
| makes the syntax highly confusing, perhaps even ambiguous.  What if
| you want something starting with "TOA"?

I can't answer the "why", but I can answer the "what if".  If there is a
caret before it, procmail will read ^TOA as (^TO)A.  To anchor a literal
TOA at the start of the line, you can use ^(TOA) or ^\TOA or ^T\OA.  Most
searches are case-insensitive, so you can often use ^toa.  Procmail will
not mistake ^to for the ^TO token, even without the `D' flag.
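
To make the parsing rule concrete, here is a toy Python model of token recognition only.  It is a sketch, not procmail's actual parser: the function name expand() and the placeholder strings are invented for illustration, and it assumes that a token is recognized wherever the literal, upper-case text ^TO_ or ^TO appears, with the longer token tried first.

```python
# Toy model of token recognition only; this is NOT procmail's real parser.
# The replacement strings are placeholders, not the real expansions.
TOKENS = [("^TO_", "<TO_-expansion>"), ("^TO", "<TO-expansion>")]  # longest first


def expand(pattern: str) -> str:
    """Replace ^TO_ / ^TO wherever they appear literally (case-sensitively)."""
    out = []
    i = 0
    while i < len(pattern):
        for token, replacement in TOKENS:
            if pattern.startswith(token, i):
                out.append(replacement)
                i += len(token)
                break
        else:
            # No token starts here, so copy the character through unchanged.
            out.append(pattern[i])
            i += 1
    return "".join(out)


for p in ["^TOA", "^(TOA)", r"^\TOA", r"^T\OA", "^toa", "TOA"]:
    print(f"{p!r:12} -> {expand(p)!r}")
```

In this model only the first pattern changes: a parenthesis, a backslash, lower case, or the absence of the caret each keeps the literal text ^TO from appearing, so nothing is expanded.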

| ... or what if you want something
| that was sent to _AllArchitects but have "TO" and not "TO_"?

If your version of procmail has ^TO but not ^TO_, then ^TO_AllArchitects is
no problem; it's only a problem if it has both and you want
(^TO)(_AllArchitects).  Then
^TO(_AllArchitects) or ^TO(_)AllArchitects or ^TO[_]AllArchitects are
among the possible solutions, and so is your own suggestion below of
^TO\_AllArchitects.
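
The two readings can be compared with a small Python sketch.  The expansion strings below are the ones printed in later procmailrc(5) man pages; that they equal the expansions in the version under discussion is an assumption, as are the helper name matches() and the example.com addresses.

```python
import re

# Expansions as printed in later procmailrc(5) man pages (assumed here).
TO = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|"
      r"(X-Envelope|Apparently(-Resent)?)-To):(.*[^a-zA-Z])?)")
TO_ = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|"
       r"(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)")


def matches(regex: str, header: str) -> bool:
    """Case-insensitive search, with ^ anchored at the start of any line."""
    return re.search(regex, header, re.IGNORECASE | re.MULTILINE) is not None


as_to_underscore = TO_ + "AllArchitects"    # ^TO_AllArchitects: the ^TO_ token
as_to_then_text = TO + "(_AllArchitects)"   # ^TO(_AllArchitects): ^TO + literal

with_underscore = "To: _AllArchitects@example.com"
without_underscore = "To: AllArchitects@example.com"

for header in (with_underscore, without_underscore):
    print(header,
          matches(as_to_underscore, header),
          matches(as_to_then_text, header))
```

Under these assumptions the ^TO_ reading matches only the address without the leading underscore, and the ^TO(_AllArchitects) reading matches only the address with it.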

Again, in most cases the search is case-insensitive, so you can mark the
division visibly by switching to lower case there: ^TOhouseplants, for example.

| I suggest requiring something like:
|   * ^{TO_}AllArchitects

If you feel you must have a separator, this already works:

    * (^TO_)AllArchitects

| [or]
|   * ^TO_\AllArchitects

That works too in the existing code.  If you like it, then it is a good
solution to your objection, at least when the character after the backslash
doesn't lose or gain magic when there is a backslash before it.  Parentheses
around the token are more robust than a backslash after it, but in most cases
the character after the token will be a letter or a digit, so a backslash
will not hurt.  And if the first character of the pattern that matches the
addressee isn't a letter or a digit, then that in itself provides a visible
separation.

| but the magic of "TODAY" matching any "To: DAY" is bizzare IMHO.

TODAY doesn't match "To: DAY".  ^TODAY would, though.  You have to keep in
mind that there is no token expansion if there's no caret.

| And 'TO.*DAY' makes perfect sense to me.

Again, TO with no caret is literal text to procmail, so anything starting
with "TO" will not run into expansion.  ^TO.*DAY, however, is a bad idea,
because it eviscerates ^TO.  ^TODAY will not match "To: FARADAY", for
example, but ^TO.*DAY will, and you usually don't want it to.
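
A short Python sketch of these three cases follows.  It assumes the ^TO expansion printed in later procmailrc(5) man pages, which may not be identical to the one in the version discussed here; the helper name matches() is invented for the example.

```python
import re

# ^TO expansion as printed in later procmailrc(5) man pages (assumed here).
TO = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|"
      r"(X-Envelope|Apparently(-Resent)?)-To):(.*[^a-zA-Z])?)")


def matches(regex: str, header: str) -> bool:
    """Case-insensitive search, with ^ anchored at the start of any line."""
    return re.search(regex, header, re.IGNORECASE | re.MULTILINE) is not None


no_caret = "TODAY"          # TODAY with no caret: literal text, no expansion
token = TO + "DAY"          # ^TODAY: the ^TO token followed by DAY
weakened = TO + ".*DAY"     # ^TO.*DAY: the word-start check is defeated

for header in ("To: DAY", "To: FARADAY"):
    print(header,
          matches(no_caret, header),
          matches(token, header),
          matches(weakened, header))
```

With these inputs the literal TODAY matches neither header, ^TODAY matches only "To: DAY", and ^TO.*DAY matches both, including the unwanted FARADAY.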

| Just please do *something* that makes syntactic sense.  Knowing that a
| string "ABCDEFGHI" has an implicit token terminator after the "D" is
| syntactically bizarre (for example).

Stan, it's just not what you're used to.  "Bizarre" is a very strong
judgment when something merely differs from one's own tastes.  You can write
your procmailrc code with a separator there, as in the examples I gave above
and the one you came up with yourself.  If you're more comfortable with some
visible separation, then by all means use some.  It *is* permitted, and I do
not understand why you want it to be mandatory.

| Otherwise, how is anyone to know just *what* characters up front
| might trigger a macro?  Yes, I can remember TO and TO_ ...

Well, forget them.  Remember ^TO and ^TO_ instead.

| ... but when the next procmail comes out with more such "invisible" macros,
| weird things will happen.

They aren't invisible.  They're detailed in the procmailrc(5) man page.  And
you certainly do have a point: such tokens shouldn't be allowed to proliferate
beyond all bound.  So far there are only two tokens that act that way.  The
other two existing tokens, ^FROM_MAILER and ^FROM_DAEMON, are meant to
stand alone and represent patterns for an entire header line, so one doesn't
concatenate any text to the end of them.  (One might alternate something with
them, but then one uses a pipe symbol, which qualifies as a noticeable
separator.)  So all we have to remember are ^TO and ^TO_.  And if you prefer to
write (^TO) and (^TO_) in your own code, go forth in good health.

| OK, I'm done griping.  But authors, please take note.

Stephen is the only (and very singular) author.

| > ^TO and ^TO_ already end in an expression that is better than ".*";
| > adding ".*" after them seriously weakens their function.

| How does this weaken them?

The tokens expand to expressions that won't be matched by anything ending in
the middle of a word.  (The difference between ^TO and ^TO_ is one of
interpretation of what constitutes the middle of a word.  For ^TO, any
character except a letter or a digit is a word break, but for ^TO_ hyphens,
underscores, and [embedded] periods are not word breaks.)  The full expansions
are in the procmailrc(5) man page.  That way you'll get a match only if the
text you specify after ^TO or ^TO_ appears at the start of a word.  Inserting
".*" undoes that, because the ".*" will match anything, including the first
part of a word that has your specified text starting somewhere in the middle.
Using "^TO_.*stan" instead of "^TO_stan" (or "^TO_\stan" if you prefer) will
match mail addressed to user names like dunstan and constance, to someone who
mentions Afghanistan in the (Real Name) part, and to people-no-one-can-stand.
(The last is also matched by "^TOstan", but the first three are not.)  [Yes, I
see that your logname is stanr and not just stan, but it's hard to come up
with examples for that.]
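
The stan examples can be checked with a Python sketch.  The two expansion strings are those printed in later procmailrc(5) man pages and may differ in detail from the version discussed here; the example.com addresses, the comment text around Afghanistan, and the helper name matches() are made up for illustration.

```python
import re

# Expansions as printed in later procmailrc(5) man pages (assumed here).
TO = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|"
      r"(X-Envelope|Apparently(-Resent)?)-To):(.*[^a-zA-Z])?)")
TO_ = (r"(^((Original-)?(Resent-)?(To|Cc|Bcc)|"
       r"(X-Envelope|Apparently(-Resent)?)-To):(.*[^-a-zA-Z0-9_.])?)")


def matches(regex: str, header: str) -> bool:
    """Case-insensitive search, with ^ anchored at the start of any line."""
    return re.search(regex, header, re.IGNORECASE | re.MULTILINE) is not None


patterns = {
    "^TO_stan": TO_ + "stan",
    "^TO_.*stan": TO_ + ".*stan",
    "^TOstan": TO + "stan",
}

headers = [
    "To: stanr@example.com",
    "To: dunstan@example.com",
    "To: constance@example.com",
    "To: someone@example.com (Back from Afghanistan)",
    "To: people-no-one-can-stand@example.com",
]

for header in headers:
    hits = [name for name, regex in patterns.items() if matches(regex, header)]
    print(f"{header}: {hits}")
```

Under these assumptions ^TO_stan matches only the stanr header, ^TO_.*stan matches all five, and ^TOstan matches stanr and people-no-one-can-stand, because ^TO counts the hyphen as a word break.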

Now if Mark had wanted to match HAllArchitects and MAllArchitects and
SmAllArchitects and Well-Of-All-the-GAllArchitects as well as AllArchitects,
then he'd have a good reason for using ^TO.*AllArchitects.  But I would guess that
he doesn't want to; and if he did, the ".*" would be there to match the
addressee, not to match the text leading up to the addressee's name.

But all told, it can be done in one way that you dislike but also in many
that you don't dislike, so I think the answer is for you to select a syntax
that pleases you, not to have the code rewritten so that everyone else has
to follow your choice.
