procmail

Re: bad message id's

1998-03-18 16:35:02
Sean Straw asked (about these recipes for testing the validity of
Message-Id:'s),

| As for newlines, I would have thought that Procmail would have
| reconcatenated it into a single line, ala To, cc, and received.  I don't
| think there'd be any newlines in it at that point.  Could someone more
| knowledgeable about such internals confirm?

In examining header fields with continuation lines, procmail treats the
embedded newline as a space and would match it to a space (or an unescaped
magic period) in the regexp.  Such embedded newlines definitely do NOT match
either ^ or $.

The space(s) or tab(s) that indent the continuation line are preserved,
so a space that represents an embedded newline will always be followed by
another space or a tab, but I can find no way to write a regexp that
differentiates, for example, <embedded newline><tab> from <literal space><tab>
entirely within procmail, and in a legitimate Message-Id: the former would
not need to be quoted but the latter would.
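
To make the ambiguity concrete, here is a small shell sketch.  It only
simulates the newline-as-space treatment described above, using tr; it does
not run procmail, and the helper name and the sample IDs are made up for
illustration.

```shell
# Simulation only: mimic the described treatment of an embedded newline
# as a space.  as_seen_by_regexp is a hypothetical helper, not procmail.
as_seen_by_regexp() {
    printf '%s' "$1" | tr '\n' ' '
}

folded=$(printf '<abc\n\tdef@example.com>')   # embedded newline, then tab
spaced=$(printf '<abc \tdef@example.com>')    # literal space, then tab

# The two raw values differ, but the simulated views are identical.
if [ "$(as_seen_by_regexp "$folded")" = "$(as_seen_by_regexp "$spaced")" ]
then
    echo "indistinguishable once the newline is read as a space"
fi
```

The raw strings differ, yet after the substitution a pattern has nothing
left to tell them apart, which is why the unfolding has to happen before
the regexp sees the value.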

One can fork oneself silly, of course, using other programs:

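# The bracket in the sed script is meant to hold one space and one tab.
# No trailing "p": without -n, sed already prints the result once.
:0h
msgid=|formail -z -U Message-Id: -x Message-Id: | sed -e 'H; $ !d; g; s/\n[      ]*//g'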

:0
* msgid ?? ^^<([^"      @>]|\\"|"(\\"|[^"])*")+@\
              ([^"      @>]|\\"|"(\\"|[^"])*")+>^^
routine_for_legit-looking_IDs

and I'm sure that that will still miss some legitimate IDs.
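
For trying the idea outside procmail, the following is a rough shell sketch
of the same two steps: unfold the value, then test it against the pattern.
The function names and sample IDs are hypothetical, [[:blank:]] stands in
for the space-and-tab bracket, and grep -E with ^ and $ is only an
approximation of procmail's own regexp engine and its ^^ anchors.

```shell
# unfold: join continuation lines, dropping each newline together with
# the blanks that indent the following line.
unfold() {
    sed -e 'H; $!d; g; s/\n[[:blank:]]*//g'
}

# looks_legit: rough grep -E rendering of the recipe's pattern.
# Assumption: ^ and $ stand in for procmail's ^^ anchors here.
looks_legit() {
    part='([^"[:blank:]@>]|\\"|"(\\"|[^"])*")+'
    printf '%s\n' "$1" | grep -Eq "^<${part}@${part}>\$"
}

id=$(printf '<abc\n\tdef@example.com>\n' | unfold)
echo "$id"
looks_legit "$id" && echo "looks legit"
looks_legit '<a b@example.com>' || echo "rejected: unquoted space"
```

A quoted local part such as <"a b"@example.com> passes this check, while
the same ID without the quotes does not.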
