Sean Straw asked (about these recipes for testing the validity of
Message-Id:'s),
| As for newlines, I would have thought that Procmail would have
| reconcatenated it into a single line, ala To, cc, and received. I don't
| think there'd be any newlines in it at that point. Could someone more
| knowledgeable about such internals confirm?
When it examines a header field with continuation lines, procmail treats
each embedded newline as a space, so it matches a space (or an unescaped
magic period) in the regexp. Such embedded newlines definitely match
neither ^ nor $.
The space(s) or tab(s) that indent the continuation line are preserved,
so a space that represents an embedded newline will always be followed by
another space or a tab. I can find no way, though, to write a regexp that
differentiates, for example, <embedded newline><tab> from <literal space><tab>
entirely within procmail, and in a legitimate Message-Id: the former would
not need to be quoted but the latter would.
One can fork oneself silly, of course, using other programs:
# the brackets in the sed expression hold a space and a tab
:0h
msgid=|formail -zUxMessage-Id: | sed -e 'H; $ !d; g; s/\n[ 	]*//g'
:0
* msgid ?? ^^<([^" @>]|\\"|"(\\"|[^"])*")+@\
([^" @>]|\\"|"(\\"|[^"])*")+>^^
routine_for_legit-looking_IDs
and I'm sure that that will still miss some legitimate IDs.
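
For anyone who wants to poke at the two pieces outside procmail, here is a
rough shell sketch. It is only an approximation: unfold uses [[:blank:]] in
place of a literal space and tab and relies on sed's automatic print, and
looks_legit hands the same pattern to grep -E with ^ and $ standing in for
procmail's ^^ anchors, assuming the separator between the two halves is a
literal @. Both function names and the sample IDs are made up for
illustration, and procmail's own regexp engine may differ from grep's in
corner cases.

```shell
#!/bin/sh
# Join continuation lines: collect every line in the hold space, then on
# the last line delete each newline together with the indentation after it.
unfold() {
    sed -e 'H; $!d; g; s/\n[[:blank:]]*//g'
}

# Approximate the Message-Id test with grep -E; ^ and $ replace ^^.
looks_legit() {
    printf '%s\n' "$1" |
        grep -Eq '^<([^" @>]|\\"|"(\\"|[^"])*")+@([^" @>]|\\"|"(\\"|[^"])*")+>$'
}

# A folded value (hypothetical) whose second line is indented with a tab.
id=$(printf '<part1\n\t@example.com>\n' | unfold)
echo "$id"

for candidate in "$id" '<"abc def"@example.com>' '<abc def@example.com>'
do
    if looks_legit "$candidate"
    then echo "legit-looking: $candidate"
    else echo "rejected:      $candidate"
    fi
done
```

The unquoted space in the last candidate is what gets it rejected; wrapped
in double quotes, the same space passes.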