procmail

Re: Message-ID syntax

1999-10-18 02:22:32
++ 16/10/99 08:59 -0500 - Philip Guenther:
While it is true that the high-level syntax of a Message-Id: header does
not mention comments or whitespace, this is because they both disappear
during the lexical analysis.  To quote rfc822, section 3.1.2:

       Note:  Any field which has a field-body  that  is  defined  as
              other  than  simply <text> is to be treated as a struc-
              tured field.

Then in section 3.1.4:

       To aid in the creation and reading of structured  fields,  the
       free  insertion   of linear-white-space (which permits folding
       by inclusion of CRLFs)  is  allowed  between  lexical  tokens.

Then follows a precise listing of the lexical tokens of a structured
header field.

Wow... I had never noticed that before! So, to put it in less 'expensive'
words: one is allowed to insert spaces between the tokens to increase
readability? In other words:

        Message-ID: < local_part @ domain_part >

...is an RFC-valid Message-ID?

Additionally, though it is not very important: should this trailing 'D' be
capitalized or not? In the RFC I only see it with a capitalized 'D' at
the end, whereas you almost always write it with a small 'd'.

The reason your condition matches too often is that the at-sign is
doubled in it:
      *$ ! ^Message-Id:$ws<$ws$local_part$ws@@$ws$domain$ws>
                                            ^^
Remove one of those.

Stupid... typo.
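
To see why the doubled at-sign makes the recipe fire on nearly every message, here is a small Python sketch of the same regexp. Python's re module is only a stand-in for procmail's matcher, and the local_part and domain definitions are hypothetical placeholders, since my real ones are not shown in this thread. The sample Message-ID is made up as well.

```python
import re

# Hypothetical stand-ins for the recipe's variables; the real
# definitions of local_part and domain are not shown in this thread.
ws = r"[ \t]*"
local_part = r"[^\s<>@()]+"
domain = r"[^\s<>@()]+"

def build(at):
    # Mirrors ^Message-Id:$ws<$ws$local_part$ws<at>$ws$domain$ws>
    # procmail matches case-insensitively by default, hence IGNORECASE.
    return re.compile(
        "^Message-Id:" + ws + "<" + ws + local_part + ws
        + at + ws + domain + ws + ">",
        re.IGNORECASE,
    )

doubled = build("@@")  # the typo
fixed = build("@")     # the corrected condition

header = "Message-ID: <19991018022232.A1234@example.org>"

# The recipe negates the match with '!', so "no match" means the
# condition is true and the message is treated as malformed.
flagged_by_typo = doubled.search(header) is None
flagged_by_fix = fixed.search(header) is None

print(flagged_by_typo)  # True: a valid header is flagged
print(flagged_by_fix)   # False: a valid header passes
```

With the typo, no header containing a single at-sign can ever match, so the negated condition is true for practically all mail.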

Finally, I'll note that rfc822 actually allows comments in Message-Id:
headers (indeed, comments are one of the lexical tokens listed in section
3.1.4).  While it is impossible to match arbitrarily nested parens with
a regular expression, it is simple to match one level of parens, and
given that there's a Banyan Vines MTA that includes a comment in the
local part of the Message-Id: header, I would recommend changing the
'ws' definition to the following:
      ws="[   ]*(\([^()]*\)[  ]*)?"

(Yes, that _could_ be
      ws="[   ]*(\([^()]*\)[  ]*)*"
but I have yet to see a Message-Id: header with two comments in a row,
and I don't feel like giving that much slack to loser MTA/MUA writers.)

But strictly speaking (going by what the RFC says) it is possible to have
two comments in a row. Correct? If so, I prefer the latter one.
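
For comparison, a rough Python sketch of how the two 'ws' definitions behave. Again Python's re module only stands in for procmail's matcher, local_part and domain are hypothetical placeholders, and the sample headers are invented. I have written the bracket contents as an explicit space and tab.

```python
import re

# The two candidate definitions of ws: at most one comment ("?")
# versus any number of comments in a row ("*").
ws_one = r"[ \t]*(\([^()]*\)[ \t]*)?"
ws_many = r"[ \t]*(\([^()]*\)[ \t]*)*"

# Hypothetical placeholders; the real definitions are not in this thread.
local_part = r"[^\s<>@()]+"
domain = r"[^\s<>@()]+"

def build(ws):
    # Mirrors ^Message-Id:$ws<$ws$local_part$ws@$ws$domain$ws>
    return re.compile(
        "^Message-Id:" + ws + "<" + ws + local_part + ws
        + "@" + ws + domain + ws + ">",
        re.IGNORECASE,
    )

one = build(ws_one)
many = build(ws_many)

single = "Message-ID: <abc (Vines) @example.org>"
double = "Message-ID: <abc (one) (two) @example.org>"
nested = "Message-ID: <abc (outer (inner)) @example.org>"

results = {
    "one/single": one.search(single) is not None,
    "one/double": one.search(double) is not None,
    "many/double": many.search(double) is not None,
    "many/nested": many.search(nested) is not None,
}
for name, matched in results.items():
    print(name, matched)
```

Both variants accept a single comment, only the "*" variant accepts two comments in a row, and neither accepts nested parens, which is the limitation of regular expressions mentioned above.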

Also, does the RFC allow comments in both the local and the domain parts?
If not, I'll change the regexp a little so that it only matches comments in
the local part.

Thanks for the clarification.

        -Rejo.

-- 
= Rejo Zenger  [Sister Ray Crisiscentrum]               
rejo(_at_)sisterray(_dot_)xs4all(_dot_)nl
= http://mediaport.org/~sister                PGP: RSA FAE40065, DSS/DH 2C8059B5
--------------------------------------------------------------------------------
