"Mark" == Mark Crispin <mrc(_at_)cac(_dot_)washington(_dot_)edu> writes:
Mark> On Sat, 22 Feb 2003, Andrew Gierth wrote:
Mark> > Mail systems do not appear to make any actual use of the
Mark> > message-id header other than for logging (and internally in some
Mark> > odd systems).
Mark> Huh? Message-ID, In-Reply-To, and References are heavily used
Mark> in mail for threading. IMAP even has a THREAD facility that
Mark> stipulates how to use these.
Perhaps I was insufficiently precise: I was referring specifically to
the transport and distribution of messages rather than to the clients.
None of SMTP, POP3 or IMAP ever use message-ids as protocol
parameters. News, on the other hand, uses message-ids as protocol
parameters _all the time_, both for readers accessing messages and for
server-to-server transfer.
Mark> Let's try to translate your comment into something useful. Is it your
Mark> contention that:
Mark> Some set of commands in NNTP use message-id as an argument.
Mark> The syntax for that command precludes the use of space in
Mark> that argument, and/or NNTP lacks the type of quoting mechanism
Mark> found in IMAP and SMTP.
Mark> If so, please provide the details.
The NNTP spec can be found in RFC 977 (and the "Common extensions"
document, RFC 2980 if I recall correctly).
Message-ids appear in NNTP as both command parameters and response
parameters (neither of which can contain spaces, though this isn't
spelled out for command parameters - it's just a consequence of the
fact that spaces separate parameters and there is no quoting or
escaping mechanism). For ARTICLE/HEAD/BODY/STAT, the command parameter
can be either a message-id or an article number (or omitted), but the
response string always contains the message-id as a parameter. For the
CHECK/TAKETHIS extension, the message-id must be given as a parameter
and is returned in the result, and the appearance of the id in the
result is always used by the sending server (to match up commands and
responses, since CHECK/TAKETHIS are used for pipelined transfer). For
IHAVE, the message-id is a parameter but does not necessarily appear
in the result.
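
The parameter problem described above can be sketched in a few lines. This is only an illustration, assuming a server that splits each command line on whitespace; `parse_command` is a hypothetical helper, not code from any real news server.

```python
from typing import List, Tuple


def parse_command(line: str) -> Tuple[str, List[str]]:
    """Split an NNTP-style command line into (keyword, parameters)."""
    # Parameters are separated by whitespace and there is no quoting
    # or escaping, so a plain split is all a parser can do.
    parts = line.rstrip("\r\n").split()
    if not parts:
        raise ValueError("empty command line")
    return parts[0].upper(), parts[1:]


# A well-formed id arrives as exactly one parameter.
print(parse_command("CHECK <abc123@example.com>\r\n"))
# An id containing a space cannot be told apart from two parameters.
print(parse_command("CHECK <abc 123@example.com>\r\n"))
```

The second call yields two parameters, neither of which is the message-id that the sender meant.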
Breakage on CHECK/TAKETHIS is particularly bad because the resulting
syntax error response does not allow the sending server to properly
identify the failing message; this usually halts the flow of articles
on the affected link as the sending server continually retries the bad
article (until the sending system's admin notices and manually deletes
the bad entry from the outgoing queue). This is why it's important for
servers to actively reject bad message-ids upfront rather than try to
propagate them.
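
A rough sketch of why the sender gets stuck, assuming it matches pipelined responses to outstanding commands by the message-id echoed in each response. The helper `match_response`, the sample ids, and the response wording are made up for illustration; only the idea of matching on the echoed id comes from the description above.

```python
from typing import Dict, Optional


def match_response(pending: Dict[str, str], response: str) -> Optional[str]:
    """Return the pending message-id this response refers to, or None."""
    fields = response.split()
    # The sender relies on the id echoed as the second field.
    if len(fields) >= 2 and fields[1] in pending:
        return fields[1]
    return None


# Three CHECK commands sent back to back; the second id contains a space.
pending = {
    "<a1@example.com>": "sent",
    "<bad id@example.com>": "sent",
    "<a3@example.com>": "sent",
}

# Illustrative response lines: the error response carries no message-id.
responses = [
    "238 <a1@example.com>",
    "501 syntax error",
    "238 <a3@example.com>",
]

for line in responses:
    msgid = match_response(pending, line)
    if msgid is not None:
        del pending[msgid]

# The bad entry never gets an answer, so it stays queued.
print(list(pending))
```

Nothing in the error response ties it to the bad entry, so in this sketch that entry simply remains outstanding and would be retried.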
The length limit of 250 comes from existing software rather than
from previous standards (NNTP restricts the total length of a
command-line to 512 including the CRLF). 250 is a better choice than
whatever would fit in 512 because (a) values that long do not
legitimately occur in practice, (b) there is no particular reason why
excessively long values should be allowed, especially given the large
number of CHECK commands, and (c) the shorter limit allows extensions
to add parameters to commands such as CHECK without also having to
extend the command-line limit.
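
To make the arithmetic concrete, here is a minimal sketch of an upfront check plus the room left on a command line. The 512 and 250 figures are the ones given above; the shape check is a deliberate simplification (angle brackets, ASCII only, no whitespace or control characters), not the full message-id grammar, and it assumes the 250 limit counts the angle brackets. Both function names are hypothetical.

```python
import re

MAX_COMMAND_LINE = 512  # NNTP command-line limit, CRLF included
MAX_MESSAGE_ID = 250    # limit taken from existing software

# Simplified shape: "<", then no whitespace, control characters or
# angle brackets, then ">". An assumption for illustration only.
_SHAPE = re.compile(r"<[^\s<>\x00-\x1f\x7f]+>")


def acceptable_message_id(msgid: str) -> bool:
    """Reject ids that are too long, non-ASCII, or contain whitespace."""
    return (
        len(msgid) <= MAX_MESSAGE_ID
        and msgid.isascii()
        and _SHAPE.fullmatch(msgid) is not None
    )


def spare_octets(keyword: str, id_length: int = MAX_MESSAGE_ID) -> int:
    """Octets left on a '<keyword> <message-id>CRLF' line for other parameters."""
    return MAX_COMMAND_LINE - len(keyword) - 1 - id_length - 2


print(acceptable_message_id("<abc123@example.com>"))  # True
print(acceptable_message_id("<abc 123@example.com>"))  # False
print(spare_octets("CHECK"))  # 254
```

With a maximum-length id, a CHECK line still has 254 octets free for any parameters an extension might add.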
(And, FWIW, I checked about a hundred thousand message-ids taken from
mail messages here, and only one of those contained whitespace: a
Chinese spam with four spaces in place of the domain-part, almost
certainly due to misconfigured or broken spamming software.)
Mark> That isn't an example of a syntactically valid RFC 2822 message-id.
It wasn't intended to be. I found no examples of syntactically valid
RFC 2822 message-ids in my mail archives which were not also valid for
news.
Mark> And the IETF/IESG is supposed to respect this?
Yes.
Mark> If NNTP has a limitation such as suggested above, then that
Mark> would add a transport imposed requirement to a document that
Mark> otherwise uses RFC 2822 and MIME as normative. That is
Mark> something that can be respected.
Mark> That is different from duplicating all of RFC 2822 and MIME
Mark> just to insert transport imposed requirements. Doing so, which
Mark> the Usefor document has done, is ludicrous.
I entirely agree (and you'll find a long string of comments to that
effect from me in the Usefor archive).
Mark> What's worse, it has the highly undesirable effect of cloaking
Mark> the transport-imposed requirements.
Mark> A small document, such as Kohn's draft, which has a section of
Mark> "NNTP transport requirements" would show these requirements in
Mark> clear contrast.
It is not, in fact, merely an NNTP issue; message-ids are a key part
of _every_ news transport method (since they are used to detect
duplicates). Every news server has to keep a history database (or the
moral equivalent) indexed by message-id, whether it uses NNTP or not.
Message-ids are passed around a lot internally, stored in queue files,
listed in ihave/sendme control messages for UUCP feeding, appear as
parameters in control messages, and generally pervade the entire
structure of news; whitespace or excessive length breaks things in
really quite a lot of places.
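
As a toy illustration of the duplicate detection mentioned above: the `History` class below is a hypothetical in-memory stand-in, assuming nothing about how any real server stores its history, and the transport labels are only there to show that the check is independent of how an article arrived.

```python
class History:
    """Toy in-memory stand-in for a history database keyed by message-id."""

    def __init__(self) -> None:
        self._seen = set()

    def offer(self, msgid: str) -> bool:
        """Record msgid; return True if it is new, False if a duplicate."""
        if msgid in self._seen:
            return False
        self._seen.add(msgid)
        return True


history = History()
arrivals = [
    ("nntp", "<a1@example.com>"),
    ("uucp", "<a1@example.com>"),  # same article, different transport
    ("nntp", "<a2@example.com>"),
]

# Only the first copy of each message-id is accepted.
accepted = [(via, msgid) for via, msgid in arrivals if history.offer(msgid)]
print(accepted)
```

The message-id is the only key here, which is why every component that stores or passes it along has to agree on what a usable id looks like.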
--
Andrew.