Hi,
recently I have had some problems with messages being threaded
inappropriately in my (IMAP) inbox.
Being the kind of person I am, I more or less hunted it down, finding these
gems:
1) Header:
In-Reply-To: Message from Harald Tveit Alvestrand
<harald(_at_)alvestrand(_dot_)no>
of "Fri, 06 Jul 2001 13:17:51 +0200."
<759246899(_dot_)994425471(_at_)[192(_dot_)168(_dot_)1(_dot_)31]>
2) RFC 2822:
4.5.4. Obsolete identification fields
The obsolete "In-Reply-To:" and "References:" fields differ from the
current syntax in that they allow phrase (words or quoted strings) to
appear. The obsolete forms of the left and right sides of msg-id
allow interspersed CFWS, making them syntactically identical to
local-part and domain respectively.
obs-message-id = "Message-ID" *WSP ":" msg-id CRLF
obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF
Note the absence, inherited from 822, of anything indicating the proper
content of a "phrase".
3) THREAD specification - draft-ietf-imapext-thread-07, section 6.3:
If a message does not contain a References header line, or
the References header line does not contain any valid
Message IDs, then use the FIRST (if any) valid Message ID
found in the In-Reply-To header line as the only reference
(parent) for this message.
Note: Although RFC 822 permits multiple Message IDs in
the In-Reply-To header, in actual practice this
discipline has not been followed. For example,
In-Reply-To headers have been observed with email
addresses after the Message ID, and there are no good
heuristics for software to determine the difference.
This is not a problem with the References header however.
If a message does not contain an In-Reply-To header line, or
the In-Reply-To header line does not contain a valid Message
ID, then the message does not have any references (NIL).
My capitalization.
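To make the failure concrete, here is a minimal sketch of the "first valid Message ID" rule applied to a header shaped like the one in item 1. The regular expression is only a loose approximation of msg-id (it is not the full RFC 2822 grammar), and the function name, sender name and address are placeholders of mine, not taken from any implementation.

```python
import re

# Loose approximation of msg-id: "<left@right>" with no whitespace,
# angle brackets or extra "@" inside. Not the full RFC 2822 grammar.
MSG_ID_RE = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")

def first_msg_id(in_reply_to):
    """Return the first msg-id-shaped token in the header value, or None."""
    match = MSG_ID_RE.search(in_reply_to)
    return match.group(0) if match else None

# Same shape as the header in item 1; the sender is a placeholder.
header = ('Message from Some Sender <sender@example.org> '
          'of "Fri, 06 Jul 2001 13:17:51 +0200." '
          '<759246899.994425471@[192.168.1.31]>')

parent = first_msg_id(header)  # picks the email address, not the real ID
```

Since an email address in angle brackets is syntactically indistinguishable from a msg-id, the "first" rule selects the address as the parent.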
Even more worrisome is this header:
References: <harald(_at_)alvestrand(_dot_)no>
<257869316(_dot_)998951974(_at_)[192(_dot_)168(_dot_)1(_dot_)31]>
<200108281438(_dot_)f7SEci101350(_at_)hygro(_dot_)adsl(_dot_)duke(_dot_)edu>
<E15bttt-0004pr-00(_at_)roam(_dot_)psg(_dot_)com>
<4475164(_dot_)999069825(_at_)localhost>
which was apparently created on the basis of a different incarnation of the
previous one, following the algorithm of RFC 2822 section 3.6.4:
The "References:" field will contain the contents of the parent's
"References:" field (if any) followed by the contents of the parent's
"Message-ID:" field (if any). If the parent message does not contain
a "References:" field but does have an "In-Reply-To:" field
containing a single message identifier, then the "References:" field
will contain the contents of the parent's "In-Reply-To:" field
followed by the contents of the parent's "Message-ID:" field (if
any). If the parent has none of the "References:", "In-Reply-To:",
or "Message-ID:" fields, then the new message will have no
"References:" field.
(It seems to have missed the part about the In-Reply-To field containing a
single message identifier, though...)
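For comparison, here is a rough sketch of the 3.6.4 rule as quoted, including the "single message identifier" check. The function and parameter names are mine, and the msg-id pattern is the same loose approximation rather than the real grammar.

```python
import re

# Loose approximation of msg-id; not the full RFC 2822 grammar.
MSG_ID_RE = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")

def build_references(parent_refs, parent_in_reply_to, parent_msg_id):
    """Build a reply's References value from the parent's header values.

    Each argument is the parent's field value as a string, or None if
    the parent lacks that field. Returns None if no field is generated.
    """
    parts = []
    if parent_refs:
        # Parent has References: copy its contents.
        parts.append(parent_refs.strip())
    elif parent_in_reply_to:
        # Fall back to In-Reply-To only if it holds exactly one identifier.
        ids = MSG_ID_RE.findall(parent_in_reply_to)
        if len(ids) == 1:
            parts.append(ids[0])
    if parent_msg_id:
        parts.append(parent_msg_id.strip())
    return " ".join(parts) if parts else None
```

With an In-Reply-To that holds both an address and a message-ID, the pattern finds two identifier-shaped tokens, so the field is skipped and only the parent's Message-ID ends up in References.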
The product creating the initial problem seems to be an MH variant's
group-reply-to init file (replgroupcomps); we can fix it one install at a
time, but this seems like a glorious time for clarifications....
Suggested fixes:
1) In RFC 2822bis, state that the msg-id form of obs-in-reply-to MUST
contain a message-ID, and NOT an email address (a "phrase" cannot contain
an unquoted angle bracket, so it is only the msg-id that allows it)
2) In RFC 2822bis section 3.6.4, state that the In-Reply-To field should only
be used to form References if it contains a single message-ID, and that the
reason is that users of obs-in-reply-to sometimes put email addresses in
their In-Reply-To fields.
3) Unless someone says that a reasonably widespread implementation exists
that puts FIRST a message-ID and THEN an email address into the In-Reply-To
field, change the THREAD specification to pick up the LAST instead of the
FIRST identifier.
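As an illustration of fix 3, here is a sketch of picking the LAST identifier instead of the first. The names are hypothetical and the pattern is only a loose stand-in for msg-id.

```python
import re

# Loose approximation of msg-id; not the full RFC 2822 grammar.
MSG_ID_RE = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")

def last_msg_id(in_reply_to):
    """Return the last msg-id-shaped token in the header value, or None."""
    ids = MSG_ID_RE.findall(in_reply_to)
    return ids[-1] if ids else None

# Same shape as the problem header; the sender is a placeholder.
header = ('Message from Some Sender <sender@example.org> '
          'of "Fri, 06 Jul 2001 13:17:51 +0200." '
          '<759246899.994425471@[192.168.1.31]>')

parent = last_msg_id(header)  # the real message-ID this time
```

This only helps for headers that put the address before the message-ID; a header with the opposite order would still pick the wrong token, which is why the question about existing implementations matters.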
What do people think?
Harald