ietf-822

Subject field hacks and machine processing of "only human-readable information"

2004-11-30 17:50:52

On Tue November 30 2004 18:27, Keith Moore wrote:

what's wrong with abbreviations?

Not much per se.

why should I have to say "reply to  
your message: Attempts at establishing harmful conventions" instead of 
using "Re:"?

If this were a century ago and taking place via pieces
of paper, something like "Re: your correspondence of
30 November" might be appropriate.  For quite some
time there has been provision in the message
format for something similar (via the In-Reply-To
field, which until RFC 2822 was permitted to contain a
phrase such as the above).  However, it has been
unnecessary to do so in the Subject field (doing so
is redundant with In-Reply-To).  In the specific
instance cited, you could simply leave the Subject
field unaltered, or change it if the topic changes
(as I have done in this message).  Unlike century-old
paper correspondence, modern electronic messages
carry with them a set of references to previous
correspondence, making explicit reference in the
Subject field unnecessary.
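
A minimal sketch of that point, assuming Python's standard
email package; the function name is_reply and the example.org
addresses and message identifiers are hypothetical.  It decides
whether a message is a response by looking only at the
In-Reply-To and References fields, never at the Subject field:

```python
from email import message_from_string


def is_reply(raw_message: str) -> bool:
    """Return True if the header carries a non-empty In-Reply-To or References field."""
    msg = message_from_string(raw_message)
    return any(msg.get(name, "").strip() for name in ("In-Reply-To", "References"))


# Hypothetical original message: no reference fields at all.
original = (
    "From: a@example.org\n"
    "Message-ID: <1@example.org>\n"
    "Subject: Attempts at establishing harmful conventions\n"
    "\n"
    "First message in the thread.\n"
)

# Hypothetical response: the Subject has changed with the topic and carries
# no "Re: ", yet the reference fields still identify it as a response.
reply = (
    "From: b@example.org\n"
    "Message-ID: <2@example.org>\n"
    "In-Reply-To: <1@example.org>\n"
    "References: <1@example.org>\n"
    "Subject: Subject field hacks and machine processing\n"
    "\n"
    "A response with a new topic.\n"
)

print(is_reply(original), is_reply(reply))
```

Under these assumptions the first message is reported as not
being a response and the second as being one, with no
inspection of the Subject field.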

more to the point, why is it a bad idea for the subject  
to indicate that the message is a reply or some other kind of response? 

It's redundant, as noted above. That alone is a
small flaw; it gets really bad when mindless
machine processing happens with the assumption
that the string "Re: " is *always* an indication
of a response or that it is *always* redundant cruft
added to indicate that a message is a response.

  why is it a bad idea for the subject to give some indication of 
message topic?

It's not; but unless we're discussing rhenium, or
musical notes, etc., "Re" isn't the topic per se.

where do you draw the line between what belongs in the  
subject and what belongs elsewhere?

Personally I would draw it at the point where a
human author indicates a topic (either directly
in a message, or in canned boilerplate for an
automated response).  I draw the line there because
the Subject field is defined as containing "only
human-readable content with information about
the message", and to be consistent with the
architectural principle of end-to-end (in this
case human to human) communications.  Once
software starts mindlessly prepending cruft, other
software begins mindlessly removing particular
strings under the assumption that such strings
are always cruft.   And then the humans attempting
to communicate lose.
 
I don't think this is cruft - I think it's useful information that 
reasonably belongs in the subject field.  what's broken is something 
else - perhaps our expectation that all messages in a thread should 
have the same subject.

What's broken is the assumption that the Subject
field is structured with "Re: " being some sort of
magic token which always indicates a response.
As noted above it is unnecessary, as a response
is indicated by the presence of an In-Reply-To
field (and UA authors may write software that
takes that into account when displaying a message
or message list by indicating that a message is
a response).  The same applies to mailing list
messages where insertion of cruft into the Subject
field is redundant with the presence of List- fields.
Were I a chemist searching for messages whose
topic is element 75, I should be very frustrated
indeed.  The prevalence of the particular "Re"
hack is so bad that there have been drafts
proposing that all instances of "Re" and similar
content be discarded (recursively) when searching
Subject fields or collating messages by subject.
There are implicit assumptions, given a
message whose header contains
  Subject: Re: the element with atomic number 75
that:
1. the message is a response to some other message
2. the "real" topic is "the element with
    atomic number 75", i.e. that "Re: " is mere cruft
3. somewhere there is some related message
    with
      Subject: the element with atomic number 75
In reality, it may be that none of those
assumptions is correct.
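
To illustrate how those assumptions fail, here is a rough
sketch of the kind of recursive stripping described above.
The pattern and the function name naive_base_subject are
hypothetical, not taken from any particular draft or mail
client:

```python
import re

# Hypothetical pattern for the "discard every leading Re:" approach.
_RE_PREFIX = re.compile(r"^(?:\s*re\s*:\s*)+", re.IGNORECASE)


def naive_base_subject(subject: str) -> str:
    """Recursively discard leading "Re:" tokens, treating them all as cruft."""
    return _RE_PREFIX.sub("", subject).strip()


# Any number of leading "Re:" tokens collapse to the same collation key,
# whether or not the message is actually a response to anything.
print(naive_base_subject("Re: Re: the element with atomic number 75"))

# A human-written subject about rhenium loses its topic entirely.
print(naive_base_subject("Re: properties and uses"))
```

The first call yields "the element with atomic number 75" and
the second yields "properties and uses".  In both cases the
software has assumed that "Re: " carries no topic information,
which is exactly what a chemist writing about element 75 would
dispute.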

