Re: non-member messages to lists (was Re: reply etiquette)


On Oct 02 2004, Bruce Lilly wrote:

Whether you realize it or not you're making quite a few assumptions
which you haven't mentioned:


I'm just playing with a toy model which I thought would be interesting
to discuss. Exploring the implications/assumptions is exactly what's
needed (unless perhaps I'm too uninformed, in which case I'd rather not
waste everyone's time).

1. That mailing list software is designed to keep track of every
   message sent to the list, and to examine every such message for
   possible relationships to other messages. Even were that always
   feasible (see next assumption), it is likely to be impractical
   especially in the situations where you apparently believe it
   would be most useful ("For example, consider a high volume
   mailing list"). As the number of messages increases, the number
   of comparisons increases quadratically, and the complexity of
   each comparison may also increase rapidly (see below).


If the mailing list doesn't keep an archive of itself, then there's
little hope of making it intelligent at all.

I think that keeping track of high volume mailing lists is quite practical
(but I'm not arguing it's trivial to code). The lkml, for example, is
a high volume list I think, and it receives 200-300 messages per day.

That's only about 100,000 messages per year, which is not impossible
to index. Assuming your point 2 below can be solved, then looking up a
unique computed ID of some sort for the parent of each incoming
message, parsing this parent once and taking all actions related to it
(such as sending a courtesy copy) can be done without putting a big
load on the system.
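
To make the "look up the parent once" idea concrete, here is a minimal
sketch in Python. It assumes every archived message already has some
unique ID and a known parent ID (which is exactly what point 2 below
questions). The names ArchivedMessage, Archive and
courtesy_copy_recipient are made up for illustration and don't come
from any real list software.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ArchivedMessage:
    msg_id: str                      # unique computed ID (assumed available)
    sender: str                      # address of the poster
    parent_id: Optional[str] = None  # ID of the message being replied to


class Archive:
    """Hypothetical in-memory index of the list archive, keyed by ID."""

    def __init__(self) -> None:
        self.by_id: Dict[str, ArchivedMessage] = {}

    def add(self, msg: ArchivedMessage) -> None:
        self.by_id[msg.msg_id] = msg

    def courtesy_copy_recipient(self, incoming: ArchivedMessage) -> Optional[str]:
        # One dictionary lookup per incoming message: no comparison
        # against every other message in the archive is needed.
        if incoming.parent_id is None:
            return None
        parent = self.by_id.get(incoming.parent_id)
        return parent.sender if parent is not None else None
```

With an index like this, the work per incoming message stays roughly
constant however large the archive grows, so the quadratic number of
comparisons from point 1 doesn't arise.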

2. That it is always possible to establish relationships between
   messages (Message-ID, In-Reply-To, and References fields are
   optional).


That's a good objection. I would extend this by noting that people
sometimes also change subject lines, so comparing subjects in
the usual way is also not entirely reliable.

I could answer that a reply message without any form of useful
identification tokens is relatively rare, but that's a cop-out,
and I don't have any numbers to back this up.

Perhaps a more constructive suggestion would be this: if we 
want threading to work properly on a mailing list, why not
enlist the help of the list server, e.g. in the following way:

Since in most MUAs, the original subject is quoted when a reply is
sent, the list server could embed a thread id into the subject line of
the messages it sends out. 

It's an ugly hack, but it does make use of what I think is a reliable
lowest common denominator assumption, that MUAs quote the original
subject when replying (with extra modifications such as Re: etc.). 

Also, it doesn't suffer from the problem of recognizing variations in
Re:, Re[2] etc. to identify replies (which is what is problematic now).
Instead, the list server simply scans the subject line of a received
message to see if it contains a quoted thread-id (example below).

Of course, it's ugly to not rely on the specially designed headers
such as References, but your point is that MUAs don't reliably set
them in the first place, so I'm working with that. I believe that
quoting/mangling original subjects is much more reliable, enough so
that it could be relied on for threading.

So for example, I might send a message to the list with
Subject: this is an example subject

The list server propagates this message to all list members as follows
Subject: this is an example subject (thread-id 7.0)

This would indicate e.g. thread number 7, root message.

When you reply to the list, your MUA sends a message with the subject
Subject: Re: this is an example subject (thread-id 7.0)

And the list server propagates this message to the list members as follows
Subject: Re: this is an example subject (thread-id 7.1)

Somebody else replies to the list address with
Subject: Antw: Re: this is an example subject (thread-id 7.1)

Which is propagated as, you guessed it, 
Subject: Antw: Re: this is an example subject (thread-id 7.2)
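
The rewriting the list server would do in this example can be sketched
in Python as follows. This is only a rough illustration under my own
assumptions: the number after the dot is taken to be a per-thread
counter, a subject without a recognizable tag starts a new thread, and
the names ThreadTagger, tag and first_thread are invented here.

```python
import re

# Matches a quoted tag such as " (thread-id 7.1)" in a subject line.
TAG = re.compile(r"\s*\(thread-id (\d+)\.(\d+)\)")


class ThreadTagger:
    """Hypothetical subject rewriter for the thread-id scheme."""

    def __init__(self, first_thread: int = 1) -> None:
        self.next_thread = first_thread
        self.next_seq = {}  # thread number -> next message number
        self.parent = {}    # (thread, seq) -> parent (thread, seq) or None

    def tag(self, subject: str) -> str:
        m = TAG.search(subject)
        if m and int(m.group(1)) in self.next_seq:
            # The subject quotes a known thread-id: this is a reply.
            thread = int(m.group(1))
            parent = (thread, int(m.group(2)))
        else:
            # No usable tag: treat the message as the root of a new thread.
            thread = self.next_thread
            self.next_thread += 1
            self.next_seq[thread] = 0
            parent = None
        seq = self.next_seq[thread]
        self.next_seq[thread] += 1
        self.parent[(thread, seq)] = parent
        # Drop any quoted tag and append the new one; prefixes such as
        # "Re:" or "Antw:" are left untouched.
        clean = TAG.sub("", subject).rstrip()
        return f"{clean} (thread-id {thread}.{seq})"
```

Starting the tagger with first_thread=7 and feeding it the three
incoming subjects above gives back the three propagated subjects. Note
that the server never has to understand Re:, Re[2] or Antw: at all.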

3. That it is possible to accurately determine whether a sender is
   a list member (avoiding forgeries where e.g. a spammer forges
   either a message header address field or the envelope sender
   address, while accurately recognizing messages sent by a list
   member from an alternate account, via a public wireless access
   point, via a web mail interface, etc.).


That's a security issue, which exists independently of the 
issue of accommodating list member preferences. What's the worst
that can happen now on mailing lists, and would it get significantly
worse with an "intelligent" list server? 

For the case of a courtesy copy, if every list member wanted one, the
effect would be insignificant, since only the parent poster should
receive a courtesy copy. In other words, on a normal mailing list, a
gate crashing spammer can reach all members, whereas if the list
implements courtesy copies say, the spammer reaches all members, and
the parent poster gets one extra copy.

4. That every person who submits a message to a mailing list
   wishes to receive one or more email responses (that is untrue
   for some people who monitor list activity via a web archive
   of the list, whether or not they are registered as being "on"
   the list).


You're right. If such a person is not a member, then his/her preferences
can't be known. If the person is "on" the list, I'm assuming that if
they don't want replies by email, they were able to select that
preference when they signed up for the list, and did so.


You also haven't indicated whether you think that a response to
a response to a message sent by the hypothetical X should also
be mailed to X.


Ok, it's a good question, but I think that's bordering on recursively
reimplementing the mailing list. My personal inclination is that a
response to a response is not a response to the original poster, so
needn't concern him or her. Of course, this issue only matters if the
original poster is not "on" the list, as he will see the full response
thread if he monitors the mailing list.

FWIW, if the list server does keep an indexed archive of messages
organized by threads, then accommodating yet another preference
(which I would loosely describe as receiving the full response thread) 
is not difficult:

For a newly received message, instead of finding only the parent
poster, the list server walks the chain of parent messages and extracts
all the parents. The total number of recipients is not greater than
the "height" of the thread, which is orders of magnitude smaller than 
the full mailing list volume. So in terms of complexity, even this
preference is scalable to high volume lists.
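
As a rough sketch of the walk described above (in Python, with the
archive reduced to two plain dictionaries that I'm assuming for the
sake of the example, parent_of and sender_of; the function name
ancestor_senders is likewise made up):

```python
from typing import Dict, List, Optional


def ancestor_senders(
    msg_id: str,
    parent_of: Dict[str, Optional[str]],
    sender_of: Dict[str, str],
) -> List[str]:
    """Collect the posters of all ancestors of msg_id, nearest first."""
    recipients: List[str] = []
    seen = set()  # guards against a corrupt index containing a cycle
    current = parent_of.get(msg_id)
    while current is not None and current not in seen:
        seen.add(current)
        if current not in sender_of:
            break  # parent is not in the archive, so the chain ends here
        sender = sender_of[current]
        if sender not in recipients:
            recipients.append(sender)  # one copy per person at most
        current = parent_of.get(current)
    return recipients
```

The loop runs once per ancestor, so the work and the number of extra
copies are both bounded by the depth of the message in its thread.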

-- 
Laird Breyer.