Re: Comments on draft-resnick-2822upd-02.txt

I just got through reading Ned's response. Glad I don't have to polish my own responses, as his are almost entirely spot-on. To things that Ned didn't address (or to which I have additions):


On 8/15/07 at 9:44 PM +0000, Charles Lindsey wrote:

 >   messages.  This specification is a revision Request For Comments
                                               ^
                                               of


Got it.

That last sentence seems wrong/confusing, given that the document specifies everything in terms of US-ASCII, presumably with the intent that US-ASCII should be the normal means of interchange over the 'wire' between agents (in the absence of explicit agreement otherwise).

You can have a protocol that sends messages over the wire in UCS-2 or UCS-4 (or a file format that so stores them). So long as they use only code points 1-127, they are legal 2822 messages. (Over the wire in US-ASCII might imply septets instead of octets. We certainly don't want that.)

 >   There are two limits that this specification places on the number of
 >   characters in a line.  Each line of characters MUST be no more than
 >   998 characters, and SHOULD be no more than 78 characters, excluding
 >   the CRLF.
Can we de-emphasise that SHOULD, and make it clear that this is a matter of good practice (in the sense of BCP) rather than a normative feature?

It's not just good practice. Some agents screw up the display of long lines so as to make them unreadable to the user, and that's an interoperation problem. I believe some old ones actually choked on long lines (see below).

Where did that '78' come from? I am aware of lots of systems that do horrid things such as you mention if there are 80 characters in a line, but I am aware of none where problems arise with exactly 79. In other fora where I have seen this discussed, the consensus was that exceeding '79' was the signal for troubles to start.

My memory (and you may wish to search through the DRUMS archive; I'm not so motivated at the moment) was that there were some old systems that had fixed 80 character records which had room for 78 plus the CR plus the LF. 78 was considered the safest.
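
For anyone who wants to check output against the two limits quoted above, here is a minimal sketch in Python. The function name check_line_lengths and its return shape are invented for illustration and appear nowhere in the draft; the sketch assumes the message is already a string with CRLF line endings. Note that 78 characters plus CR plus LF fills an 80 character record exactly.

```python
def check_line_lengths(message: str):
    """Return two lists of 1-based line numbers:
    lines over the 998 MUST limit, and lines over the 78 SHOULD limit."""
    must_limit, should_limit = 998, 78
    too_long, over_recommended = [], []
    # Splitting on CRLF means the CRLF itself is excluded from the count.
    for number, line in enumerate(message.split("\r\n"), start=1):
        length = len(line)
        if length > must_limit:
            too_long.append(number)
        elif length > should_limit:
            over_recommended.append(number)
    return too_long, over_recommended
```

A line of exactly 78 characters passes both checks; a line of 79 lands only in the second list.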

 >   Each header field should be
 >   treated in its unfolded form for further syntactic and semantic
                                             ^^^^^^^^^

 >   evaluation.


'Semantic' yes, but why is that 'syntactic' there?

OK, I see what you're asking. You're saying that if you want to syntactically see whether something is an address, it may contain folding (syntactically), so there's no need to unfold to do "syntactic evaluation". I was thinking of, "You can't just randomly choose some line in a message and see if it's syntactically a legitimate field, because that line might be the result of a fold". (*Shrug*) I can't get excited about making a change.
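
To make the "result of a fold" point concrete, here is a rough sketch. It assumes unfolding means removing any CRLF that is immediately followed by WSP (space or tab), which is how I read section 2.2.3 of 2822. The helper name unfold is hypothetical.

```python
import re

def unfold(field: str) -> str:
    """Remove every CRLF that is immediately followed by a space or tab.
    The whitespace itself is kept; only the CRLF goes away."""
    return re.sub(r"\r\n(?=[ \t])", "", field)
```

Given "Subject: This is\r\n a test\r\n", the second physical line " a test" is not a field on its own; after unfold() the field reads "Subject: This is a test\r\n".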

 >3.2.2.  Quoted characters

We have already noted that no-fold-quote, and no-fold-literal can go.

No-fold-quote is gone in message-id (though still accepted in the obs- syntax). I am still not sure what to do about no-fold-literal.

But, as I have pointed out in a separate thread, you would remove a severe interoperability problem with Netnews if you removed it from <dcontent> as well (allowing just a "\" to appear as a normal character).

There are too many implementations that have a dcontent (and qcontent and ccontent) parser that will not deal with free "\" in any such construct. So the only thing we could do would be to abolish "\" completely in dcontent. And this is a path that I think would be terrible to start down. So, no, I don't think we can make this change.

 >   within the range -9959 through +9959.

why not "within the range -2359 through +2359"?

I invite you to write up the review of the DRUMS discussion on this, provide text, and tell us why we should change it.
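
For reference while that review gets written, here is a sketch of what the current range admits. It assumes the zone is a sign followed by four digits read as hours then minutes, with the last two digits limited to 00 through 59 (my reading of the zone syntax in 2822 section 3.3). The function name zone_to_minutes is made up for illustration.

```python
import re

def zone_to_minutes(zone: str) -> int:
    """Convert a zone such as "-0230" to a signed offset in minutes.
    Raises ValueError if the form is wrong or the minutes exceed 59."""
    match = re.fullmatch(r"([+-])([0-9]{2})([0-9]{2})", zone)
    if match is None:
        raise ValueError("zone must be a sign followed by four digits")
    sign, hours, minutes = match.group(1), int(match.group(2)), int(match.group(3))
    if minutes > 59:
        raise ValueError("last two digits must be within 00 through 59")
    total = hours * 60 + minutes
    return -total if sign == "-" else total
```

Under these assumptions "+9959" is syntactically acceptable even though no real offset comes anywhere near it, which is the gap the -2359 through +2359 suggestion is pointing at.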

 >   Because the list of mailboxes can be empty, using the group construct
 >   is also a simple way to communicate to recipients that the message

 >   was sent to one or more named sets of recipients, without actually
 >   providing the individual mailbox address for each of those
 >   recipients.


s/each of/any of/ or s/each of/some of/


Done.

 >   A liberal syntax
 >   for the domain portion of addr-spec is given here; it is left to
 >   other specifications (e.g., [RFC1034], [RFC1035], [RFC1123],
 >   [I-D.klensin-rfc2821bis]) to give more precise limitations on the
 >   syntax.
Can we strengthen that by saying that the 'liberal syntax' MUST be further restricted to conform to some published specification such as the ones you have listed (without precluding further such specifications in the future, of course)?

Like Ned, I'm opposed to the MUST, but would this suffice (and get us out of having to change the syntax for dcontent for message-id if we do a similar thing there)?

"Note: A liberal syntax for the domain portion is given here. However, the domain portion of addr-spec contains addressing information used in other protocols (e.g., [RFC1034], [RFC1035], [RFC1123], [I-D.klensin-rfc2821bis]). It is therefore incumbent upon implementations to conform to the syntax of addresses for the context in which they are used."

It's relatively strong language, but stops short of a compliance statement that, as Ned said, could only be satisfied by consulting an incomplete and open-ended series of other specifications.

There may be other transport mechanisms than I-D.klensin-rfc2821bis. So it would be better to say "is covered in separate documents such as [I-D.klensin-rfc2821bis]".


No problem.

Why is Keywords unlimited (in Netnews it is 1)?

I don't know, and I don't know for Comments either. Anyone? Is it worth changing?

...some people 'munge' their From: addresses in order to appear anonymous, or to confuse address harvesters. Whether that is a desirable practice or not is none of our business, but a normative interpretation of those words would seem to rule it out.
[...]
 >   In all cases, the "From:" field SHOULD NOT contain any mailbox that
 >   does not belong to the author(s) of the message.  See also section
 >   3.6.3 for more information on forming the destination addresses for a
 >   reply.

Wanting to appear anonymous or confuse address harvesters seems squarely in the category of "there may exist valid reasons in particular circumstances when the particular behavior is acceptable or even useful, but the full implications should be understood and the case carefully weighed before implementing any behavior described with this label." [RFC 2119] So, the normative language seems good as well as the potential violation.

 >   The destination fields specify the recipients of the message.  Each

 >   destination field may have one or more addresses, and each of the
 >   addresses indicate the intended recipients of the message.  The only

              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
              indicates an intended recipient


Got it.

 >   "References:" field may be used to identify a "thread" of
                        ^^^^^^
                       is often


Why?

It would be useful to mention that when the References field gets too long it MAY be pruned (the minimum requirement being to retain the first and the last two entries - including the one just being added). I have known of cases where References fields grew to such a length (and MUAs in the followup chain had failed to introduce folding, or even removed folding already present) that the 998 limit was breached with disastrous consequences.

I am loath to put in pruning instructions at this point, and without such instructions, I don't see what else to say.
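
Purely to illustrate what the suggested minimum would amount to (this is not proposed text for the document), here is a sketch of "retain the first and the last two entries". It assumes the msg-ids are already parsed into a list with the newly added one last. The name prune_references and the keep_tail parameter are hypothetical.

```python
def prune_references(msg_ids, keep_tail=2):
    """Keep the first msg-id and the last keep_tail msg-ids.
    Lists short enough to need no pruning are returned unchanged."""
    if len(msg_ids) <= 1 + keep_tail:
        return list(msg_ids)
    return [msg_ids[0]] + list(msg_ids[-keep_tail:])
```

A five-entry list comes back as its first, fourth and fifth entries; anything with three or fewer entries is untouched.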

It would be useful to say here that two msg-ids can always be compared for equality by a simple octet-by-octet comparison (but, of course, one would first have to ensure that property was true).

I also don't want to put threading instructions into this document, which is the path the above starts down.

so if, instead of
    the string "Re: " (from the Latin "res", in the matter of)
you write
    the string "Re: " (an abbreviation of the Latin "in re", meaning "in
    the matter of")
all will be correct.

OK.

 >   The "Received:" field contains a
 >   (possibly empty) list of tokens followed by a semicolon and a date-
   time specification.  Each token must be a word, angle-addr, addr-
 >   spec, or a domain.
Can we find a better word instead of "token" here? "Token" usually means some sort of keyword (e.g. as used in the MIME standards).


I kinda like "token". "Lexeme" seems too syntactic. "Item" seems too generic.

 >3.6.8.  Optional fields

 >   Fields may appear in messages that are otherwise unspecified in this
 >   document.  They MUST conform to the syntax of an optional-field.

 >   This is a field name, made up of the printable US-ASCII characters

 >   except SP and colon, followed by a colon, followed by any text which
 >   conforms to unstructured.

This is misleading, because it has to cover all new header fields introduced by extensions and these will be, in general, structured.


That's not what that says. It says that it will conform to unstructured syntax.
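
A small sketch of the distinction, assuming the field name is one or more printable US-ASCII characters (33 through 126) other than colon, and treating the body check loosely as "anything on the unfolded line", which is simpler than the real unstructured rule. The names split_optional_field and X-Example are invented for illustration.

```python
import re

# Field name: printable US-ASCII (0x21-0x7E) except colon (0x3A).
# SP (0x20) is already outside that range.
_OPTIONAL_FIELD = re.compile(r"([\x21-\x39\x3B-\x7E]+):(.*)")

def split_optional_field(unfolded_line: str):
    """Return (name, body) if the line has the shape of an optional-field,
    otherwise None. The body is not interpreted in any way."""
    match = _OPTIONAL_FIELD.fullmatch(unfolded_line)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

A body like " a; b=c" may well carry structure defined by some extension, yet it still passes as unstructured text at this level, so the two statements are not in conflict.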

 >4.  Obsolete Syntax


 >   Earlier versions of this specification allowed for different (usually
 >   more liberal) syntax than is allowed in this version.  Also, there
 >   have been syntactic elements used in messages on the Internet whose
 >   interpretation have never been documented.  Though some of these

    ^^^^^^^^^^^^^^                                     ^^^^
    interpretations                             Eh? I thought none of them
                                                was to be generated.


OK. I'll fix those.

 >      Note: The "period" (or "full stop") character (".") in obs-phrase
But this is not an "obsolete" construct. We discussed this around 12 months ago, and the consensus then was that it ought to be renamed as an <extended-phrase>, and moved out of the Obsolete Syntax.

There was no such consensus; you were the only one who ever suggested it on this list. And I still see no reason to change it (as I stated back then).

The syntax given for these obs-constructs includes also the syntax for their regular counterparts, which makes it very hard work to discover exactly where the difference lies because of the huge redundancy that is introduced. For example, if you had written
   obs-qp        =       "\" %d0
nothing would have changed, but it would be immediately obvious what the difference was.

I will try to fix some of these. Certainly obs-qp is easy. But only the obvious ones.

 >4.3.  Obsolete Date and Time

This lot was particularly difficult to spot the differences.

I'm not sure I understand why. I'd prefer to leave it as is. There have been enough bugs in this section already that occurred by trying to over-simplify the syntax.

 >   addition, local-part is allowed to contain quoted-string in addition
                         ^^                                  ^^^^^^^^^^^
                        was


"Is" allowed in this syntax.

 >   to just atom.  Finally, ....
    ^^^^^^^^^^^^
    in lieu of any of those period-separated atoms


That's incorrect. You can mix atoms and quoted-strings.

 >6.  IANA Considerations


 >   This document has no actions for IANA.


Oh yes it does!


Oy. Let me see what I can do about that.

 >   Messages are delimited in this section between lines of "----".  The
   "----" lines are not part of the message itself.
That is indeed an excellent notation. The Bad News is that you have nowhere used it :-( .


I'll try to figure out how to do something useful in xml2rfc.

 >   characters (the double-quote characters appearing as quoted-pair

 >   construct).  ...

    ^^^^^^^^^
    constructs


Got it.

 >   In this message, the "To:" field has a single group recipient named A
                                                                       ^
                                                                       "

 >   Group which contains 3 addresses, and a "Cc:" field with an empty

         ^
         "


Yup.

Wouldn't it be better to show a Bcc: header for the "Undisclosed recipients" example?


I don't understand what you mean.

 >   The above example is aesthetically displeasing, but perfectly legal.
Though legal, you should point out that it contains things that are deprecated by 3.3 and by 3.4.1.

Nope. A.5 does not (or shouldn't unless I missed something) contain anything not perfectly permissible in 3.3 and 3.4.1.

RFC 1036 is not actually referenced anywhere in the document.


Removed.

pr
--
Pete Resnick <http://www.qualcomm.com/~presnick/>
QUALCOMM Incorporated - Direct phone: (858)651-4478, Fax: (858)651-1102