Re: New 2822upd-04 - obs-NO-WS-CTL

2008-01-16 12:36:17

On 1/16/08 at 12:37 PM +0000, Charles Lindsey wrote:

>In <p06250100c3b2cdde7d7b(_at_)[74(_dot_)134(_dot_)5(_dot_)163]> Pete Resnick
><presnick(_at_)qualcomm(_dot_)com> writes:
>A couple of typos:
>In 2.2 s/ unfolding"/ "unfolding"/
>In 3.2.2, %d42-91 appears to have got omitted from <ctext>.

Good catch! Don't know how I missed those.

> This is a good time to ask how long it is intended that the
> obs-syntax should remain in the standard. One hopes that it will no
> longer be a REQUIREMENT to recognize it in 1000 years time, but
> maybe some earlier deadline might be in order.

> How burdensome is it to have to recognize this ancient stuff? I
> suspect that NUL and naked CR and LF are the bits that cause the
> most problem - the rest is probably easier.

I cannot claim to have a definitive answer here, but our experience has been
the exact opposite. Way back when issues with this stuff first cropped up,
cleaning up our handling of CR and LF was trivial. NULs were only slightly

Compare this with support for, say, stuff like:


Dealing with this stuff is nasty - it makes parsers much more complex, and even
more important, the "right thing" to do when stuff like this is received can be
tricky. A bunch of iterations were required before we were content we had most
of the cases right. (And don't get me started on the heuristics needed to deal
with the stuff that's not even legal according to obs-syntax but commonly used
nevertheless.) This is in sharp contrast to the situation with illegal
characters, where to be blunt the major source of pain has come from shifting
winds in the standards community. (I note in passing that I am not a fan of the
present state of affairs with regard to bare CR and LF in 2821bis, but that's
not especially relevant here.)

> So one might like to
> remove the NUL and naked CR and LF sooner rather than later. For
> sure, there should no longer be a need to recognize ANY obs-stuff
> coming off the wire, even today. We are concerned only with ancient
> emails that are still in people's archives.

I'm sorry, but this goes much too far. There's a fair amount of stuff in the
obsolete syntax that I have never seen used other than in examples (e.g.,
multiple quoted strings in local parts). But there's also a fair amount of
obsolete stuff that does occur commonly in practice (e.g., legitimate messages
containing bare CRs and LFs). And there's also a significant middle ground of
stuff that doesn't come up very often (e.g., comments stuck in odd places) but
when it does, it always seems to have come from the client the company CEO
insists on using, so not supporting it is not an option.
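
For illustration only, here is a minimal Python sketch of what tolerating
bare CR, bare LF, and NUL might look like: it scans a raw message, counts
each kind of octet without discarding anything, and offers one possible
normalization. The names scan_controls and to_crlf are hypothetical, and
treating every bare CR or LF as a line separator is an assumed policy, not
something the draft mandates; a bare CR or LF may equally be meant as a
plain control character.

```python
import re
from collections import Counter

# CRLF is listed first so that its CR and LF are never counted as bare.
_TOKEN = re.compile(rb"\r\n|\r|\n|\x00")
_NAMES = {b"\r\n": "crlf", b"\r": "bare_cr", b"\n": "bare_lf", b"\x00": "nul"}


def scan_controls(data: bytes) -> Counter:
    """Count CRLF pairs, bare CRs, bare LFs and NULs in a raw message."""
    counts = Counter()
    for match in _TOKEN.finditer(data):
        counts[_NAMES[match.group()]] += 1
    return counts


def to_crlf(data: bytes) -> bytes:
    """Assumed policy: rewrite every bare CR or bare LF as CRLF.

    NULs and all other octets are passed through unchanged, so no data
    is lost; deciding what to do with them is left to the caller.
    """
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)
```

A caller could run scan_controls first and apply to_crlf only when bare
line terminators are actually present, leaving conforming messages
untouched.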

A few things on this point (and I believe we've had this discussion
multiple times before):

1. What would an implementation do, in the face of a bare CR, LF, or
NUL, if the specification did not have this stuff in the obs- syntax?
As it currently says:

       Note: This section identifies syntactic forms that any
       implementation MUST reasonably interpret.  However, there are
       certainly Internet messages which do not conform to even the
       additional syntax given in this section.  The fact that a
       particular form does not appear in any section of this document is
       not justification for computer programs to crash or for malformed
       data to be irretrievably lost by any implementation.  It is up to
       the implementation to deal with messages robustly.

The reason that bare CR, LF, and NUL are in there is because these
things have cropped up in messages over the years and implementations
may need to deal with them. It's fair warning to have these
constructs in the spec.


2. I find the whole concept of deadlines in protocol specifications
silly. If we say, "you MUST accept bare LF until January 1, 2010 at
12:00:00 UTC, at which point you MAY", what have we accomplished? If,
by any particular date, we find that it is unreasonable to ask
implementations to do any particular thing, we change the spec at
that time.

I strongly object to protocol specifications trying to impose
deadlines on implementations. Write an Informational RFC with
prognostications if so desired.

You know, I don't think I'd ever considered the notion of embedding a flag day
requirement in a specification before. Now that it has come up I have to agree
with Pete that it seems extraordinarily silly.

>A. I see that <quoted-pair> is still present in <dcontent>...
>B. As regards the use of SP after the ':' in header fields...

I did not see support on the list for either of your positions on
these. In fact, the only response to your Sept. 20 message on B was
an objection by Frank to doing what you said. Cite some support for
either of these positions if you would.

In my 30-Jul-2007 message I was mildly in favor of A but strongly opposed to B.
Reiterating what I said at the time, I favor A on the grounds that it cleans
things up and seems "mostly harmless", and I oppose B because there are a huge
number of marginal SMTP clients out there in stuff other than full-fledged
MUAs, including a lot of embedded gizmos that cannot be upgraded easily if at
all, and this is precisely the sort of shortcut you find all the time in that
sort of code.

However, with regard to A and irrespective of my own position, I concur that
there was insufficient support shown on the list for making the change.