Re: more content-charset stuff

Mark Crispin writes:

No matter how much people flame about this, it seems clear that there
is no semantic difference between the two proposals; one proposal
uses the existing header name space; the other creates a new namespace
under content-type.

Although this may appear to be syntactic sugaring, there is a major difference
in how the two are represented internally in an application. In the first
case, `charset' is an attribute name, one of an open-ended set of name/value
pairs that comprises the parameter top-level value.


There is? I would use the same code and internal representation for both. They 
are just lists that hang off different places in a data structure.
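
As a rough illustration of the point about sharing code and representation,
here is a minimal sketch in Python. The function names (parse_headers,
parse_parameters, charset_of) are hypothetical and come from no actual
implementation, and the parsing is deliberately naive: it assumes one header
per line and no quoted strings or comments. Both forms end up as lists of
(name, value) pairs, and the lookup differs only in which list it walks.

```python
def parse_headers(text):
    """Split 'Name: value' lines into a list of (name, value) pairs."""
    pairs = []
    for line in text.strip().splitlines():
        name, _, value = line.partition(":")
        pairs.append((name.strip().lower(), value.strip()))
    return pairs


def parse_parameters(value):
    """Split 'type; name=value; ...' into the type and a list of pairs."""
    parts = [p.strip() for p in value.split(";")]
    params = []
    for part in parts[1:]:
        name, sep, val = part.partition("=")
        if sep:  # skip anything that is not in name=value form
            params.append((name.strip().lower(), val.strip()))
    return parts[0], params


def charset_of(text):
    """Find the charset, whether it is a header or a parameter."""
    headers = parse_headers(text)
    # Proposal one: a separate Content-charset header.
    for name, value in headers:
        if name == "content-charset":
            return value
    # Proposal two: a charset parameter hanging off Content-type.
    for name, value in headers:
        if name == "content-type":
            _, params = parse_parameters(value)
            for pname, pvalue in params:
                if pname == "charset":
                    return pvalue
    return None
```

Given either "Content-type: binary; charset=doofus" or the two-header
equivalent, charset_of hands back the same string.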

In the latter case, `Content-charset' is a top-level value, and as such exists
for all types including those which are meaningless.  More importantly, it
creates a new entity which must be parsed at the lowest level.  As a
parameter, no extra parsing is needed, since you need parameter parsing code
for other stuff.


I disagree with this completely and totally. First of all, adding headers does
not change the parsing requirements in any way. Everyone who implements this
stuff already has a header parser (unless they are doing things in a really
bizarre way).  As such, adding headers causes no pain at all in terms of
parser additions.

Adding name=value parameters to the content-type headers does mean additional
parsing. Contrary to what you say, you do _not_ need name=value parameter
parsing at present to build a minimally compliant UA or MTA. In fact, I
presently have a UA and MTA in hand that are considerably more than minimally
compliant. I have yet to use the results of parsing this form of parameter. And
my MTA and UA handle your torture test message just fine, thanks -- the very
few parts that have name=value parameters contain data I cannot possibly make
use of anyway.

Finally, the notion that a charset name=value parameter somehow limits silly
states (your favorite whipping boy) is nonsense. If we have a general
name=value syntax that can be used anywhere, what prevents me from saying

   Content-type: binary; charset=doofus
   Content-type: multipart/digest; 498571490857149574190; charset=doofus
   Content-type: image/g3fax; charset=doofus

and how is this any different from

   Content-type: binary
   Content-charset: doofus

   Content-type: multipart/digest; 498571490857149574190
   Content-charset: doofus

   Content-type: image/g3fax
   Content-charset: doofus

In fact, I would argue that by adding yet another syntax for the specification
of arbitrary and perhaps unrelated parameters, you have effectively multiplied
the number of ways of having silly states rather than limiting them. At least
with headers we only had one such mechanism. We're now headed to a place where
we are going to have two. Thus, by your own desire to limit silly states, which
I share, I claim that your proposal in this area has precisely the opposite
effect from what you claim it will.
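
To make this concrete: if silly states are to be caught at all, it takes a
semantic check that sits on top of whichever syntax carried the charset. The
sketch below is hypothetical; the name is_silly and the rule that charset is
meaningful only for "text" types are assumptions made purely for illustration,
not anything taken from the draft.

```python
# Assumption for illustration only: charset is meaningful just for "text".
CHARSET_TYPES = {"text"}


def is_silly(content_type, charset):
    """Return True if a charset is attached to a type that cannot use it.

    The check takes the already-extracted type and charset, so it is the
    same whether the charset arrived as a header or as a parameter.
    """
    if charset is None:
        return False
    top_level = content_type.split("/", 1)[0].strip().lower()
    return top_level not in CHARSET_TYPES
```

All three example types above (binary, multipart/digest, image/g3fax) are
flagged once they carry charset=doofus, and neither syntax does anything to
prevent them from being written in the first place.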

Let me ask you something: Are you an RFC-XXXX implementor?  What code are you
writing, and what is the current state?  I am an RFC-XXXX implementor, and I
have finished an implementation.


I think Neil is an implementor, and I know I am one. And I disagree with you
and agree with Neil on this.

                                Ned