Re: content-charset & checksums

draft (it would be nice) but I want to hear some additional voices on
the list saying that this is OK.


Using name=value for parameters suits me fine. Some other matters:

1. I don't think the checksum header ideas floating around are well 
enough thought out to go in rfc-xxxx without a lot of danger of 
slowing things down. Since it can/should be optional can't it wait 
for another rfc? A simple checksum at the end of base64 (introduced
by yet another safe character) seems like a harmless idea if
everyone is happy with it.
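
A minimal sketch of the base64 trailer idea, for illustration only. The
"*" marker, the CRC-32 checksum and both function names are assumptions
made here; nothing in the proposals so far fixes any of them. The trailer
is optional: text without the marker decodes as plain base64.

```python
import base64
import zlib

# Hypothetical marker: any character outside the base64 alphabet would do.
MARK = "*"


def encode_with_checksum(data: bytes) -> str:
    """Base64-encode data and append MARK plus a CRC-32 in hex."""
    body = base64.b64encode(data).decode("ascii")
    return f"{body}{MARK}{zlib.crc32(data):08x}"


def decode_with_checksum(text: str) -> bytes:
    """Decode base64 text; verify the trailing checksum if one is present."""
    body, sep, check = text.partition(MARK)
    data = base64.b64decode(body)
    # The checksum is optional: only verify when the marker was found.
    if sep and int(check, 16) != zlib.crc32(data):
        raise ValueError("checksum mismatch")
    return data
```

A reader that knows nothing about the trailer can stop at the marker and
still decode the body, which is what makes the idea fairly harmless.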

2. If ISO-2022 defines an algorithm from a sequence of octets to a 
sequence of glyphs then it is a character set not noticeably different 
in that regard from many others. If it doesn't then it's nothing. I 
think the problem is that ISO-2022 as used in Japan only supports JIS 
and ASCII: why not call it ISO-2022J? As far as I can tell no one has 
any hopes for a wider application of ISO-2022, but supporting the 
Japanese subset seems necessary [it would be nasty to punish them when 
they have been good little 7-bitters unlike some].

Now a problem with ISO-2022 (which 10646/ATM/AUC seems determined
to share) is that the default meaning of octets before any escape
sequence is undefined. We should NOT use the Charset parameter for
this. We should NOT allow the concept of a character set which has
to have extra external information before it is meaningful. If we
have to deal with things like this then we have to register a 
different name for every combination of parameters that people want
to allow. For example we would register ISO-2022-J to mean "ISO-2022
with support for JIS and US-ASCII only, and starting in US-ASCII 
mode". I don't know whether it is possible to have parameterized
character sets in any manageable way. I do know that it is far
too late in the process to be thinking about this. Please let's drop
that possibility and accept one unparameterized name per character
set: you really can do everything you want in this form because the
parameters only have a small number of useful values in real life.
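
As a rough sketch of what "one unparameterized name per character set"
could look like in software: each registered name carries its complete,
fixed configuration, so no external information is needed to interpret
it. The table layout and the resolve_charset helper are hypothetical;
only the ISO-2022-J entry reflects the example given above.

```python
# Hypothetical registry: every name fully determines the allowed
# character sets and the mode in effect before any escape sequence.
REGISTRY = {
    "US-ASCII": {"sets": ("US-ASCII",), "initial": "US-ASCII"},
    "ISO-2022-J": {"sets": ("US-ASCII", "JIS"), "initial": "US-ASCII"},
}


def resolve_charset(value: str) -> dict:
    """Look up a charset by name alone; names are matched case-insensitively."""
    name = value.strip().upper()
    if name not in REGISTRY:
        # A new combination of options gets a new registered name,
        # never an extra parameter on an existing one.
        raise LookupError(f"unregistered character set: {value!r}")
    return REGISTRY[name]
```

A combination that people actually want to use is added as another row
under its own name, which keeps the lookup a plain table access.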

3. All the headers are Content-something. So why not Content-version
instead of Body-version?

The other question is why have this parameter at all. I gather the
reason is to ensure that we don't get mixed up with the previous
simpler use of Content-type. The original idea was to be compatible 
with that previous use. If that has been abandoned we need some
way to reliably distinguish the rfc-xxxx use. Of the following 3
options I like the version header the least:

  (i)   Change "Content-type" (e.g. to "Content-format")

  (ii)  If we always had a subtype there would be no confusion since
        the old usage always lacked a subtype. This eliminates the
        default subtype: not a great loss.

  (iii) Yet another header [Body-version or better Content-version].
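
Option (ii) can be sketched in a few lines. This assumes the rfc-xxxx
form is "type/subtype" optionally followed by ";"-separated parameters,
and that the older usage is a bare type with no "/"; the function name
is hypothetical.

```python
def is_new_style(content_type: str) -> bool:
    """Return True if the Content-type value carries a subtype."""
    # Drop any parameters after the first ";".
    media = content_type.split(";", 1)[0].strip()
    main, sep, sub = media.partition("/")
    # New-style values need a "/" with something on both sides of it.
    return bool(sep and main.strip() and sub.strip())
```

With a mandatory subtype, is_new_style("text/plain; charset=us-ascii")
is true and is_new_style("text") is false, so no separate version header
is needed to tell the two usages apart.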

4. On the header question. Keith's proposal is the front runner
at the moment. We haven't heard any strong (let alone show-stopping)
objections, and that was not the case for any of the other proposals.
So let's either hear the objections or get it out with rfc-xxxx.
I would like to see better alignment of the encodings but I guess we
can live without that. I'd like to see a more complete draft.

5. How about some optimistic soul arranging for a couple of consecutive
RFC numbers to be allocated for these RFCs, so that we can start 
using the real RFC numbers instead of xxxx? Or does this have to
wait for Santa Fe?

Bob Smart