I'll make a stronger statement. I'd prefer that the MIME2 not make *any*
recommendations on whether a given charset is appropriate for a particular
use, outside of recommending US-ASCII for when compatibility is desired with
With some reservations, I agree. However, we have to resolve what a charset is
so that people can figure out what to register and how. By my reading of this
list, we have not settled this issue.
The reasons are: (1) this WG doesn't have enough expertise to make such
recommendations, and (2) the "best character set to use" will change from
time to time as people replace their old hardware and software with new
stuff that implements 8859/* or 10646 or whatever.
The issue here isn't one of expertise or choosing the best one; it is instead
one of open-endedness. I think it is quite clear that an open-ended
registration process is needed for character sets, if for no other reason than
to let people register new things for experimentation. What actually gets used
will be determined by the network at large; it is just plain silly of us to
think that we'll be able to control this no matter what we standardize. At
best, we may be able to specify and therefore control what canonical forms are
used for a given character set.
I think that there is some desire to block the potential proliferation of a
very large number of character sets. This desire, while laudable, has led us
directly into the quagmire that we're in now. It now appears that no matter
what we do there is going to be some degree of proliferation. We have to learn
to live with this.
The fear that there's going to be a huge proliferation of incompatible
character sets in actual use is unfounded, I think. Inertia alone argues
against this -- it takes a lot of effort to write a specification and push it
through the process, to say nothing of the difficulty inherent in getting
people to actually use the thing. The Internet is great at dropping things that
turn out to be excess baggage. This process is a good one and we should let it
work for us.
Dave's "let's move on now" sentiment is also laudable, but I think it ignores
reality. As long as we choose to ignore all aspects of character sets we're
just going to be forced to wallow in all this stuff again and again until we
set up some minimal guidelines.
We have to reach closure on at least some aspects of this debate. I therefore
propose that we do the following:
(1) Come up with a definition for the things we call character sets in MIME
that we register with IANA. This is critical and it has to be something
we can all live with.
(2) Add wording to the effect that beyond US-ASCII the MIME specifications
make no recommendation for what character sets are appropriate for any
(3) Stop debating the merit or lack thereof of 10646/Unicode on this list.
I think we're all well acquainted with the various sides of this issue and
the discussion has devolved into picking tiny semantic nits out of the
(4) Send the people who are working on the representation of 10646/Unicode in
MIME off to a separate corner where hopefully they'll be able to come up
with one or more draft RFCs on how it should be done. I really think it
is time for some concrete proposals with all the details filled in. (Yes,
I know that various people have posted various pre-draft things describing
this stuff to the list. Frankly, I'm having a lot of trouble keeping them
all straight, and I'm sure others are as well. A set of concrete proposals
would help a lot.)
In the best of Internet traditions, let's defer all policy decisions and
statements until later, when there's some operational experience (to say
nothing of actual documents) to build on. If you want to see something
considered, write it up without all the hyperbole and we'll see if it flies.