RE: charsets and glyphs

I'll make a stronger statement.  I'd prefer that the MIME2 not make *any*
recommendations on whether a given charset is appropriate for a particular
use, outside of recommending US-ASCII for when compatibility is desired with
non-MIME mailers.


With some reservations, I agree. However, we have to resolve what a charset is
so that people can figure out what to register and how. By my reading of this
list, we have not settled this issue.

The reasons are:  (1) this WG doesn't have enough expertise to make such
recommendations, and (2) the "best character set to use" will change from
time to time as people replace their old hardware and software with new
stuff that implements 8859/* or 10646 or whatever.


The issue here isn't one of expertise or choosing the best one; it is instead
one of open-endedness. I think it is quite clear that an open-ended
registration process is needed for character sets, if for no other reason than
to let people register new things for experimentation. What actually gets used
will be determined on the network-at-large; it is just plain silly of us to
think that we'll be able to control this no matter what we standardize. At
best, we may be able to specify and therefore control what canonical forms are
used for a given character set.

I think that there is some desire to block the potential proliferation of a
very large number of character sets. This desire, while laudable, has led us
directly into the quagmire that we're in now. It now appears that no matter
what we do there is going to be some degree of proliferation. We have to learn
to live with this.

The fear that there's going to be a huge proliferation of incompatible
character sets in actual use is unfounded, I think. Inertia alone argues
against this -- it takes a lot of effort to write a specification and push it
through the process, to say nothing of the difficulty inherent in getting
people to actually use the thing. The Internet is great at dropping things that
turn out to be excess baggage. This process is a good one and we should let it
work for us.

Dave's "let's move on now" sentiment is also laudable, but I think it ignores
reality. As long as we choose to ignore all aspects of character sets we're
just going to be forced to wallow in all this stuff again and again until we
set up some minimal guidelines.

We have to reach closure on at least some aspects of this debate. I therefore
propose that we do the following:

(1) Come up with a definition for the things we call character sets in MIME
    that we register with IANA. This is critical and it has to be something
    we can all live with.

(2) Add wording to the effect that beyond US-ASCII the MIME specifications
    make no recommendation for what character sets are appropriate for any
    given purpose.

(3) Stop debating the merit or lack thereof of 10646/Unicode on this list.
    I think we're all well acquainted with the various sides of this issue and
    the discussion has devolved into picking tiny semantic nits out of the
    various arguments.

(4) Send the people who are working on the representation of 10646/Unicode in
    MIME off to a separate corner where hopefully they'll be able to come up
    with one or more draft RFCs on how it should be done. I really think it
    is time for some concrete proposals with all the details filled in. (Yes,
    I know that various people have posted various pre-draft things describing
    this stuff to the list. Frankly, I'm having a lot of trouble keeping them
    all straight, and I'm sure others are as well. A set of concrete proposals
    would help a lot.)

In the best of Internet traditions, let's defer all policy decisions and
statements until later, when there's some operational experience (to say 
nothing of actual documents) to build on. If you want to see something
considered, write it up without all the hyperbole and we'll see if it flies.

                                        Ned