ietf-822

Re: charsets and glyphs

1993-02-19 09:29:19
> In the best of Internet traditions, let's defer all policy decisions and
> statements until later, when there's some operational experience (to say
> nothing of actual documents) to build on. If you want to see something
> considered, write it up without all the hyperbole and we'll see if it flies.

I agree.



But the problem is, you can say the same about the charset definition
and guidelines.  They haven't been written up yet, and we don't know
if they will fly once they *have* been written.

For that matter, we don't even know whether the other parts of MIME
are going to fly.  But we have put MIME on the standards track and we
are confident about it because it is designed to be flexible.  We only
set up a framework, with a few basic content-types, and we permit
others to define subtypes in external documents, at a later date.

I believe that we need to make the charset guidelines flexible too.
For example, we need charset versions for the same reasons that we
need text subtype versions.  I.e. for text subtypes, we might say

    Content-Type: text/richtext; version=2

Similarly, for charsets we may want to be able to say

    Content-Type: text/plain; charset=foo; charset-version=2

Or should we just have a new charset name, such as foo2?  But if we do
that, then the installed base of foo software won't recognize "foo2"
and would simply dump the text as if it were ASCII.  This would be a
step backwards.  Of course, the original document that defines "foo"
could specify that future versions will be named "foo2", "foo3" and so
on, and then register a handful of those with IANA at the same time so
that nobody else uses them.  But this doesn't strike me as being
particularly elegant.  Then again, elegance doesn't matter;
interoperability does.  Hmmm...  what do *you* think?
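
To make the trade-off concrete, here is a rough Python sketch of how an
installed-base reader might treat the two spellings.  It is only an
illustration under assumptions: "charset-version" is the parameter proposed
above and is not registered anywhere, and KNOWN_CHARSETS, inspect_charset
and the charset name "foo" are hypothetical.  The parameters are written
with a semicolon between them, as MIME parameter syntax requires.

```python
from email.message import Message

# Hypothetical installed base: software that knows "foo" but predates "foo2".
KNOWN_CHARSETS = {"us-ascii", "foo"}


def inspect_charset(content_type):
    """Return (charset, version, recognized) for a Content-Type value.

    "charset-version" is the hypothetical parameter discussed in this
    message; it defaults to "1" when absent.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset", "us-ascii").lower()
    version = msg.get_param("charset-version", "1")
    return charset, version, charset in KNOWN_CHARSETS


# Version carried as a separate parameter: the charset name is still known.
print(inspect_charset("text/plain; charset=foo; charset-version=2"))

# Version folded into the name: old software sees an unknown charset.
print(inspect_charset("text/plain; charset=foo2"))
```

With the separate parameter, old software can at least tell that the text
is some flavour of "foo" and decide for itself what to do about a version
it does not know.  With "foo2" it has nothing to go on.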

Re: "the interpretation of each byte cannot be questioned", we have
seen the ISO 646 example, where curly braces are curly braces when
they appear in C programs, but are European characters in other
situations.  This argument could even be stretched to the Han
unification case.  I.e. people like Masataka would argue that Unicode
Han characters can be displayed in several different ways, and that
therefore Unicode cannot be a MIME charset.  Clearly, we need to draw
the line somewhere, and I'm arguing that MIME should not be the one to
draw it.  That should be left to the document that defines the charset
(and then we'll see if it flies).
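
The ISO 646 point can be shown with a small Python sketch.  It assumes the
German national variant (DIN 66003) and lists only the positions where that
variant differs from US-ASCII; the names ISO646_DE and decode_iso646 are
made up for this example.

```python
# Positions where the German ISO 646 variant (DIN 66003) differs from US-ASCII.
ISO646_DE = {
    0x40: "§",
    0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü",
    0x7E: "ß",
}


def decode_iso646(data, variant):
    """Decode 7-bit bytes, replacing the national-use positions in `variant`."""
    return "".join(variant.get(b, chr(b)) for b in data)


c_source = b"if (x) { y[0] = 1; }"

# The same bytes, read as US-ASCII and as the German variant.
print(c_source.decode("ascii"))
print(decode_iso646(c_source, ISO646_DE))
```

The bytes are identical in both cases; only the reader's assumption about
the national variant changes what ends up on the screen.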


Erik

