Instead, it should allow the document(s) that define a
charset to use whatever definition of the term "character" they want.
That's nothing.
In the world of networking, striving for interoperability is of utmost
importance.
That's why we need enough profiling information to be able to produce
unique glyphs.
Interoperability is of utmost importance, so we should be able to
interoperate with "charset" information alone.
BTW, in your example, the two glyphs of 'a' are identified as equal by
native English users, so the mapping is, in some sense, unique.
If a sender uses "e" followed by acute, and the
receiver's software cannot cope with that representation, then we have
an *interoperability* problem.
But that would not be MIME's fault. That would be the fault of the
document that defines the Unicode/10646-based charset.
What? "e with acute" has noting specific to do with Unicode. 8859 has one.
And, a receiver with mere ASCII termcal can not display "e with acute",
of course.
That's not the fault of MIME, Unicode/10646 nor 8859.
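
As an illustrative aside (a minimal sketch in Python, not part of the
original argument), here are the two representations under discussion:
ISO 8859-1 encodes a precomposed "e with acute" as a single octet, while
Unicode also allows "e" followed by a combining acute accent, which naive
software may fail to treat as equivalent.

    import unicodedata

    latin1_octet = b"\xe9"          # ISO 8859-1: one octet for e-acute
    precomposed  = "\u00e9"         # Unicode: U+00E9, precomposed form
    combining    = "e\u0301"        # Unicode: "e" + combining acute accent

    # The 8859-1 octet decodes to the precomposed Unicode character.
    assert latin1_octet.decode("iso-8859-1") == precomposed

    # The two Unicode spellings are not equal as code sequences...
    assert precomposed != combining

    # ...but become equal after canonical normalization (NFC).
    assert unicodedata.normalize("NFC", combining) == precomposed

Software that skips the normalization step is exactly the kind of receiver
that "cannot cope with that representation".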
But, if the definition of "charset" were so ambiguous that, with some
charset, the sender could think some code represents "exclamation mark"
while the receiver could think it represents "or sign", it would be
MIME's fault for allowing such a damned definition.
The same goes for Indic or Han unification.
That's why Unicode/10646 needs profiling.
So, I suggest the following prose:
The terms "character set" and "charset", where used in this document,
refer to an algorithm for converting between octet streams and
character sequences. Definitions for the term "character" are found
in the documents that define each charset.
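
To make the proposed wording concrete (a hypothetical sketch, not part of
the suggested text itself), a charset in this sense is just a pair of
conversion algorithms between octet streams and character sequences; the
names below (Charset, ASCII_CHARSET) are illustrative only.

    from typing import Callable, NamedTuple

    class Charset(NamedTuple):
        decode: Callable[[bytes], str]   # octet stream -> character sequence
        encode: Callable[[str], bytes]   # character sequence -> octet stream

    # Example instance built on Python's own US-ASCII codec.
    ASCII_CHARSET = Charset(
        decode=lambda octets: octets.decode("us-ascii"),
        encode=lambda chars: chars.encode("us-ascii"),
    )

    assert ASCII_CHARSET.decode(b"MIME") == "MIME"

What a "character" is on the string side of this conversion is left to the
document defining each particular charset, as the prose above says.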
From the viewpoint of multimedia processing, how good is a character
if you can't display it?
Characters are the visual media for display, aren't they?
Masataka Ohta