ietf-822

restrictions when defining charsets

1993-01-21 08:25:11
As far as I remember the discussions at Santa Fe, the group wanted to
have its own definition of character set.
It was never spelled out in the minutes or anything like that, but
went something like:

A set of rules for the interpretation of an octet stream, such that:
- The interpretation of each byte cannot be questioned
- The number of representable characters is limited
- No further parameters need to be parsed to get the complete
  identity of the character set


"Never spelled out in the minutes or anything like that"?  Doesn't
seem like a good state of affairs to me.  If the above is indeed the
intention, it should be spelled out in MIME itself, perhaps in the
part that shows you how to register a new charset.

I should note that ALL THREE of the above rules came as a surprise to
me today (I didn't attend Santa Fe).  The last two seem reasonable,
and I'm willing to agree to them, but the first one is problematic.
One charset where it is problematic is iso-2022-jp: you can't tell
whether a particular byte is the 1st or 2nd byte of a Japanese
character unless you backtrack to the beginning of the byte stream,
or keep track from the beginning in the first place.
(Unless I've misunderstood the 1st rule.)
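
To make the iso-2022-jp point concrete, here is a minimal Python
sketch of what "keeping track from the beginning" involves. The names
SHIFTS and classify are made up for this illustration, and the scanner
only knows the four escape sequences of basic iso-2022-jp; it does no
validation and is not meant as a real decoder.

```python
# Escape sequences of basic ISO-2022-JP, mapped to bytes per character.
# SHIFTS and classify are hypothetical names used only for this sketch.
SHIFTS = {
    b"\x1b(B": 1,  # ASCII
    b"\x1b(J": 1,  # JIS X 0201 Roman
    b"\x1b$@": 2,  # JIS X 0208-1978
    b"\x1b$B": 2,  # JIS X 0208-1983
}


def classify(data: bytes) -> list[str]:
    """Label every byte, scanning from the start of the stream."""
    roles = []
    width = 1  # the stream is assumed to start in ASCII
    i = 0
    while i < len(data):
        seq = data[i:i + 3]
        if seq in SHIFTS:
            # An escape sequence changes how all following bytes are read.
            width = SHIFTS[seq]
            roles += ["escape"] * 3
            i += 3
        elif width == 2:
            roles.append("lead")
            if i + 1 < len(data):
                roles.append("trail")
            i += 2
        else:
            roles.append("single")
            i += 1
    return roles


# Two Japanese characters, then the same two byte values read as ASCII.
stream = b'\x1b$B$"$"\x1b(B$"'
roles = classify(stream)

# The byte 0x24 is a lead byte at offset 3 but a plain '$' at offset 10.
print(roles[3], roles[10])

# Starting in the middle of the stream loses the shift state entirely.
print(classify(stream[4:7]))
```

The same byte value gets a different label depending on which escape
sequence came before it, so a reader dropped into the middle of the
stream has nothing to go on.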

What, exactly, does the first rule mean?  Yes, let's open this can of
worms too.  Why are there so many cans of worms on ietf-822 these
days?  (Coz it's Draft Standard time?  I guess so.)


Thanks in advance for any reply,
Erik

