ietf-822

Re: charsets and glyphs

1993-02-17 11:33:52
"glyph" doesn't seem like the right term...but I'm not sure what is.  Maybe:
"...an algorithm for converting an octet stream into characters" is
sufficient.

Another approach: 

* define a "character set" as a set of (description, code) pairs, where the
  "description" says "how to display a character" (without being too
  specific) and the "code" says "how to represent this character in an
  octet-stream".  

* The code has to be unique -- you can't have more than one description 
  for a given code (though a description can describe multiple possible 
  ways to display a character like "broken vertical bar"...it's a fine line.)

* The code also has to be recognizable with no look-ahead.
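
A rough sketch of the three points above, in Python. Everything here is
made up for illustration: the toy table, the octet values, and the
function names are hypothetical and not taken from any real charset or
RFC. "No look-ahead" is read here as "no code is a prefix of another
code", which is one plausible interpretation, not the only one.

```python
from typing import Dict, List

# Hypothetical toy charset: code (octet sequence) -> description.
# Keying the table by code makes "one description per code" structural.
TOY_CHARSET: Dict[bytes, str] = {
    b"\x41": "LATIN CAPITAL LETTER A",
    b"\x7c": "VERTICAL BAR (may be drawn solid or broken)",
    b"\x8e\x41": "HYPOTHETICAL TWO-OCTET CHARACTER",
}


def is_prefix_free(charset: Dict[bytes, str]) -> bool:
    """True if no code is a proper prefix of another code."""
    codes = list(charset)
    for a in codes:
        for b in codes:
            if a != b and b.startswith(a):
                return False
    return True


def decode(octets: bytes, charset: Dict[bytes, str]) -> List[str]:
    """Turn an octet stream into descriptions, one octet at a time.

    A character is emitted as soon as the octets read so far match a
    code; the decoder never inspects octets beyond the current one.
    """
    if not is_prefix_free(charset):
        raise ValueError("codes are not recognizable without look-ahead")
    max_len = max(len(code) for code in charset)
    result: List[str] = []
    pending = b""
    for i in range(len(octets)):
        pending += octets[i:i + 1]
        if pending in charset:
            result.append(charset[pending])
            pending = b""
        elif len(pending) >= max_len:
            raise ValueError("unknown code: %r" % pending)
    if pending:
        raise ValueError("truncated code at end of stream: %r" % pending)
    return result
```

For example, decode(b"\x41\x8e\x41\x7c", TOY_CHARSET) yields the three
descriptions in table order. Adding a code such as b"\x41\x42" would make
is_prefix_free() return False, since b"\x41" alone would already have
been recognized as a complete character.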

(One problem with this approach is that it doesn't include code-switching
sequences, so obviously it still needs work.  Charsets with combining
characters could be fit into this scheme, as long as the sequences of
combining characters always appear in a well-defined order, with that
order specified by the RFC that defines the particular charset.)

I would also prefer that the MIME2 document not specifically exclude "bare
10646"...especially since we don't know what IS 10646 will be yet, and
simply because excluding it might be confusing.

Keith
