ietf-822

Re: restrictions when defining charsets

1993-01-28 12:01:38
Masataka Ohta writes:

It was never spelled out in the minutes or anything like that, but
went something like:

From the minutes of November 91 meeting in Santa Fe:

(d) Character set issues

    The Working Group specified the definition of a character set
    for the purposes of quad-x to be a unique mapping of a byte
    stream to glyphs, a mapping which does not require external
    profiling information.

It seems to me that IETF correctly recognizes character set issues.

I believe the minutes are not fully correct with respect to character set
terminology here.

The term "glyph" has a distinct meaning in ISO terminology,
and it is very different from the ISO term "character".
A glyph represents the appearance, while a character
represents the meaning. For example, the character "a" (LATIN
SMALL LETTER A) may be presented by a number of glyphs:
Courier a, Times a, etc., and there is a distinct difference between
the appearance of the italic Times "a" and the normal Times "a",
the first being round-shaped like an "o", and the latter
having a small round belly below and a horizontal line about
in the middle. They are glyphs of the same character, though.
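
As a rough illustration of the distinction (a sketch only, using Python's
standard unicodedata module, which is not part of the original discussion):
a character has a code and a name but carries no font information. Which
glyph gets drawn for it is left entirely to whatever renders the text.

```python
import unicodedata

# The character is identified by its code and its name, not by its shape.
ch = "a"
code_point = ord(ch)
name = unicodedata.name(ch)

# Nothing here says Courier, Times, or italic; those are glyph choices
# made later by the display or printing system.
print(f"U+{code_point:04X} {name}")
```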

There is another definition of "charset" found in RFC 1345:

   The ISO definition of the term "coded character set" is as follows:
   "A set of unambiguous rules that establishes a character set and the
   one-to-one relationship between the characters of the set and their
   coded representation." and this definition may be subject to
   different interpretations.  This memo does not put further
   restrictions on the term of "coded character set" than the following:
   "A coded character set is a set of rules that unambiguously and
   completely determines which sequence of characters, if any, is
   represented by each possible sequence of n-bit bytes for a certain
   value of n." This implies that e.g. a coded character set extended
   with one or more other coded character sets by means of the extension
   techniques of ISO 2022 constitutes a coded character set in its own
   right.  In this memo the term "charset" is used to refer to the above
   interpretation of the ISO term "coded character set".
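
The RFC 1345 wording above can be sketched in code. This is only an
illustration under assumptions of my own: the helper name chars_for is
hypothetical, and Python's built-in codecs ("ascii", "latin-1",
"iso2022_jp") stand in for the "set of rules". Given a byte sequence, the
charset either determines a sequence of characters or determines that there
is none.

```python
from typing import Optional


def chars_for(data: bytes, charset: str) -> Optional[str]:
    """Return the character sequence that `data` represents in `charset`,
    or None if the byte sequence represents no characters there.
    (Hypothetical helper, for illustration only.)"""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


# The same bytes mean different things, or nothing, under different rules.
print(chars_for(b"\x61", "ascii"))    # the single character "a"
print(chars_for(b"\xe1", "ascii"))    # None: not a valid ASCII sequence
print(chars_for(b"\xe1", "latin-1"))  # LATIN SMALL LETTER A WITH ACUTE

# A set extended by ISO 2022 escape sequences is still one charset:
# the escapes are part of the rules, so no external profile is needed.
text = "a\u3042"  # "a" followed by HIRAGANA LETTER A
encoded = text.encode("iso2022_jp")
print(encoded)
print(chars_for(encoded, "iso2022_jp") == text)
```

The last two lines show the point about ISO 2022 extension: the byte stream
switches between coded character sets with in-band escape sequences, yet
the combination still maps each byte sequence to at most one character
sequence.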

Keld
