ietf-822

Re: comments on latest MIME drafts

1995-05-23 13:36:09
Sigh. I'm sorry for pushing on this. As Ned points out,
draft-ietf-822ext-mime-imb-03.txt does include a definition of the
term 'character set':

 The term "character set" is used to refer to a method of converting a sequence

    "used" -> "used in MIME"

 of octets into a sequence of characters.  Note that unconditional and
 unambiguous conversion in the other direction is not required, in that not all
 characters may be available in a given character set and a character set may

    "available in" -> "representable by"

 provide more than one sequence of octets to represent a particular character.

    "a particular character" -> "a particular sequence of characters"

 This definition is intended to allow various kinds of character encodings,
 from simple single-table mappings such as US-ASCII to complex table switching
 methods such as those that use ISO 2022's techniques.  However, the definition
 associated with a MIME character set name must fully specify the mapping to be
 performed from octets to characters.  In particular, use of external profiling

    "performed from octets to characters" -> "performed"

 information to determine the exact mapping is not permitted.
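
To make the one-way nature of this definition concrete, here is a minimal
sketch in Python. It assumes Python's built-in "iso2022_jp" and "us-ascii"
codecs as stand-ins for MIME character sets; the helper names decode() and
representable() are hypothetical and are not taken from the draft.

```python
# Sketch only: Python codecs stand in for MIME "character sets".

def decode(octets: bytes, charset: str) -> str:
    """Apply the octets -> characters mapping named by `charset`."""
    return octets.decode(charset)

def representable(text: str, charset: str) -> bool:
    """Report whether every character in `text` has an octet form in `charset`."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

# Two different octet sequences, one character sequence: the second input
# starts with ESC ( B, which selects ASCII even though ASCII is already the
# initial state of the ISO 2022 based encoding.
plain = decode(b"A", "iso2022_jp")
escaped = decode(b"\x1b(BA", "iso2022_jp")

# Not every character is representable in a given character set.
e_acute_in_ascii = representable("\u00e9", "us-ascii")
```

Here plain and escaped hold the same one-character string, and
e_acute_in_ascii is False, so no unconditional, unambiguous mapping back
from characters to octets exists in either case.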

 HISTORICAL NOTE:  The term "character set" originated in the definition of
 US-ASCII and similar 7bit and 8bit specifications.  These define true sets.
 However, the advent of multi-octet character encodings and switching
 techniques have transformed character sets into entities that properly
 speaking are no longer strictly sets.  Some other communities have adopted
 the term "character encoding" for what MIME calls a "character set" as a
 result.

This isn't a historical note; it's a compatibility note, or just NOTE.
US-ASCII isn't a 'true set' but it is a simple mapping from single
octets to single characters.

And the other encodings haven't transformed character sets but
have just caused some terminology changes in order to make
distinctions that didn't exist before.

How about:

  NOTE: The term "character set" as used originally in MIME arose
  with the use of US-ASCII and other 7bit and 8bit specifications
  which employ a simple mapping from single octets to single
  characters. The advent of multi-octet coded character sets and
  switching techniques has made the situation more complex. For
  example, some communities have adopted the term "character encoding"
  for what MIME calls a "character set", while using the phrase "coded
  character set" to denote an abstract mapping from integers (not
  octets) to characters.
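
The distinction drawn in the proposed NOTE can be illustrated with a short
Python sketch. It assumes Python's Unicode-based ord()/chr() as the "coded
character set" (integers to characters) and its "iso-8859-1" and "utf-8"
codecs as examples of what MIME calls a "character set" (octets to
characters); none of these names come from the draft.

```python
# Sketch only: ord()/chr() model a coded character set; codecs model
# what MIME calls a "character set".

E_ACUTE = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# Coded character set: an abstract mapping between integers and characters.
code_point = ord(E_ACUTE)
same_char = chr(code_point)

# Character encoding: a mapping between octet sequences and characters.
# The same character gets different octets under different encodings.
latin1_octets = E_ACUTE.encode("iso-8859-1")
utf8_octets = E_ACUTE.encode("utf-8")

# Decoding with the matching name recovers the same character either way.
from_latin1 = latin1_octets.decode("iso-8859-1")
from_utf8 = utf8_octets.decode("utf-8")
```

The integer 233 identifies the character independently of any octet
representation, whereas the octets are one byte in ISO 8859-1 and two bytes
in UTF-8.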