Embedded foreign alphabets

Sender's personal name: John C Klensin

Quoting david(_at_)eco(_dot_)twg(_dot_)com Sun, 15 Dec 1991 00:24:23 -0800..

 The first is to use tokens in the pseudo-SGML (richtext) to mark
language changes.  A piece of mixed-language text would
certainly fall under the label of "rich text"; it certainly
is no longer plain-jane text ;-).

  For whatever it is worth, there is a rather large and
important international project called the Text Encoding
Initiative (TEI) that is concerned with making various literary
and scholarly texts machine-readable in a useful form.  They
have had to deal with embedding of words in languages and
character sets different from those of the main text body, with
embedded commentaries and asides, and so on.  Their model is, of
necessity, more complex than (and qualitatively different from)
that of a business letter with attachments.  And they have had to
deal with "odd" alphabets, at least some of which have not been
on the 10646 agenda.
  What do they do?  They use (real, not pseudo) SGML to identify
these things as embedded generic objects, identifying what they
are and how they are coded.
  There is a large body of implementation experience, in several
countries.  The tagging is a little wordy, but it does work.
Maybe we don't need to invent another wheel?
  References (people and documents) on request when I get back
to the US on Wednesday.
     --john