perl-unicode

Re: ISO 2022

1999-11-02 16:33:39
Bram Moolenaar wrote on 1999-11-03 00:12 UTC:
> I'll look around for info on ISO 2022.

All the relevant information is nicely accessible via links from

  http://www.cl.cam.ac.uk/~mgk25/unicode.html

(an easily underestimated compilation of in-depth information ;-)

Especially:

  ISO 2022 (= ECMA 35) is online at

    http://www.ecma.ch/stand/ECMA-035.HTM

  (As with most ISO standards, refill your caffeine supply well
  before starting to read it. :)

  The ISO 2022 code for announcing UTF-8 is

    ESC %G

  or

    ESC %/G  (for UCS Level 1, i.e. if no combining characters are used)

  as defined in ISO 10646-1/Am.2 section R.6, which is online at

    http://www.cl.cam.ac.uk/~mgk25/ucs/ISO-10646-UTF-8.html

  and UTF-16 is announced via

    ESC %/J  (for UCS Level 1, i.e. if no combining characters are used)

  as defined in

    http://www.cl.cam.ac.uk/~mgk25/ucs/ISO-10646-UTF-16.html

  and there are many others for other character sets:

  The ISO 2375 International Register of Coded Character Sets,
  which lists all the ISO 2022 ESC sequences, is at

    http://www.itscj.ipsj.or.jp/ISO-IR/
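
To make the announcers above concrete, here is a minimal Python sketch that
checks whether a byte string starts with one of the three sequences quoted
in this message and strips it off. The function name sniff_announcer and the
label strings are made up for illustration; the table covers only the
sequences listed above, not the full register.

```python
ESC = b"\x1b"

# Only the announcer sequences quoted in this message; the ISO 2375
# register contains many more.
ANNOUNCERS = {
    ESC + b"%G": "UTF-8",
    ESC + b"%/G": "UTF-8 (UCS Level 1)",
    ESC + b"%/J": "UTF-16 (UCS Level 1)",
}


def sniff_announcer(data: bytes):
    """Return (label, payload) if data starts with a known announcer,
    otherwise (None, data) unchanged. (Hypothetical helper.)"""
    for seq, label in ANNOUNCERS.items():
        if data.startswith(seq):
            return label, data[len(seq):]
    return None, data


if __name__ == "__main__":
    label, payload = sniff_announcer(b"\x1b%Gh\xc3\xa9llo")
    print(label, payload.decode("utf-8"))
```

A caller would decode the returned payload according to the label and fall
back to whatever default it already uses when the label is None.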

Note that ISO 2022 is a rather comprehensive standard, and that most
implementations pick only a tiny subset of all the ESC sequences that it
provides. For instance, for Unix it makes a lot of sense to lock GL =
0x20-0x7e to remain ASCII and to switch only the GR range (0xa0-0xff).
The VT100 and its successor terminals, for example, implemented some
aspects of ISO 2022.
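
As a rough illustration of such a subset, the Python sketch below keeps GL
fixed to ASCII and lets "ESC - F" style sequences designate a 96-character
set as G1, which is assumed to be permanently invoked into GR. The three
final bytes handled (A, F, L for the Latin-1, Greek and Cyrillic right-hand
parts) are taken from the ISO-IR register and are not part of the message
above; mapping them onto Python's iso8859 codecs is an approximation that
may differ from the registered sets in a few positions. The function name
is hypothetical.

```python
ESC = 0x1B

# Final byte of "ESC - <F>" (designate 96-set as G1) -> Python codec.
# Small illustrative subset; the codec mapping is an assumption.
G1_SETS = {
    ord("A"): "iso8859-1",  # Latin-1 right-hand part
    ord("F"): "iso8859-7",  # Greek right-hand part
    ord("L"): "iso8859-5",  # Cyrillic right-hand part
}


def decode_gr_switching(data: bytes, initial: str = "iso8859-1") -> str:
    """Decode bytes with GL locked to ASCII and only GR switchable."""
    out = []
    gr_codec = initial
    i = 0
    while i < len(data):
        seq = data[i:i + 3]
        # Recognised designation: switch the GR set and skip the sequence.
        if len(seq) == 3 and seq[0] == ESC and seq[1] == 0x2D and seq[2] in G1_SETS:
            gr_codec = G1_SETS[seq[2]]
            i += 3
            continue
        b = data[i]
        if b >= 0xA0:
            # GR range: interpret via the currently designated set.
            out.append(bytes([b]).decode(gr_codec, errors="replace"))
        elif b < 0x80:
            # C0 and GL: always ASCII (unrecognised ESC bytes pass through).
            out.append(chr(b))
        else:
            # C1 controls (0x80-0x9f) are not handled in this sketch.
            out.append("\ufffd")
        i += 1
    return "".join(out)


if __name__ == "__main__":
    print(decode_gr_switching(b"caf\xe9 \x1b-F\xe1\xe2\xe3"))
```

Everything else in ISO 2022 (G0/G2/G3 designation, shift functions,
multi-byte sets) is deliberately ignored here, which is the point of
picking a tiny subset.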

Reminder: ISO 2022 is the announce-encoding-in-the-first-bytes system
that we want to get away from with UTF-8. Nevertheless, ISO 10646 does
of course allow UTF-8, UTF-16, etc. to be used as yet another encoding within
the ISO 2022 framework. Please don't spend too much time implementing
it. Just make sure that you know it well before you think that the
byte-order-marks are the only encoding announcers on the market.

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>