ietf-822

iso-2022-jp

1993-03-08 05:27:09
BTW, your character specifications were interestingly displayed by my UA;
apparently these were Kuten bracketed by control characters?

Not quite, but you're very close.  The "Kuten" are the row and column
numbers of the 94x94 table that is partially filled with Kanji, Kana
and so on.  So, the 2nd character is in column 2 of row 1 (i.e.  the
"ku" is 1 and the "ten" is 2).  When used in a 7-bit form of ISO 2022,
you add 32 to each number.  So in this case it becomes a septet with
value 33 followed by a septet with value 34.  (And when used in the
TCP/IP suite of protocols (including SMTP and RFC 822), each septet
generally becomes an octet, with the additional "high" bit cleared to
zero.)
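
The arithmetic above can be sketched in a few lines of Python.  This
is only an illustration: the function name kuten_to_septets is made up
for this example and does not come from any standard or library.

```python
def kuten_to_septets(ku, ten):
    """Map a (ku, ten) pair from the 94x94 table to its 7-bit byte pair.

    Hypothetical helper for illustration only.
    """
    if not (1 <= ku <= 94 and 1 <= ten <= 94):
        raise ValueError("ku and ten must each be in the range 1..94")
    # Add 32 to the row and to the column.  The results lie in 33..126,
    # so the high bit of each octet is already zero.
    return bytes([ku + 32, ten + 32])


pair = kuten_to_septets(1, 2)
print(list(pair))  # [33, 34]
```

The values 33 and 34 are the codes of ASCII '!' and '"', which is why
a UA that ignores the escape sequences displays such text as pairs of
printable ASCII characters.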

The control character that you saw is an ESCAPE (0x1b), which is the
first byte in the escape sequences that indicate a character set
switch.  For example, the 3 bytes ESC $ B indicate that JIS X 0208
characters follow.  Some people say that "you switch to Kanji and then
return to ASCII", but this is not quite accurate.  ISO 2022's escape
sequences are more like GOTOs than subroutine calls (to borrow a
programming language analogy): each one simply designates a character
set, which stays in effect until the next escape sequence, and nothing
remembers the previous set to "return" to.  So, your use of the word
"bracketed", while appropriate for the typical iso-2022-jp message, is
not quite true of general ISO 2022.  (Assuming that "bracketed" means
e.g. [...].)
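
To make the GOTO analogy concrete, here is a rough Python sketch that
splits a byte string into runs, one per character set in effect.  The
names split_runs and DESIGNATIONS are invented for this example, the
table is limited to the four escape sequences later documented for
iso-2022-jp in RFC 1468, and the stream is assumed to start in ASCII.
General ISO 2022 has many more sequences than this.

```python
ESC = 0x1B

# Assumed table for illustration: the two bytes that follow ESC, mapped
# to the character set they designate.
DESIGNATIONS = {
    b"(B": "ASCII",
    b"(J": "JIS X 0201 Roman",
    b"$@": "JIS X 0208-1978",
    b"$B": "JIS X 0208-1983",
}


def split_runs(data, initial="ASCII"):
    """Split data into (charset, bytes) runs at each escape sequence."""
    runs = []
    current = initial
    buf = bytearray()
    i = 0
    while i < len(data):
        if data[i] == ESC:
            name = DESIGNATIONS.get(bytes(data[i + 1:i + 3]))
            if name is None:
                raise ValueError("unsupported escape sequence at offset %d" % i)
            if buf:
                runs.append((current, bytes(buf)))
                buf = bytearray()
            # Like a GOTO: the new set simply replaces the old one.
            # Nothing is pushed, so there is nothing to "return" to.
            current = name
            i += 3
        else:
            buf.append(data[i])
            i += 1
    if buf:
        runs.append((current, bytes(buf)))
    return runs


# One multibyte set follows another directly, with no ASCII in between.
for charset, chunk in split_runs(b'a\x1b$B!"\x1b$@!#\x1b(Bz'):
    print(charset, chunk)
```

Note that the sketch keeps no stack of earlier character sets.  The
final ESC ( B in the sample is just one more designation, not a
"return" from the earlier ESC $ B.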


Erik


  • iso-2022-jp, Erik M. van der Poel <=