ietf-822

Re: SIGH! Re: text --> IA5 ?

1991-04-10 20:13:28
What are the differences between X3.4-1968 and X3.4-<current>?
   I've probably got a copy of the '68 version around somewhere, but it 
could take a year to find it.  So this is from memory (and my memory is 
not good about these things).
   Let me try a little history....
   There is a tension in the char set standards as to whether a code 
sequence is mapped onto a concept or onto a specific glyph or symbol 
and, if the latter, how specific?   It explodes into debates about 
whether the EBCDIC "solid vertical line" is a different concrete 
character from ASCII 7/12, |, which is usually stylized as "broken 
vertical line".  Before anyone says "who cares", recall that there is an 
abstraction called "or-symbol" in several programming languages and that 
ISO WGs and then-TC92/SC6 decided in several cases to map it onto ASCII 
(and ISO646) 2/1, !, on the theory that ! looked more like or-symbol 
(solid vertical bar) than | did.  Anyone reading this on a European 
national 646-variant terminal will immediately see the other reason 
behind the SC6 reasoning :-).
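
For anyone not used to the column/row notation above (2/1, 5/14, 7/12): it names a position in the 16-row code table, so the code value is 16*column + row. A minimal sketch in Python; the function name is made up for illustration and is not taken from any standard.

```python
def code_from_position(position: str) -> int:
    """Convert column/row notation such as '7/12' to an integer code value."""
    column_text, row_text = position.split("/")
    column, row = int(column_text), int(row_text)
    # A 7-bit code table has columns 0..7 and rows 0..15.
    if not (0 <= column <= 7 and 0 <= row <= 15):
        raise ValueError(f"not a 7-bit column/row position: {position!r}")
    return 16 * column + row


# Positions mentioned in this message.
for position in ("2/1", "5/14", "7/12", "0/10", "0/13"):
    code = code_from_position(position)
    print(position, hex(code), repr(chr(code)))
```

So 7/12 is 0x7C (the vertical bar) and 2/1 is 0x21 (the exclamation mark).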
  The earliest character standards and, specifically, the earliest 
versions of ASCII, were very much in the "map to concept" camp, and
filled with weasel words and "alternate stylizations".   So, for
example, if you received 5/14, ^, you could display it as "hat" or
"caret", or as "up arrow".   The newest character standards, e.g., 
the ISO8859-n set, 10646, and UNICODE, have tended to map codes onto 
glyphs or onto abstractions that are sufficiently precise as to not make 
much difference.
  The trend in the control characters is precisely the opposite of this 
trend in the graphics.  The earliest standards were quite precise about 
what the controls were expected to do; the recent ones tend to leave 
those specifications for separate standards that don't contain code->
graphic bindings at all.
   There were also a few other things.  For example, in the first 
version of ASCII (I think the '68 version was the second, but it might 
have been the first, and I don't recall when this disappeared), the 
preferred interpretation of 0/10 was "NL" (first character on next 
line), not "LF" (same character position on next line); a similar "first 
character on applicable line" interpretation was applied to 0/11 (VT) 
and 0/12 (NP or FF), and there was very clear language about what 0/13 
(CR) meant.

Are they worth worrying about, in the case where we might cite
X3.4-<current> instead of RFC822 (or citing what RFC822 cited)?
  In my personal opinion, they are not worth worrying about.  With the 
exception of the vertical carriage motion controls (which RFC821/822 
specify themselves anyway--the elegance of requiring CR-LF combinations 
is that they work equally well in "old" LF==NL systems, where the CR is 
just noise, and "new" LF==vertical_index systems, where both are needed),
anything that conforms to "today" also conforms to "1968" and should
produce the same page.
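
The CR-LF point can be checked with a toy model of a display. This is only a sketch under simplifying assumptions: `render` and its `lf_is_newline` flag are hypothetical names, and the model ignores line wrap and every control other than CR and LF.

```python
def render(text: str, lf_is_newline: bool) -> list[str]:
    """Lay out text on a toy page under one of two readings of 0/10.

    lf_is_newline=True  : "old" reading, LF moves to the first column of the next line.
    lf_is_newline=False : "new" reading, LF moves down and keeps the column.
    """
    rows = [[]]
    row = col = 0
    for ch in text:
        if ch == "\r":
            col = 0  # CR: back to the first column of the current line
        elif ch == "\n":
            row += 1
            if lf_is_newline:
                col = 0
            while len(rows) <= row:
                rows.append([])
        else:
            line = rows[row]
            while len(line) <= col:
                line.append(" ")  # pad with spaces up to the cursor column
            line[col] = ch
            col += 1
    return ["".join(line) for line in rows]


# CR-LF gives the same page under both readings; a bare LF does not.
print(render("ab\r\ncd", lf_is_newline=True))
print(render("ab\r\ncd", lf_is_newline=False))
print(render("ab\ncd", lf_is_newline=True))
print(render("ab\ncd", lf_is_newline=False))
```

In this model the two CR-LF renderings are identical, while the bare-LF text comes out with "cd" indented by two columns under the "new" reading.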
  This does impose a slight incompatibility in the other direction, but
I think it is insignificant and, moreover, people have had years to get
used to it.  In the original ASCII, someone could print solid-vertical-
bar in response to receipt of 7/12.   If one had an ASCII OCR device,
both solid-vertical-bar and broken-vertical-bar would produce 7/12.  
Today, if one took that text and applied, say, an ISO8859-1 OCR device, 
the two bars would produce two distinct codes and, if an ASCII OCR device 
were used, ???.  I don't consider that a big deal and strongly prefer 
reference to current versions of Standards when possible.
  But the history is above, and people should make their own decisions 
about how sensitive they feel.
    --john
-------
