
Re: Vestigial Features (was Re: CRLF (was: Re: A modest proposal))

2013-01-23 19:21:40


--On Wednesday, January 23, 2013 23:29 +0100 Carsten Bormann
<cabo@tzi.org> wrote:

> On Jan 23, 2013, at 20:56, John C Klensin <john-ietf@jck.com>
> wrote:

>> But having CR as an unambiguous "return to first
>> character position on line" was important for overstriking
>> (especially underlining) on a number of devices including line
>> printers as well as TTY equipment.

> But John, on a TTY, you have to send CR NUL, or the first
> character will smudge. The carriage is still in flight when
> the next byte after it is processed, so it better not be a
> printing character.

Oh, my.  This is getting to be interesting.  I had no direct
interaction with or insight into the ASA (later ANSI) committee
that did ASCII so, while I heard some stories, I can't do more
than speculate on why they made some of the decisions they did.
But both Multics and CTSS supported a variety of terminal
devices -- hardcopy and CRTs (storage tube and actively
refreshed), wide and narrow, character and line-oriented, and so
on.  Between the mid-60s and the mid-70s, the devices on or next
to my various MIT desks went from 1050, to 2741, to TTY 37, to
Imlac PDS-1D, with some sort of storage tube "glass TTY" in
between.  The various hardcopy devices all had timing
idiosyncrasies until they started acquiring significant local
buffering.  The Imlac was high enough resolution (2000 points
square) that I could put it into 132 character mode, but the
characters, while sharp, got pretty small.  Those devices ran at
transmission speeds from about 10 cps up to 19.2 kbps and above.
It is probably also relevant that CTSS and Multics both
preferred "half-duplex" (really non-echoplex) operation in which
there was no expectation that the remote host (or controller)
would echo back the characters that the user typed and that a
few of the relevant devices transmitted line-at-a-time rather
than character-at-a-time.

All of this led, very early, to the realization that one had
better have a standard coding on the system, so that programs
that didn't have to be specialized to the output device for some
other reason (it is hard to display 2000x2000 point dynamic
vector graphics on a Teletype) operated in a device-independent
way, with device drivers that translated between the standard
coding and the particular terminal.  So, yes, some TTY units
needed padding characters or other delays, and sometimes the
amount varied depending on the relationship between the current
carriage position and the intended one.  But that was a device
driver problem (and lots of the device drivers ran on peripheral
controllers, not the mainframe) -- no CTSS or Multics program
that used the system device drivers ever had a NUL or character
timing as part of its application-level data stream.  Those
device drivers also did a lot of string canonicalization,
eliminating problems with string comparison that we've
reinvented with Unicode (see, e.g., the PRECIS WG for more
discussion).  If the device transmitted NULs, they would
disappear before being delivered to the application.
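
To make the division of labor concrete, here is a minimal sketch
of the sort of thing a driver did on output.  The function name,
the pad table, and the numbers in it are purely my own
illustration, not anything taken from an actual CTSS or Multics
driver:

   NUL = "\x00"

   def write_newline(device, current_col,
                     pad_table=((0, 0), (20, 1), (40, 2))):
       # Illustrative only: insert enough fill (NUL) characters
       # after CR that no printing character arrives while the
       # carriage is still in motion.  The thresholds and counts
       # would be per-device tuning, not these made-up values.
       pads = 0
       for threshold, count in pad_table:
           if current_col >= threshold:
               pads = count
       device.write("\r" + NUL * pads + "\n")

The point is the one above: the application writes only the
canonical newline; CR, LF, NUL padding, and any printer control
codes exist only on the device side of the driver.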

> (This is actually the real reason for CR LF:
> CR needs 200 ms to execute because it takes time to move the
> carriage, so you have to fill in a character to waste 100 ms.
> So you might as well separate the CR and LF function, which is
> what happened. Multiple newlines are sent as CR LF LF LF to a
> real TTY because LF *does* execute in 100 ms -- moving the
> paper is quicker.)

As I said, I wasn't part of the committee that did ASCII.  I do
know that CR without LF was extensively used, both in our
terminal environment and in a number of environments involving
line printers, as a way to arrange overstriking (for bold,
underlining, and composed characters) on line-oriented devices
that couldn't handle backspacing.  Again, that was mostly in the
device drivers: an application program would see
   xxxo<BS>'yyy
whether what was actually transmitted was that, or
   xxxoyyy<CR>   '
or, as mentioned earlier, whatever the first-character-of-line
printer control sequence was that would yield the same result.
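
The rewriting involved is simple enough to sketch.  The function
below is purely illustrative (its name and details are mine, not
anything from a real driver), assuming a canonical stream that
uses BS for overstrikes and a target device that can only
overprint by returning the carriage:

   def bs_overstrike_to_cr(line, CR="\r"):
       # Illustrative sketch: turn a canonical BS-style overstrike
       # line into base characters, a CR, and a spaced-out
       # overprint pass for a device that cannot backspace.
       base, overs, col = [], {}, 0
       for ch in line:
           if ch == "\b":
               col = max(col - 1, 0)   # back the virtual carriage up
           else:
               if col < len(base):
                   overs[col] = ch     # printing on top of an earlier column
               else:
                   base.append(ch)     # ordinary printing
               col += 1
       if not overs:
           return "".join(base)
       overprint = [" "] * (max(overs) + 1)
       for c, ch in overs.items():
           overprint[c] = ch
       return "".join(base) + CR + "".join(overprint)

   bs_overstrike_to_cr("xxxo\b'yyy")   # -> "xxxoyyy\r   '"

The same framework would substitute the appropriate
first-character-of-line control code for CR on a line printer.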

I don't know the history, but have always assumed that the
experience above was one of the inputs to the development of NVT
and other terminal abstractions on the ARPANET, rather than
using an "every machine needs to know the properties of every
terminal" model.  Sadly, last year we lost both of the people
who would have been most likely to know.

> Now the reason I'm writing this:
>
> SIP has CRLF because TTY carriages smudged.

SIP has CRLF because every single application protocol we have
that uses character-oriented commands is modeled on NVT, and NVT
uses CRLF.  You can theorize all you like about how NVT got that
way (and you might be right, although, given what I know of the
way things unfolded, I'd be a little surprised).  But it is NVT
that is the issue.

> ...
> CRLF was quite reasonable for Telnet, as this really was about
> TTYs. FTP grew out of Telnet, so keeping it there was maybe
> ...

Others have commented on the above statement, but the culprit is
NVT, and the decisions were very explicit, not accidents of "we
did it that way the last time so we should do it again without
thinking", _especially_ after Unix became dominant on the
network.

   john