CRLF (was: Re: A modest proposal)



--On Wednesday, January 23, 2013 06:15 +0000 John Levine
<johnl(_at_)taugh(_dot_)com> wrote:

Additionally, I can't understand why each line is terminated
with <CR><LF>, why use two characters when one will do.


Microsoft-OS text editors. Seriously.


My, what a bunch of parvenus.  SIP got it from SMTP, SMTP got
it from Telnet.  Back in the 1960s we all used <CR><LF>
because on a mechanical model 33 or 35 Teletype, CR really
returned the carriage, LF really advanced the platen, and you
needed both.  I first ran into CR/LF on a PDP-6 in about 1968,
but I know it wasn't new then.
...


Right.  And the relationships between the standards and the
equipment were a little obscure (at least to me) in terms of what
came first.  But having CR as an unambiguous "return to first
character position on line" was important for overstriking
(especially underlining) on a number of devices including line
printers as well as TTY equipment.   A number of systems of the
mid-1960s even had device drivers that would canonicalize lines
with non-destructive backspaces into lines with CR or vice
versa, depending on the needs of the particular device.
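
As a rough illustration of that kind of canonicalization (a toy
sketch only, not a reconstruction of any actual driver; the function
name backspaces_to_cr is made up for this example), a line that
underlines with backspaces can be turned into overstrike passes
separated by CR:

```python
def backspaces_to_cr(line):
    """Rewrite a backspace-overstruck line as CR-separated passes.

    Toy model: each printing character is struck at the current
    column, BS moves one column left, space advances without
    striking anything.
    """
    columns = []  # columns[i] = characters struck at position i, in order
    col = 0
    for ch in line:
        if ch == "\b":
            col = max(col - 1, 0)
            continue
        while len(columns) <= col:
            columns.append([])
        if ch != " ":
            columns[col].append(ch)
        col += 1

    # Pass n prints the n-th character struck at each position.
    depth = max((len(c) for c in columns), default=0)
    passes = []
    for n in range(depth):
        text = "".join(c[n] if n < len(c) else " " for c in columns)
        passes.append(text.rstrip(" "))
    return "\r".join(passes)
```

For example, an "a", "b", "c" sequence with the first two letters
underlined via backspace comes out as the text pass, a CR, and then
an underline pass.  The reverse direction (CR passes into
backspaces) would be the same bookkeeping run the other way.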

The first version of ASCII was very specific that CR was "move
to first character position on current line" and LF was a
vertical index function.  Then the confusion began when some
company started using CR as "new line" (where I was at the time,
the finger was pointed at Digital Equipment, but I have no
first-hand information or documentation).  That left us
with systems in the wild that used LF (only) internally, ones
that used CR (only), ones that used CRLF (always the safe choice
because it worked whether LF was an index function or a new line
one), and equipment that often required the latter.  Subsequent
versions of ASCII solved the incompatibility the same way the
FORTRAN standard (this was a bit before _that_ standards
committee decided the language was named "Fortran") solved
zero-iteration loops -- by fuzzy language that made both
one-traversal and zero-traversal valid in the FORTRAN case and
that made either interpretation of LF valid in the ASCII case.

A single LF to start a new line arrived with the model 37
Teletype, and Unix was, as far as I know, the first system to
use just \n as the line terminator starting in the early
1970s.


IIRC, Multics from several years earlier.  I'd have to dig
through old manuals to remember what CTSS did, but that system
(and the IBM Model 1050 and 2741 devices often used as terminals
with it) were somewhat pre-ASCII (and long before ECMA-48 / ANSI
X3.64 and the VT100 and friends) and, IIRC, sent and received
shift and rotate codes rather than what we would normally
consider character codes today.  The character codes were just
input to device drivers that dealt with device characteristics.

 But by then the Arpanet protocols were designed, and
the hassle of changing wasn't (and still isn't) worth it.


Again, the key useful property of CRLF on the wire is that the
combination is fairly insensitive to the various interpretations
of CR and LF.
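
To make that concrete, here is a small sketch of a toy print device
(my own illustration with a made-up function name, render; it models
nothing beyond the two readings of LF discussed above).  CR always
returns to the first column; LF either indexes down one line keeping
the column, or acts as a full new line:

```python
def render(data, lf_is_newline):
    """Return the lines a toy print device would produce for data.

    lf_is_newline=False: LF is a vertical index (column unchanged).
    lf_is_newline=True:  LF also returns to the first column.
    A later character at the same position simply replaces the
    earlier one; a real device would overstrike instead.
    """
    rows = [[]]
    row, col = 0, 0
    for ch in data:
        if ch == "\r":
            col = 0
        elif ch == "\n":
            row += 1
            if row == len(rows):
                rows.append([])
            if lf_is_newline:
                col = 0
        else:
            line = rows[row]
            while len(line) <= col:
                line.append(" ")
            line[col] = ch
            col += 1
    return ["".join(r) for r in rows]


# CRLF gives the same result under either reading of LF.
same = render("ab\r\ncd", False) == render("ab\r\ncd", True)

# A bare LF does not: the index reading leaves "cd" at column 2.
differs = render("ab\ncd", False) != render("ab\ncd", True)
```

With CRLF both readings yield the lines "ab" and "cd"; with a bare
LF the index-only device prints "cd" indented under the end of "ab".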

    john