
Re: Extraneous CRs in transfer of draft-crocker-email-arch-04.txt

2005-03-31 13:35:46


----- Original Message -----
From: "Dave Crocker" <dhc(_at_)dcrocker(_dot_)net>
To: "Bruce Lilly" <ietf-smtp(_at_)imc(_dot_)org>
Cc: <ietf-smtp(_at_)imc(_dot_)org>
Sent: Thursday, March 31, 2005 2:30 PM
Subject: Re: Extraneous CRs in transfer of draft-crocker-email-arch-04.txt



 If you have a document with CRLF line endings on a UNIX or Linux system,
 it is likely to be transmitted over the wire with CRCRLF and stored
 (inappropriately) on other UNIX/Linux systems with CRLF endings.

If.

Yes, that is one of a number of plausible scenarios.

However the fact that my windows system sees bare line-feeds from documents
sent by the ietf server suggest that, instead, bare linefeeds are being sent,
contrary to internet standards.

So, at least one other plausible scenario is that some of us have software
that treats crlf and lf equally and some of us do not.


No.  There is nothing wrong with your setup (as far as I can see).  It is
all about the interface.

Much depends on your UI (user interface).  <lf> can be translated to
<cr><lf> and vice versa.
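
To make the translation concrete, here is a minimal sketch in Python. The
function names are mine, purely for illustration. It also shows how the CRCRLF
case quoted above can arise: a naive <lf> to <cr><lf> translation applied to
data that already has <cr><lf> endings doubles the <cr>.

```python
def naive_lf_to_crlf(data: bytes) -> bytes:
    """Blindly put a CR in front of every LF (hypothetical helper)."""
    # If the input already uses CRLF, this yields CR CR LF.
    return data.replace(b"\n", b"\r\n")


def lf_to_crlf(data: bytes) -> bytes:
    """Convert line endings to CRLF without doubling an existing CR."""
    # Normalize CRLF to bare LF first, then expand every LF to CRLF.
    return data.replace(b"\r\n", b"\n").replace(b"\n", b"\r\n")


def crlf_to_lf(data: bytes) -> bytes:
    """Convert CRLF line endings to bare LF."""
    return data.replace(b"\r\n", b"\n")
```

Feeding naive_lf_to_crlf a buffer that is already CRLF-delimited is the
"cooked twice" case; lf_to_crlf is safe to apply to either form.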

When it comes to text file storage, <lf> is traditionally a "unix platform"
concept.  <cr><lf> is a DOS/Windows concept.

Often programmers need to deal with ideas such as "raw vs cooked" mode when
dealing with text based file I/O.  Raw basically means binary I/O.  No
translation.  Cooked means the EOL (end-of-line) sequences are translated.
The idea is to "cook it" for proper "printer based" output where a <CR>
literally means to move the printer head to the beginning of the line and
<LF> means move the head down one line (next line).  Of course, the console
evolved from a printer concept so the control character concepts all apply.
The 80 (or 132) character wide screen evolved from the 80/132 character
punch card, etc.
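
As a rough illustration of raw vs cooked file I/O, here is a small Python
sketch. The helper names are hypothetical; the only library behavior relied on
is the built-in open() with binary mode and its newline parameter. Binary mode
plays the role of "raw", and text mode with newline translation plays the role
of "cooked".

```python
def write_cooked(path: str, text: str) -> None:
    """Write text, translating each "\n" to CRLF on the way out (cooked)."""
    with open(path, "w", encoding="ascii", newline="\r\n") as f:
        f.write(text)


def read_raw(path: str) -> bytes:
    """Read the stored bytes exactly as they are on disk (raw, no translation)."""
    with open(path, "rb") as f:
        return f.read()


def read_cooked(path: str) -> str:
    """Read text with universal newlines: CRLF and CR are translated to "\n"."""
    with open(path, "r", encoding="ascii", newline=None) as f:
        return f.read()
```

Writing "one\ntwo\n" with write_cooked stores CRLF on disk; read_raw shows the
CRLF bytes, while read_cooked hands back bare "\n" again.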

For Telnet, by default, it depends on how you have it set up.  Typically,
no translation is done on inbound data.  On outbound, it is translated.  For
example, when you hit the <ENTER> key, a <cr> (move printer head to beginning
of line) character is sent along with a <lf> (move printer head down one
line).  But that depends on the negotiation.

For FTP, the client generally needs to define the type of data transfer,
binary or text.

With that said, I see nothing wrong with your HTTP response. This is a GET
request and the web server is responsible for providing the proper response.
HTTP requires CRLF delimiters for the header lines.  The server properly used
CRLF delimiters and, in addition, it properly performed the MIME type
association for the file extension.  In your case, the TXT extension tells the
web server (with a MIME extension lookup) to respond with a
Content-Type: text/plain.
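
The extension lookup can be sketched with Python's standard mimetypes module.
This is only an illustration of the idea, not how any particular web server
does it; the function name and the fallback type are my own assumptions.

```python
import mimetypes


def content_type_for(filename: str,
                     default: str = "application/octet-stream") -> str:
    """Map a file name to a Content-Type value using its extension."""
    # guess_type returns (type, encoding); type is None for unknown extensions.
    guessed, _encoding = mimetypes.guess_type(filename)
    return guessed or default


if __name__ == "__main__":
    print(content_type_for("draft-crocker-email-arch-04.txt"))
```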

So the question to ask is: how is the original file stored, and is the web
server reading the file in raw or cooked mode?  In addition, from the client
standpoint, is it going to do a translation?  A web client, for example, will
typically not do a translation.

In practice, at least for a Windows based web server (like ours), you read
the file "as is", in raw mode, and send it out unchanged.  The reason is
straightforward.  In the HTTP response header, you need to add the
Content-Length: header, and in raw mode that is simply the file size.  It
would be extremely rare to pre-read the entire file to count the number of
delimiters in order to adjust the final output size.  That doesn't design
well in a streaming environment.
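
Here is a minimal sketch of that design in Python, assuming the standard HTTP
Content-Length header carries the body size. The function names and the chunk
size are hypothetical. The point is that in raw mode the size comes straight
from the file system, and the body can be streamed without looking at the
line endings at all.

```python
import os
from typing import Iterator


def response_head(path: str, content_type: str = "text/plain") -> bytes:
    """Build the status line and headers; header lines are CRLF-delimited."""
    size = os.path.getsize(path)  # byte count on disk, no pre-read needed
    lines = [
        "HTTP/1.1 200 OK",
        f"Content-Type: {content_type}",
        f"Content-Length: {size}",
        "",  # blank line ends the header block
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def stream_raw(path: str, chunk_size: int = 8192) -> Iterator[bytes]:
    """Yield the file body unchanged, one chunk at a time (raw mode)."""
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            yield chunk
```

If the server translated <lf> to <cr><lf> while sending, the number of bytes
sent would no longer match os.path.getsize, which is why a translating server
would have to scan the whole file first.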

In any case, nothing new here.  The people who have a problem with the file
are most likely accessing it in cooked mode, where there might be a
translation.

Sincerely,

Hector Santos, CTO
Santronics Software, Inc.
http://www.santronics.com
305-431-2846 Cell
305-248-3204 Office
http://www.winserver.com/wcsap (Wildcat! Sender Authentication Protocol)
http://www.winserver.com/spamstats  (WcSAP Anti-Spam Stats)


