ietf-smtp

Re: [ietf-smtp] Stray <LF> in the middle of messages

2020-06-08 09:04:22
Hi Leo,

Here is a quick summary for the three main operating systems when it comes to the EOL (End Of Line) terminator of stored text files or messages:

o The *nix (unix-based) OSes use the <LF> (Linefeed) as an EOL terminator,

o The classic Apple Mac OS used the <CR> (Carriage Return) as an EOL terminator (the unix-based macOS now uses <LF>), and

o The DOS/Windows OSes use the <CR><LF> (Carriage Return, Linefeed) as the EOL terminator.

The Internet Mail format, officially starting with RFC 822, selected the DOS/Windows <CR><LF> format.

Why?

Maybe because at that point in time, Microsoft owned 90% of the growing Personal Computer (PC) market. The Mac was still considered (legally) a luxury commodity (otherwise their anti-trust status would no longer apply), and *nix was still mostly at the IT networking level.

But I think there may have been other technical reasons. Dave Crocker, the editor of RFC 822, can maybe tell us if X.400 used the <CR><LF> format. I did work on X.400 mail when I worked at Big Circle W (Westinghouse), but I don't recall the format it used.

The point here is that you need to keep the above differences in mind when it comes to exchanging files or data in a heterogeneous network of three different text-based storage or transmission formats.

While *nix or Mac systems may store email in their native format, when it comes to SMTP transmission of the DATA payload, they MUST translate it to the <CR><LF> format. In principle, all compliant SMTP senders and receivers MUST conform to the <CR><LF> end-of-line terminator.
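
As a rough illustration of that translation step, here is a minimal Python sketch. The function name `to_crlf` is made up for this example, and it assumes the stored message is held as bytes and that every <CR>, <LF>, or <CR><LF> in it is meant as a line ending:

```python
import re

# Matches <CR><LF> first, then a lone <CR>, then a lone <LF>.
_EOL = re.compile(rb"\r\n|\r|\n")


def to_crlf(data: bytes) -> bytes:
    """Rewrite every native line ending as <CR><LF> before SMTP transmission.

    Hypothetical helper for illustration only; it is not taken from any
    particular mail system.
    """
    return _EOL.sub(b"\r\n", data)


if __name__ == "__main__":
    # A *nix-style message stored with <LF> only.
    print(to_crlf(b"Subject: test\n\nHello\n"))
```

Because the pattern tries <CR><LF> before the single characters, text that already uses <CR><LF> passes through unchanged.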

It is great to see developers do their own thing, even "reinvent the world" rather than be reliant and dependent on other popular systems. It is a good way to learn.

Good Luck with your SMTP server project!

--
Hector Santos,
https://secure.santronics.com
https://twitter.com/hectorsantos



On 6/6/2020 1:06 PM, Leo Gaspard wrote:
Hello world,

I am in the process of writing an SMTP server, which obviously is going
to be the best of all SMTP servers ever written and that will ever be
written in our eon.

However, in the process of taking over the world, I am facing something
that surprises me.

I read, in RFC5321, §2.3.8, this paragraph:

Lines consist of zero or more data characters terminated by the
sequence ASCII character "CR" (hex value 0D) followed immediately by
ASCII character "LF" (hex value 0A).  This termination sequence is
denoted as <CRLF> in this document.  Conforming implementations MUST
NOT recognize or generate any other character or character sequence
as a line terminator.  Limits MAY be imposed on line lengths by
servers (see Section 4).

Which appears to clearly indicate that <LF> is not a valid line
terminator.

However, I notice that every single time I have tried to use `netcat` to
send emails for demo purposes, it succeeded *without* sending <CRLF> and
by sending only <LF>. While `telnet` does appear to convert typed <LF>
into <CRLF>, it looks like (my version of) `netcat` does not. So most of
the SMTP servers I have met with appear to consider <LF> as a valid line
ending.

This, in most cases, is not a big deal, because <LF> is not a valid
character in SMTP commands, so saying that receiving an <LF> is
equivalent to receiving a <CRLF> is not that big a problem.

However, there is one case where the semantics is important: should one
escape the <LF>. sequence while in a DATA block?

I would guess that the fact that other SMTP servers appear to usually
accept <LF>.<LF> as a terminator indicates that <LF>. should be escaped
even though it is not strictly conforming with the RFC, but… I wanted to
have the opinion of other people on this, before diving too deep in the
implementation?
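
For reference, the strict reading of the escaping question can be sketched in Python as below. This is only an illustration under stated assumptions: the names `dot_stuff` and `dot_unstuff` are hypothetical, the body is assumed to already use <CRLF> line endings, and a bare <LF> is deliberately not treated as a line break, so an "<LF>." sequence is left alone.

```python
CRLF = b"\r\n"


def dot_stuff(body: bytes) -> bytes:
    """Client side: double a leading '.' on each <CRLF>-delimited line,
    then append the <CRLF>.<CRLF> end-of-data marker."""
    lines = body.split(CRLF)
    stuffed = [b"." + line if line.startswith(b".") else line for line in lines]
    out = CRLF.join(stuffed)
    if not out.endswith(CRLF):
        out += CRLF  # the terminating '.' must sit on a line of its own
    return out + b"." + CRLF


def dot_unstuff(payload: bytes) -> bytes:
    """Server side: strip the end-of-data marker and undo the doubling.

    Assumes the payload holds at least one line before the terminator.
    """
    if not payload.endswith(CRLF + b"." + CRLF):
        raise ValueError("payload does not end with <CRLF>.<CRLF>")
    lines = payload[: -len(b"." + CRLF)].split(CRLF)
    return CRLF.join(l[1:] if l.startswith(b".") else l for l in lines)


if __name__ == "__main__":
    wire = dot_stuff(b"hi\r\n.\r\n..x\r\n")
    print(wire)
    print(dot_unstuff(wire))
```

With this strict splitting, a body such as "x<LF>.<LF>y" is one line that starts with "x", so nothing in it is escaped. A receiver that accepts <LF>.<LF> as a terminator would see that same body differently, which is exactly the ambiguity raised above.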

The following paragraph also makes me wonder:

In addition, the appearance of "bare" "CR" or "LF" characters in text
(i.e., either without the other) has a long history of causing
problems in mail implementations and applications that use the mail
system as a tool.  SMTP client implementations MUST NOT transmit
these characters except when they are intended as line terminators
and then MUST, as indicated above, transmit them only as a <CRLF>
sequence.

Should I understand this paragraph as meaning that if I ever receive
such an ill-formed message, I… can? should? must? accept it and… can?
should? must? convert the <LF> into proper <CRLF>?
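
Whatever policy is chosen, the server first has to notice the bare characters. A minimal Python sketch of that detection step follows; the function name `find_bare_cr_lf` is hypothetical, and whether the caller then rejects the message or repairs it is left open, since the quoted paragraph only constrains clients.

```python
import re

# A <CR> not followed by <LF>, or an <LF> not preceded by <CR>.
_BARE = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def find_bare_cr_lf(data: bytes) -> list[int]:
    """Return the byte offsets of every bare <CR> or bare <LF> in data."""
    return [m.start() for m in _BARE.finditer(data)]


if __name__ == "__main__":
    print(find_bare_cr_lf(b"ok\r\nbad\nline\r\n"))
```

An empty result means every <CR> and <LF> in the data is part of a proper <CRLF> pair.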

Thank you in advance for any thoughts you may have!
   Leo



