ietf-822

Re: Format=Flowed/RFC 2646 Bis (-02)

2003-11-05 21:54:13

I have to say that I find both RFC 2646 and this draft fairly opaque and
somewhat ambiguous.

In particular, what is a "line"?  Is it:

- zero or more characters from the canonical form of a body part,
  beginning either at the start of the body part
  or immediately following a CRLF, and ending with a CRLF?

- zero or more characters from the encoded form of a body part,
  beginning either at the start of the body part
  or immediately following a CRLF, and ending with a CRLF
  whether or not it is preceded by a SP?

- zero or more characters from the canonical form of a body part,
  beginning either at the start of the body part
  or immediately following a CRLF, and ending with a CRLF
  that isn't preceded by a SP?
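
To make the difference concrete, here is a rough sketch of the first and third readings applied to the same canonical text. The function names and the sample body are made up for illustration; neither RFC 2646 nor the draft defines them.

```python
import re

def lines_every_crlf(body: str) -> list[str]:
    """First reading: every CRLF ends a line."""
    parts = body.split("\r\n")
    if parts and parts[-1] == "":
        parts.pop()  # drop the empty piece after the final CRLF
    return parts

def lines_hard_crlf_only(body: str) -> list[str]:
    """Third reading: only a CRLF that is not preceded by SP ends a line."""
    parts = re.split(r"(?<! )\r\n", body)
    if parts and parts[-1] == "":
        parts.pop()
    return parts

# Hypothetical body: one paragraph with a SP CR LF inside it, then a second line.
sample = "This is a long \r\nparagraph.\r\nSecond one.\r\n"

by_first_reading = lines_every_crlf(sample)
by_third_reading = lines_hard_crlf_only(sample)
```

Under the first reading the sample has three lines; under the third it has two, and the first of them still contains the embedded SP CR LF.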

In a charset that isn't compatible with ASCII, are the characters
">", SP, CR, LF treated specially using the values of those characters
from that charset, or are the octet values 0x3E, 0x20, 0x0D, 0x0A
treated specially?  Does the answer depend on the format in which
the message is stored?  (e.g. if the message is stored in a file
on a system whose native charset is ASCII compatible, line endings
in the storage format might still be a combination of CR and/or LF,
but they will have no significance for the canonical form of the
text at all, since that will be UTF-16, EBCDIC, whatever.)
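
A small sketch of why the two readings diverge, using Python's built-in codecs. The choice of "utf-16-be" and "cp500" (an EBCDIC code page) is only an illustration, and the helper name is made up.

```python
# The four characters the flowed rules treat specially.
SPECIAL = {"GT": ">", "SP": " ", "CR": "\r", "LF": "\n"}

def octets(charset: str) -> dict[str, bytes]:
    """Return the encoded octets of each special character in a charset."""
    return {name: ch.encode(charset) for name, ch in SPECIAL.items()}

ascii_octets = octets("us-ascii")    # the familiar 0x3E, 0x20, 0x0D, 0x0A
utf16_octets = octets("utf-16-be")   # two octets per character
ebcdic_octets = octets("cp500")      # single octets, mostly different values

# A single non-ASCII character whose UTF-16BE encoding is the octets 0x0D 0x0A.
# A scan for those raw octets would report a line break that is not there.
false_crlf = "\u0d0a".encode("utf-16-be")
```

If the rules are applied to raw octets, the last case is misread as CR LF; if they are applied to decoded characters, it is not.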

Here's a stab at defining this more succinctly and precisely.
(or perhaps, it's an indication of how much I misunderstood the
draft...)

If the format= parameter is set to "fixed" or the parameter is unspecified,
text/plain is to be interpreted per RFC 2046.

If the format= parameter is set to "flowed", text/plain is to be interpreted
per RFC 2046, with the following exceptions:

1. The sequence SP CR LF from the canonical form of the body part is to 
be treated as follows:

a. if the delsp= parameter is set to "yes", the sequence SP CR LF is to
be ignored when displaying, printing, or otherwise presenting the body part.

b. if the delsp= parameter is set to "no", or the delsp= parameter is 
unspecified, the sequence SP CR LF is to be treated as SP when displaying,
printing, or otherwise presenting the body part.

c. regardless of the value of the delsp= parameter, if the format=
parameter has a value of "flowed" the sequence SP CR LF is not treated 
as a "line break".  (This changes the rule in section 4.1.1 of RFC 2046,
which states that CR and LF are forbidden outside of line breaks.)

2. The sequence CR LF from the canonical form of the body part, when 
not immediately preceded by SP, is interpreted as a line break. 

3. A "line" consists of zero or more characters which start immediately at
the beginning of the canonical form of the body part, or immediately following
a line break.

4. "Lines" in body parts for which format=flowed MAY be "wrapped" as necessary
to fit the width of the display or output medium, by ceasing the output of
characters along one horizontal row of the output device or medium, and
continuing the output of subsequent characters along the next horizontal row
of the output device or medium.  Such wrapping SHOULD, when possible, be done
when a character sequence that is to be interpreted as SP is detected (either
a SP character, or if delsp=no or is unspecified, the sequence SP CR LF).

5. One or more ">" characters at the start of a line are taken as an indicator
that the text on that line is a quotation.  The greater the number of ">"
characters, the greater the "depth" of the quotation.  

6. User agents MAY display or present quotations using leading ">" characters
or in any other manner which is suitable for the output device or medium.  If
">" characters are used to indicate quotations for display or presentation,
the number of ">" characters displayed SHOULD equal the number of ">"
characters at the beginning of the line in the canonical form, if the display
reasonably permits this.  If some other means is used to indicate quotations
in the display or output medium, different levels of quotations SHOULD be
displayed or presented differently, so they can be distinguished by the
recipient.

7. Since the ">" notation applies to the entire "line" (as defined in #3 above),
a quotation line that is "wrapped" for display or presentation purposes
SHOULD still be presented as a single quotation, with every output row
shown at the same level of depth.

8. The vertical spacing between output display rows SHOULD be the same between
rows of characters within a "wrapped" line as between separate lines.

9. In all of the rules in this section, the characters CR, LF, SP and ">"
have code values as defined by the charset parameter, even if those values
do not correspond to those in ASCII.
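
As a sanity check on the rules above, here is a minimal sketch of how a reader might apply them. It assumes the body has already been decoded to characters per its charset (rule 9), and that rule 2 is meant to cover a CR LF that is not preceded by SP, which is the reading consistent with rule 1c. The function names, the width parameter, and the use of textwrap are my own choices for illustration, not anything taken from RFC 2646 or the draft.

```python
import re
import textwrap

def logical_lines(body: str, delsp: bool = False) -> list[tuple[int, str]]:
    """Split a decoded body into (quote depth, text) pairs per rules 1-3 and 5."""
    # Rules 2 and 3: only a CR LF not preceded by SP ends a line.
    lines = re.split(r"(?<! )\r\n", body)
    if lines and lines[-1] == "":
        lines.pop()  # drop the empty piece after the final line break
    result = []
    for line in lines:
        # Rule 1: SP CR LF is dropped if delsp=yes, otherwise treated as SP.
        text = line.replace(" \r\n", "" if delsp else " ")
        # Rule 5: the count of leading ">" characters is the quote depth.
        depth = len(text) - len(text.lstrip(">"))
        result.append((depth, text[depth:]))
    return result

def render(body: str, width: int = 40, delsp: bool = False) -> list[str]:
    """Wrap each line to the display width (rule 4), repeating the quote
    prefix on every output row so depth stays visible (rules 6 and 7)."""
    rows = []
    for depth, text in logical_lines(body, delsp):
        prefix = ">" * depth + (" " if depth else "")
        wrapped = textwrap.wrap(
            text.strip(),
            width=width,
            initial_indent=prefix,
            subsequent_indent=prefix,
            break_on_hyphens=False,  # prefer breaking at SP only
        )
        rows.extend(wrapped or [prefix.rstrip()])  # keep empty lines
    return rows

# Hypothetical message body: one quoted line containing SP CR LF, one plain line.
body = "> quoted text that is \r\nsoft-wrapped by the sender\r\nplain line\r\n"
parsed = logical_lines(body)
rows = render(body, width=30)
```

Note that this treats the quote depth as a property of the whole line, as rule 7 says, so any ">" characters that follow a SP CR LF inside a line are left as ordinary text.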