ietf-822

Comments on draft-gellens-format-02.txt

1998-12-03 23:14:06
I think <ftp://ftp.ietf.org/internet-drafts/draft-gellens-format-02.txt>
is a good approach and I'm a big fan of getting this implemented.
However, I find some of the writing unclear.  Here are some
suggested replacements for the text.  I don't believe they change the
technical content.

I would globally replace TEXT/PLAIN with Text/Plain or text/plain, as it
is not an acronym.  The latter is consistent with usage in RFC 2046, and
looks better.

In section 4, I suggest adding a sentence to the end of the
second paragraph, saying: "In this memo, text which fits this
description is defined as 'fixed.'"

In section 4.1, I suggest adding a sentence to the end of the
first paragraph, saying: "In this memo, text which fits this description
is defined as 'flowed.'"  In that same paragraph, I would change "The
display shifts to the next line, starting with the word which would not
fit on the previous line." to "That word is displayed at the left margin
of the next line."

I would also add a sentence to the end of the section, saying:

When the text format described in this section is stored on a file
system that supports MIME typing, the text/paragraph type defined in
Appendix A could be an appropriate description of the media type.
However, many mailers incorrectly treat unknown text subtypes as an
attachment, so text/paragraph SHOULD NOT be used for network
communications.  Instead, format=flowed SHOULD be used for those
purposes.

In the first paragraph of section 4.2, change "or forwarded" to "and
forwarded mail".

After the first paragraph of section 4.2, I would add this example:

Example:

This is a comment from the first message to show a 
quoting example.
This is a comment from the second message to show a 
quoting example.
This is a comment from the third message to show a quoting example.
This is a comment from the fourth message to show a quoting example.
This is a comment from the fifth message to show a quoting example.
This is a comment from the sixth message to show a quoting example.
It can be confusing to assign attribution to lines 2 and 4 above.

After the second paragraph of section 4.2, I would add this example:

Example:

This is paragraph text that is
meant to be flowed across
several lines.
However, the sending mailer is
converting it to fixed text at
a width of 72
characters, which causes it to
look like this when shown on a
PDA with only
30 character lines.
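
To make the effect concrete, here is a minimal Python sketch (my own
illustration, not part of the draft) of what happens when text is hard
wrapped by the sender and then wrapped again by a narrower display.  The
function name rewrap_fixed and its parameters are hypothetical; the
widths 72 and 30 are the ones used in the example above.

```python
import textwrap


def rewrap_fixed(text: str, sender_width: int, display_width: int) -> list[str]:
    """Simulate fixed text shown on a narrow display.

    The sender hard-wraps the paragraph at sender_width.  The display
    cannot know those breaks are soft, so it wraps each hard line
    independently at display_width, leaving short orphan lines.
    """
    hard_lines = textwrap.wrap(text, width=sender_width)
    shown: list[str] = []
    for line in hard_lines:
        shown.extend(textwrap.wrap(line, width=display_width))
    return shown


if __name__ == "__main__":
    paragraph = (
        "This is paragraph text that is meant to be flowed across "
        "several lines. However, the sending mailer is converting it "
        "to fixed text at a width of 72 characters, which causes it to "
        "look like this when shown on a PDA with only 30 character lines."
    )
    for line in rewrap_fixed(paragraph, sender_width=72, display_width=30):
        print(line)
```

The double wrapping can never produce fewer lines than wrapping the
paragraph once at the display width, which is the whole argument for
letting the receiver reflow.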

I would change the fourth paragraph of section 5 to the following:

A value of Flowed indicates that any line which ends in exactly one
space MAY be treated as a "flowed" line. (As specified in [MIME-IMT],
all line breaks in messages of any MIME "text" subtype MUST be
represented by a CRLF. So, flowed lines are delimited with SP CRLF, as
shown in the ABNF in section 6 below.) A series of one or more such
flowed lines is considered a paragraph, and MAY be flowed (wrapped and
unwrapped) as appropriate on display and in the construction of new
messages (see section 5.1).

I would replace the second paragraph and two lists of section 5.1 with
something more formalized, such as:

A generating agent SHOULD conduct the following steps to convert
paragraph text as described in Section 4.1 (which is the format in which
most mail messages are authored) to text/plain; format=flowed:

A. Select the line wrap length (LWL), which SHOULD be less than 78 (and
preferably 66) characters in length, not counting the CRLF.
B. For each paragraph (i.e., *998<textchar> CRLF),
 1. If the paragraph contains "--" SP CRLF (i.e., Usenet sig lines),
skip the paragraph and go to Step B for the next paragraph.
 2. For paragraphs that consist solely of SP CRLF, replace with CRLF and
go to step B for the next paragraph.
 3. For paragraphs that end <non-sp> SP CRLF, convert to <non-sp> CRLF.
 4. For each paragraph longer than the LWL, find the last 1*SP in the
line that is less than the LWL.
  a. Convert this 1*SP to a SP CRLF, except in the following cases:
   (1). If the character after the 1*SP is a close angle-bracket (">"),
the 1*SP to the left of the previously selected 1*SP should instead be
converted into SP CRLF.  (Otherwise, for receivers that do not
understand format=flowed, it would be ambiguous whether the ">"
indicated quoting or not.)
   (2). If the characters after the 1*SP are "From ", the 1*SP to the
left of the previously selected 1*SP should instead be converted into SP
CRLF.  (This is necessary because some systems incorrectly alter lines
that begin with "From ".)
   (3). If the line consists of a single word longer than the LWL, the
agent MAY either add a CRLF at the LWL to wrap the line or use
quoted-printable encoding to "protect" the long line or leave the line
as is.  As noted below, if quoted-printable encoding is used, it SHOULD
NOT be used to protect the trailing space.
  b.  Reapply Step 4 to the remainder of the paragraph (the part after
the SP CRLF).
 5. Apply Step B to the next paragraph.
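
The generating steps above can be sketched in Python roughly as follows.
This is only an illustration under several assumptions of mine: each
paragraph arrives as a string with its CRLF already removed, "the last
1*SP that is less than the LWL" is read as "the last run of spaces such
that the text before it plus one trailing space fits in the LWL", and an
over-long word is simply left as is (one of the three options in Step
4.a.(3)).  The names flow_paragraph and _acceptable are hypothetical, and
quoted text is not handled.

```python
import re

_SPACES = re.compile(r" +")


def _acceptable(rest: str, match: re.Match) -> bool:
    """Steps 4.a.(1) and (2): never start a new line with '>' or 'From '."""
    after = rest[match.end():]
    return not (after.startswith(">") or after.startswith("From "))


def flow_paragraph(paragraph: str, lwl: int = 66) -> list[str]:
    """Return the output lines (without CRLF) for one paragraph."""
    if paragraph == "-- ":              # Step 1: sig separator is untouched
        return [paragraph]
    if paragraph == " ":                # Step 2: lone space becomes empty line
        return [""]
    if paragraph.endswith(" ") and not paragraph.endswith("  "):
        paragraph = paragraph[:-1]      # Step 3: last line must not look flowed

    lines: list[str] = []
    rest = paragraph
    while len(rest) > lwl:              # Step 4, reapplied per Step 4.b
        runs = [
            m for m in _SPACES.finditer(rest)
            if m.start() > 0 and m.end() < len(rest) and _acceptable(rest, m)
        ]
        fitting = [m for m in runs if m.start() < lwl]
        later = [m for m in runs if m.start() >= lwl]
        if fitting:
            brk = fitting[-1]           # last break that fits in the LWL
        elif later:
            brk = later[0]              # over-long word: break right after it
        else:
            break                       # nothing to break on: leave as is
        lines.append(rest[:brk.start()] + " ")   # 1*SP becomes SP (CRLF)
        rest = rest[brk.end():]
    lines.append(rest)
    return lines
```

For example, flow_paragraph("aaaa bbbb >ccc dddd", lwl=12) yields
["aaaa ", "bbbb >ccc ", "dddd"]: the break before ">ccc" is skipped so
that no output line begins with ">".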

[Note that the 1*SP in Step 3 allows lines that end 2*SP CRLF to be
treated as fixed, as per the ABNF (i.e., only SP CRLF indicates a flowed
line).  Is this really necessary?  I normally put two spaces between
sentences.  If the rule were changed to indicate that 1*SP CRLF
indicates a flowed line, then Step 4 could be changed from its current
effect, which is to convert double spaces at the end of a line to single
spaces.  I can't think of any practical cases where a user would need to
preserve the spaces count at the end of a paragraph.  By contrast, the
current formulation of Step 4 can result in losing spaces when the
*middle* of a paragraph falls at the end of a line, where there could be
an effect both on sentence spacing and on table columns separated by
spaces.  If you agree that flowed lines should be indicated by 1*SP CRLF
(as I do), the following rules would be changed:

 2. For paragraphs that end 1*SP CRLF, convert to CRLF.
 (3 can be deleted.)
 (Convert all occurrences of 1*SP in Step 4 and sub-steps to SP.)]

A receiving agent SHOULD conduct the following steps to convert
text/plain; format=flowed to paragraph text.

A. For each line,
 1. If the line contains "--" SP CRLF (i.e., Usenet sig lines) skip the
line and go to Step A for the next line.
 2. Convert <non-sp> SP CRLF into <non-sp> SP.
 3. Apply Step A to the next line.

[If flowed lines are instead indicated by 1*SP CRLF, Step 2 would be
changed to: "Convert SP CRLF into SP."]
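
As a rough illustration of the receiving steps, here is a minimal Python
sketch.  It assumes (my assumption, not the draft's) that the body has
already been split into lines on CRLF, and it follows the stricter rule
in which only <non-sp> SP CRLF marks a flowed line.  The function name
unflow is hypothetical, and quoting is ignored.

```python
def unflow(lines: list[str]) -> list[str]:
    """Join format=flowed lines (CRLF already removed) into paragraphs."""
    paragraphs: list[str] = []
    current = ""
    for line in lines:
        if line == "-- ":
            # Step 1: the sig separator is never joined to anything.
            if current:
                paragraphs.append(current)
                current = ""
            paragraphs.append(line)
            continue
        # Step 2: a line ending in <non-sp> SP is flowed, so the CRLF is
        # dropped and the trailing space is kept as the word separator.
        is_flowed = (
            len(line) >= 2 and line.endswith(" ") and not line.endswith("  ")
        )
        current += line
        if not is_flowed:
            paragraphs.append(current)
            current = ""
    if current:
        # The body ended on a flowed line; keep what was collected.
        paragraphs.append(current)
    return paragraphs
```

With this reading, unflow(["aaaa ", "bbbb >ccc ", "dddd"]) gives back the
single paragraph "aaaa bbbb >ccc dddd", while a line ending in two
spaces is left alone as fixed text.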

In section 5.3, it's not clear to me why it matters that some systems
insert ">" after "From ".  What negative effect is being avoided by
using ">" SP as the quote indicator?

"Flowed lines which are also quoted may require special handling on
display and when copied to new messages." should be changed to "Flowed
lines which are also quoted SHOULD be given special handling on display
and when copied to new messages."

In the fourth paragraph, "logical entity" should be replaced with
"paragraph".

The rest of 5.3 needs to be cleaned up a little more.  For instance, if
the rules are followed, an agent can't generate the ambiguous situation
you describe, right?  I think this description would be enhanced with
updated rules as above.  These rules would seem to be for a slightly
different situation, since the generating agent should be reflowing
quoted blocks and adding one level of depth in preparing the reply for
user editing.  If people think it would be useful, I will try to put
quoting rules together in the next couple days.

In section 6, you want to say:

The constructs used in a "text/plain; format=flowed" MIME body part are
described using [ABNF]:

body-part       = *paragraph *998textchar
textchar        = %x01-09 / %x0B-0C / %x0E-7F
                  ; any 7-bit character except null, CR, and LF

I admit the *998textchar is a little baroque, but otherwise we forbid
text body parts that don't end with a CRLF.  There doesn't seem to be any
reason to exclude that case, which is of course completely legal in
MIME.  And to disallow the use of CR and LF in lines as required for
text subtypes, you need to replace occurrences of CHAR with <textchar>
in your ABNF rules.

Also, to be compatible with the text subtype, I believe

        non-sp = %x01-19 / %21-7F ; any 7-bit except null or SP

(which also has 19 instead of 1F and %21 instead of %x21) should be
changed to

        non-sp = %x01-09 / %x0B-0C / %x0E-1F / %x21-7F
                 ; any 7-bit character except null, CR, LF, and SP
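
For anyone who wants to check the ranges mechanically, the two character
classes can be written as Python regular expressions.  This is just a
sketch of the rules as I have proposed them above; the names TEXTCHAR_RE,
NON_SP_RE, is_textchar, and is_non_sp are hypothetical.

```python
import re

# textchar = %x01-09 / %x0B-0C / %x0E-7F  (no NUL, LF, or CR)
TEXTCHAR_RE = re.compile(r"[\x01-\x09\x0b\x0c\x0e-\x7f]")

# non-sp = %x01-09 / %x0B-0C / %x0E-1F / %x21-7F  (textchar minus SP)
NON_SP_RE = re.compile(r"[\x01-\x09\x0b\x0c\x0e-\x1f\x21-\x7f]")


def is_textchar(ch: str) -> bool:
    """True if ch is a single character allowed by the textchar rule."""
    return TEXTCHAR_RE.fullmatch(ch) is not None


def is_non_sp(ch: str) -> bool:
    """True if ch is a single character allowed by the non-sp rule."""
    return NON_SP_RE.fullmatch(ch) is not None
```

Note that %x1F is accepted by non-sp here, which the "%x01-19" range in
the current draft would have excluded.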

I believe you also need to add two extra sections:

9. IANA Considerations

IANA is requested to add a reference to this RFC in the text/plain
registration and to register text/paragraph as described in Appendix A.

10.  Internationalization Considerations

The line wrap and quoting specifications of format=flowed may not be
suitable for certain charsets, such as for Arabic and Hebrew characters
that read from right to left.  Care should be taken in applying
format=flowed in these cases, and format=fixed combined with
quoted-printable encoding may be more suitable.

Suggested text for Appendix A:

1. Overview of text/paragraph

The "text/plain; format=flowed" MIME type described above SHOULD be used
to transmit paragraph text (defined in section 4.1) across networks in
order to maximize interoperability.

However, MIME media types are gaining increasing use in other contexts,
such as file systems, where converting paragraph text to format=flowed
may be less appropriate.  For those situations, this appendix specifies
an additional text subtype, text/paragraph.  This MIME type SHOULD NOT
be used for network communications such as [SMTP] (where many mailers
incorrectly treat unknown text subtypes as attachments) and [HTTP]
(where many browsers display paragraphs as long lines with a horizontal
scroll bar rather than wrapping them).  text/paragraph SHOULD only be
used in cases where it will cause no harm or user discomfort.  If there
is uncertainty as to the right approach, use format=flowed.

2. The text/paragraph Media Type

     MIME media type name: text

     MIME subtype name: paragraph

     Required parameters: none

     Optional parameters: charset

     Encoding considerations:

     As the text/paragraph media type is not generally expected to be
used across networks, only identity encoding (i.e., no transformations)
should be necessary.  Thus 7-bit (for 7-bit safe charsets and paragraphs
of under 999 characters), 8-bit (for non-7-bit safe charsets
and paragraphs of under 999 characters), or binary (for everything else)
will normally be used.

     Security considerations:

     This media type has the same security considerations as text/plain.

     Interoperability considerations:

     text/paragraph has serious interoperability problems for most
protocols used to transfer MIME body parts.  With SMTP, many mailers
incorrectly treat unknown text subtypes as attachments, obscuring the
message.  With HTTP, many browsers display paragraphs as long lines with
a horizontal scroll bar rather than wrapping them.  Therefore,
text/paragraph SHOULD NOT be used for network communications.  Instead,
the generating agent SHOULD convert the text/paragraph content into
text/plain; format=flowed as described in Section 5.1 above.  This will
allow seamless conversion back into text/paragraph by the receiving
agent if so desired.

     Published specification: this document.

     Additional information:

     Magic Number(s): none

     File extension(s):
       No clear distinction is made between text/plain and
       text/paragraph for file extensions.  The ".txt" extension is
       used for both (this internet draft is text/plain).  The ".asc"
       or ".ascii" extension is traditionally used for text/plain with
       the US-ASCII character set.  However, some applications have
       two text options, "Text Only (*.txt)" (which corresponds to
       text/paragraph) and "Text Only with Line Breaks (*.txt)" (which
       corresponds to text/plain; format=fixed).

     Macintosh File Type Code(s): TEXT
       NOTE: the TEXT file type on MacOS is used for both
       text/paragraph and text/plain; format=fixed.  The system
       application "SimpleText" generates a text/paragraph file with
       type TEXT (which also may contain MacOS-specific out-of-band
       markup in the resource fork).  Text editors such as those that
       come with compilers use text/plain; format=fixed.

     Intended usage: LIMITED USE



If you use this Appendix, you should credit Chris Newman and Ned Freed
for part of the text.  And, RFC 2068 needs to be referenced as [HTTP].
