ietf-822

The extent of <nofill> and other text/enriched nitpicks

1996-02-13 03:39:19
While working on BBN/Slate, a multimedia mail system of CMU's Andrew's
ilk, I tried to implement text/richtext input/output converters.  I found
that 1) it had many Andrew-based prejudices in its text model that did
not jibe with the model in either Slate or your typical Mac/Windows simple
text processor and 2) there was an amazing amount of implied semantics
that came out of Andrew's handling of formatting that was not in any way
part of the spec.
[...]
text/enriched is much more tightly specified and much more constrained in
the kind of markup it is trying to provide.

Amen.  I had exactly the same reflection when writing the text/richtext and  
text/enriched converters for NeXT's (RTF-based) mail UA.  The updated  
text/enriched spec is much clearer and one thing I'm particularly happy to  
see explicitly stated is the implied line breaks around paragraph formatting  
commands like <center>.

<nofill> still seems a bit loose, though.  In particular, it's not clear to  
me exactly what the extent of the affected text really is supposed to be.   
From the draft-03 spec, it sounds like it will begin immediately after the  
right bracket in "<nofill>" and continue to immediately before the left  
bracket in "</nofill>".  However, this means that an example like this:

        --------------------------------
        <nofill>
        aaa
        bbb
        </nofill>

        <nofill>
        xxx
        yyy
        </nofill>
        --------------------------------

will generate the somewhat surprising:

        --------------------------------

        aaa
        bbb


        xxx
        yyy
        --------------------------------

since each nofill block will include the CRLF just after <nofill> and the  
one just before </nofill>, and the double CRLF between the two blocks will  
generate an extra line break too.
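
A minimal sketch of that literal reading, for illustration only. It assumes
"\n" stands in for CRLF, that the text inside <nofill> is kept verbatim, and
that outside <nofill> a run of N line breaks collapses to N-1 (a single one
becoming a space). The names render_literal and fill are hypothetical, not
taken from the spec.

```python
import re

# Everything between the tags, newlines included, is the affected text.
NOFILL = re.compile(r"<nofill>(.*?)</nofill>", re.DOTALL)

def fill(text: str) -> str:
    """Outside <nofill>: N consecutive line breaks become N-1; one becomes a space."""
    def collapse(m: re.Match) -> str:
        n = len(m.group())
        return " " if n == 1 else "\n" * (n - 1)
    return re.sub(r"\n+", collapse, text)

def render_literal(src: str) -> str:
    """Render with the nofill extent running from '>' of <nofill> to '<' of </nofill>."""
    out, pos = [], 0
    for m in NOFILL.finditer(src):
        out.append(fill(src[pos:m.start()]))  # filled text before the block
        out.append(m.group(1))                # nofill text, kept verbatim
        pos = m.end()
    out.append(fill(src[pos:]))
    return "".join(out)

SRC = "<nofill>\naaa\nbbb\n</nofill>\n\n<nofill>\nxxx\nyyy\n</nofill>"
print(repr(render_literal(SRC)))
```

Under these assumptions the result starts with a blank line and has two blank
lines between "bbb" and "xxx", which is the surprising output shown above.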

HTML's <pre> command has a special rule for this.  It dictates that any  
newline directly adjoining the <pre> command is to be excluded from the  
affected text.  With that rule in effect, you'd get this instead:

        --------------------------------
        aaa
        bbb

        xxx
        yyy
        --------------------------------
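
As a rough sketch of how such a rule changes the extent of the affected text,
the two readings can be compared side by side. This is only an illustration:
the function name nofill_extents and the trim_adjoining flag are hypothetical,
and the "at most one line break on each side" behaviour is an assumption about
how a <pre>-style rule would be carried over to <nofill>.

```python
import re

# Literal reading: the extent is everything between the two tags.
LITERAL = re.compile(r"<nofill>(.*?)</nofill>", re.DOTALL)

# <pre>-style reading (assumed): drop at most one line break directly after
# <nofill> and at most one directly before </nofill>.
TRIMMED = re.compile(r"<nofill>(?:\r?\n)?(.*?)(?:\r?\n)?</nofill>", re.DOTALL)

def nofill_extents(src: str, trim_adjoining: bool = False) -> list[str]:
    """Return the affected text of every <nofill> block in src."""
    pattern = TRIMMED if trim_adjoining else LITERAL
    return [m.group(1) for m in pattern.finditer(src)]

SRC = "<nofill>\naaa\nbbb\n</nofill>\n\n<nofill>\nxxx\nyyy\n</nofill>"

print(nofill_extents(SRC))                       # blocks keep their edge newlines
print(nofill_extents(SRC, trim_adjoining=True))  # edge newlines excluded
```

With trimming, each block carries only its own two lines, so the only vertical
space left between the blocks comes from the text between them.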

That is more in tune with the other commands.  For example, if you have:

        --------------------------------
        <center>
        aaa
        bbb
        </center>

        <center>
        xxx
        yyy
        </center>
        --------------------------------

you currently get:

        --------------------------------
                    aaa bbb

                    xxx yyy
        --------------------------------

if I read the spec right.

On a related topic, the rule that makes a single CRLF turn into a space  
seems a bit simplistic.  For example, this:

        --------------------------------
        first
        <flushleft>
        <bold>
        second
        </bold>
        </flushleft>
        third
        --------------------------------

will generate this:

        --------------------------------
        first
         second
         third
        --------------------------------

That is, there will be an extra space before both "second" and "third" and  
maybe one after "first" and "second" too.  It would probably be better to  
borrow an idea from the paragraph commands and say something like "a single  
CRLF will cause a space to be produced unless that space would end up  
immediately next to another space or newline".
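
A small sketch of the suggested rule next to the simple one, under loud
assumptions: "\n" stands in for CRLF, the source contains only single line
breaks, <flushleft> and </flushleft> are each modelled as one implied line
break, and <bold> tags produce no output. All names (SOFT, to_stream, naive,
proposed) are hypothetical, and the exact number of stray spaces the simple
rule leaves depends on how a given converter handles the tags.

```python
import re

SOFT = "\x00"  # placeholder marking a single line break from the source

def to_stream(src: str) -> str:
    """Mark source line breaks, then turn tags into breaks or nothing."""
    s = src.replace("\r\n", "\n").replace("\n", SOFT)
    s = re.sub(r"</?flushleft>", "\n", s)  # assumed: one implied line break
    return re.sub(r"</?bold>", "", s)      # inline command: no output of its own

def naive(src: str) -> str:
    """Simple rule: every single line break becomes a space."""
    return to_stream(src).replace(SOFT, " ")

def proposed(src: str) -> str:
    """Suggested rule: emit the space only if it would not touch a space or newline."""
    s = to_stream(src)
    out = []
    for i, ch in enumerate(s):
        if ch != SOFT:
            out.append(ch)
            continue
        prev = out[-1] if out else "\n"   # start of text counts as a line start
        rest = s[i + 1:].lstrip(SOFT)     # look past adjacent soft breaks
        nxt = rest[0] if rest else "\n"   # end of text counts as a line end
        if prev not in " \n" and nxt not in " \n":
            out.append(" ")
    return "".join(out)

SRC = "first\n<flushleft>\n<bold>\nsecond\n</bold>\n</flushleft>\nthird"
print(repr(naive(SRC)))     # stray spaces around each line
print(repr(proposed(SRC)))  # 'first\nsecond\nthird'
```

Ordinary filled text is unaffected by the suggested rule: "aaa" and "bbb" on
consecutive lines still come out as "aaa bbb".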

My apologies if I misread something in the spec.

--Lennart