Re: Line Wrapping Question

Hi.  I'm responsible for the MIME support in the Netscape 2.0 mail and news
readers.

Pete Resnick wrote:


Let's back up a bit and define the problem. The basic thing that you want
is to deal with auto-wrapped text that the user is giving you. The user is
typing with a composition tool which auto-wraps and they desire the same
effect on the remote end. That is, they type a single paragraph without
hitting the return key, and you want the same or similar auto-wrapping
behavior to occur on receipt of such a message. Specifically, the sender
doesn't care particularly how the line breaks appear on the screen of the
recipient, so long as the text is "wrapped nicely".

There is no way to represent the desire to have auto-wrapped text in
text/plain. Text/plain makes no indication about presentation. And since QP
line breaking is not a presentation mechanism either, that doesn't help
you. So, in order to get the desired behavior, you need to go to another
MIME type.


I think this is exactly right.  Specifically, if you want display-time
word wrapping, you cannot label your data as text/plain, because text/plain
means explicit line breaks.

The MIME spec does not make this clear, but I strongly believe that the
weight of history forces this interpretation.

In the case of sending mail or news messages into the outside world, you
can't make assumptions about the systems that the readers are using.  Since
its inception, the majority (and until very recently, nearly all) of the
internet/usenet-related news and mail reading community lived in a
less-than-80-columns, explicit-line-break world.  It seems to me, then, that
a program which injects messages into the mail and news streams and which
does not insert CRLFs at the ends of lines is generating bad messages.  (The
"correct" translation would presumably be to insert newlines such that the
wrapping the authoring user saw on their screen is the exact same wrapping
that someone on a PDP-11 reading it with "cat" would see.)  (Whether this is
done by the composer's MUA or composer's MTA isn't particularly relevant.)

Microsoft is generating messages with single-line-paragraphs with
content-type: text/plain and content-transfer-encoding: quoted-printable,
such that the on-the-wire (encoded) message doesn't have long lines (thus
not technically violating any RFCs) but which, when decoded, would become 
unreadable on a system which didn't wrap lines.
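
To make the distinction concrete, here is a minimal sketch using Python's
standard quopri module.  The sample paragraph is made up for illustration; it
is not taken from any real message.  It shows that quoted-printable keeps the
encoded lines short on the wire, but that decoding gives back the original
single long line.

```python
import quopri

# Hypothetical single-line paragraph, as a composer that never inserts
# hard line breaks might hand it to the encoder.
paragraph = ("All work and no play makes Jack a dull boy. " * 6).strip()

# On the wire: quoted-printable adds "soft" line breaks ("=" at end of line).
encoded = quopri.encodestring(paragraph.encode("ascii"))
longest_on_wire = max(len(line) for line in encoded.splitlines())

# After decoding: the soft breaks vanish and the long line is back.
decoded = quopri.decodestring(encoded).decode("ascii")

print(longest_on_wire)       # length of the longest encoded line
print(len(decoded))          # length of the single decoded line
print("\n" in decoded)       # whether any line break survived decoding
```

A reader that does not wrap text/plain has nothing to break on in the decoded
text, which is the situation described above.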

Netscape 2.0 is such a system: when confronted with long lines, Netscape
puts up a horizontal scrollbar.  This is arguably not the best thing to
do, since there do exist clients out there which generate messages which
are unreadable in this situation; but my feeling is that such messages are
illegal, and it's more important to support text/plain messages which were
*intended* to have long lines in an unmangled form.  We may add a user 
preference or something to allow wrapping of text/plain messages, but as
Ned said earlier, that tends to mangle more sensible messages.

I (as an MUA author) would like to assume that text/plain, at least in the
context of components of message/rfc822 objects, implies hard-line-breaks -
that auto-wrapping should not occur.

I would like to assume this because I believe it to be the case for the
current majority of such messages in the Internet/Usenet world.

I also agree that messages which say "please wrap me as you see fit" would
be useful - but they should have some other content type than text/plain,
or perhaps a parameter to the type.  I don't think that one can safely
assume that arbitrary text/plain messages are like that.  (text/enriched
is a fine choice for this.  We support display of text/enriched messages.)

Now, Sukvinder still has to decide what to do to allow Microsoft users to
edit messages as they see fit, yet still emit messages which are legible
to the outside world.

I believe that we've got a good, interoperable approach in Netscape 2.0,
so I'll describe that.  I'd be interested in any comments/criticisms of
it.

(As was pointed out earlier, this has absolutely nothing to do with use of
quoted-printable, which is simply an on-the-wire encoding, with no visible
impact on the end user (presuming they are using MIME.)  So I won't talk
about what we do with respect to transfer-encodings, since it's an
orthogonal issue.)

--------------------------
It is a requirement that, in the message composition window, the user not be
required to hit return at the end of every line.  Macintosh and Windows users
are used to system-standard editors which do not require this.  To most
people, the notion that one would have to hit return at the end of every line
is ridiculous: it is just Not How It Is Done on these platforms.

It is also a requirement that, by the time the message is sent out over SMTP
or NNTP, it is not sent with CRLF only at the end of the paragraph: the fact
of life on Usenet is that it is a short-line medium.  Lines should generally
be less than 80 columns wide, with explicit line breaks, unless there is a
specific need to exceed that (like a table.)

So, the Netscape composition window takes the following approach: let the
platform-specific text editor do display-time word-wrapping.  If a paragraph
is typed, and then a word is inserted in the middle of that paragraph, the
rest of the paragraph should re-wrap.  Hitting return is a "hard" line break
that will never be auto-filled.

This makes the editor behave like all the editors users are already used to.

However, when the time comes to deliver the message, Netscape takes all of
the "implicit" line breaks (where words have wrapped because they reached the
right edge of the window) and turns them into "explicit" line breaks.  In
this way, the message which is sent over SMTP/NNTP is formatted in such a way
that a recipient using a fixed-width non-auto-filling display device will see
exactly what the author has typed, with line breaks in the same places.
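
As a rough illustration of that send-time step (this is a sketch, not the
Netscape code), the following assumes the composer hands over its text with
"\n" only where the user hit return, plus the window width in columns.  The
function name harden_soft_breaks is hypothetical, and textwrap stands in for
the break positions a real platform editor would report.

```python
import textwrap

def harden_soft_breaks(text, width=72):
    """Turn display-time wrapping into explicit line breaks.

    Assumes each "\n" in `text` is a hard break typed by the user and that
    everything between two hard breaks was wrapped on screen at `width`.
    """
    out = []
    for line in text.split("\n"):
        wrapped = textwrap.wrap(line, width=width,
                                break_long_words=False,
                                break_on_hyphens=False)
        # textwrap.wrap("") returns [], so keep blank lines explicitly.
        out.extend(wrapped or [""])
    # SMTP and NNTP expect CRLF line endings on the wire.
    return "\r\n".join(out)

draft = ("word " * 40).strip() + "\n\nshort line"
for sent_line in harden_soft_breaks(draft).split("\r\n"):
    print(len(sent_line), repr(sent_line[:20]))
```

Hard breaks typed by the user survive unchanged; only the paragraphs that
were wrapped on screen gain new breaks.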

There are several problems with this.

The first is that, when quoting messages using the conventional USENET style
(to set off each quoted line by preceding it with "> "), the line lengths
tend to get longer with each subsequent quote.  It is common, then, for lines
to exceed the width of the window.  When seen in a word wrapping display,
they will often have an unattractive (some would say unreadable) long-short-
long-short wrapping style, like

  > All work and no play makes Jack a dull boy.  All work and no play makes
  Jack a
  > dull boy.  All work and no play makes Jack a dull boy.  All work and no
  play
  > makes Jack a dull boy.  All work and no play makes Jack a dull boy.
  All
  > work and no play makes Jack a dull boy.

It would be horrible were Netscape to send out messages that looked like
this.  Therefore, there is an additional hack in the formatting code which is
that we never wrap lines which have ">" as their first character.  If the
line begins with ">", it is assumed to be a "quoted" line, and its implicit
line breaks are not converted to explicit line breaks.

This is a violation of the otherwise-WYSIWYG nature of the editor, in that
the person composing the message will see alternating long-short lines, but
the person reading the message will simply see some lines which are "long"
compared to the others.
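
A minimal sketch of the quoted-line exemption, again with hypothetical names
and with textwrap standing in for the editor's own wrapping: lines whose
first character is ">" pass through untouched, and everything else gets
explicit breaks.

```python
import textwrap

def format_for_sending(text, width=72):
    """Wrap the user's own lines; leave ">"-quoted lines as they are."""
    out = []
    for line in text.split("\n"):
        if line.startswith(">"):
            out.append(line)  # quoted: never converted to explicit breaks
        else:
            out.extend(textwrap.wrap(line, width=width,
                                     break_long_words=False,
                                     break_on_hyphens=False) or [""])
    return "\n".join(out)

quoted = "> " + "All work and no play makes Jack a dull boy. " * 3
reply = "I agree with this, and " + "so on and " * 10 + "so forth."
print(format_for_sending(quoted + "\n" + reply))
```

The quoted line comes out longer than the fill column, which is exactly the
trade-off described above: one long line for the reader instead of the
long-short pattern.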

Now, the next problem with this is that we wrap lines based on where the
system's built-in text editor has chosen to do display-time word-wrapping.
This, in turn, is based on the size of the window.

We create all composition windows 72 characters wide, on all platforms.
(The hard requirement is that the windows be 79 characters wide or less,
but it is traditional for the "fill column" to be set to 72 to allow room
for followups to quote this message several times before the lines begin
to exceed 80 characters.)
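
The arithmetic behind "several times" can be spelled out in a few lines.
This is only a back-of-the-envelope sketch; the function name is made up, and
it assumes every level of quoting adds exactly the two characters "> ".

```python
def quote_levels_that_fit(fill_column=72, max_width=79, prefix="> "):
    """How many times a full-width line can be quoted and still fit."""
    return (max_width - fill_column) // len(prefix)

levels = quote_levels_that_fit()
print(levels)           # 3
print(72 + 2 * levels)  # 78, still within 79 columns
```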

The problem is this: if the user drags the window wider, we will then wrap at
the current width of the window, which may well be excessively wide.

One solution to this, which we rejected, would be to disallow the window from
being made wider than 79 columns.  That would be bad, because there are
situations when it is necessary and appropriate to send messages with long
lines: when formatting a table, for example.

Another solution would be to pop up a dialog box warning the user that their
window is too wide (probably just before the message is sent) which offered
to shrink it and allow the paragraphs to refill.  We may yet do this, but
rejected it initially as too intrusive and annoying.

Another attractive and often-suggested solution is to cause the editor to
always do word-wrapping somewhere before 79 columns, regardless of the width
of the window.  However, this assumes a much more sophisticated editor: one
with enough composition and formatting commands that the width of each
paragraph could be controlled individually.  There is not a built-in editor
that is that powerful on all the platforms on which we must ship, which would
mean we'd have to write it ourselves.  But more importantly, it would then no
longer be "The Builtin Editor", meaning it would no longer be the editor (and
have the behavior) that the user expects.

A simpler version of this hypothetical "more sophisticated" editor would be
one which inserted "hard" newlines when the user typed a word which passed
the right edge.  In this editor, typing a paragraph and then inserting a word
in the middle of it would leave the paragraph with ragged margins.  It would
then need to provide a "refill paragraph" command to discard and recompute
the line breaks of that paragraph.  That's a fine idea, but the builtin
editors don't work this way, and don't provide such a command.  Regardless of
how hard it would be to make them behave that way (and it might be hard, or
it might be easy) the fact remains that we would have changed the behavior of
the editor in a rather fundamental way, one which would surprise users
already accustomed to it from its use in other applications.

Other alternatives are to provide a "ruler" at the top of the window, or to
draw a vertical "line of death" behind the text at 72 (or 79) columns,
reminding the user that they should avoid having lines longer than that.
We tried the "line of death" approach for a while, but most people didn't
understand what it was: they assumed it was some kind of redisplay glitch,
and reported it as a bug!

Also, both of these approaches assume the user notices the subtle graphical
information we're showing them, and that they care/understand that <80
columns is a Good Thing (this isn't obvious at all to novice users.)

Note also that all of these problems/solutions apply only to the case of
generating messages of type text/plain, which is all that the Netscape mail
composition window can directly do at this time.  If someday the composition
window allows generation of something other than text/plain, for example,
text/html or text/enriched, then the generated messages would contain
formatting information, and these problems all go away.  (Well, get traded
for an orthogonal set of problems. :-))

-- 
Jamie Zawinski    jwz(_at_)netscape(_dot_)com   
http://www.netscape.com/people/jwz/
``A signature isn't a return address, it is the ASCII equivalent of a
  black velvet clown painting; it's a rectangle of carets surrounding
  a quote from a literary giant of weeniedom like Heinlein or Dr. Who.''
                                                         -- Chris Maeda