ietf-822

Format=Flowed and CJK

2000-06-11 15:09:13
Currently, the F=F spec does not allow for languages where spaces are not normally used.

I'm told that not using F=F causes problems with poor line wrapping and quoting in Japanese and other such languages that are just as bad as those in English.

There is a proposal to update the spec as follows:

1. On generation, add SP CRLF where a soft line break is desired following or preceding a CJK character.

2. If a space exists at the point where a soft line break is being added following or preceding a CJK character (i.e., the user typed a space) add an extra space: SP SP CRLF.

3. On receiving, if you see 1*SP CRLF, and the character before the space(s) is CJK, or the character following the CRLF is CJK, delete one SP as well as the CRLF.
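
To make the three rules above concrete, here is a rough sketch, not a definitive implementation. The names is_cjk, soft_break and unflow are made up for illustration, and is_cjk checks only a few Unicode blocks, which is purely an assumption (how to define CJK is itself an open question). Lines are assumed to have CRLF already stripped; quoting, space-stuffing and the signature separator are ignored.

```python
def is_cjk(ch: str) -> bool:
    """Hypothetical CJK test covering only a few Unicode blocks (an assumption)."""
    if not ch:
        return False
    cp = ord(ch)
    return (
        0x3040 <= cp <= 0x30FF      # Hiragana and Katakana
        or 0x4E00 <= cp <= 0x9FFF   # CJK Unified Ideographs
        or 0xAC00 <= cp <= 0xD7A3   # Hangul Syllables
    )


def soft_break(text: str, pos: int) -> tuple[str, str]:
    """Generation (rules 1 and 2): split text at pos, CRLF implied between parts."""
    left, right = text[:pos], text[pos:]
    near_cjk = is_cjk(left.rstrip(" ")[-1:]) or is_cjk(right[:1])
    if near_cjk:
        # Rule 1 adds SP; if the user already typed a space here,
        # the same append produces rule 2's SP SP.
        left += " "
    return left, right


def unflow(lines: list[str]) -> str:
    """Reception (rule 3): join soft-broken lines, "\n" marks hard breaks."""
    out = []
    for i, line in enumerate(lines):
        if i == len(lines) - 1:
            out.append(line)                 # final line, nothing to join
        elif line.endswith(" "):
            nxt = lines[i + 1]
            if is_cjk(line.rstrip(" ")[-1:]) or is_cjk(nxt[:1]):
                line = line[:-1]             # delete one SP as well as the CRLF
            out.append(line)                 # soft break: CRLF dropped
        else:
            out.append(line + "\n")          # hard break
    return "".join(out)
```

With these definitions, unflow(list(soft_break(text, pos))) gives back the original text both when the break falls between two CJK characters and when the user typed a space at the break point.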

The problem with this is that when sending to current F=F clients, a space will be left between characters, which looks bad. When sending to non-F=F clients, some lines will end in SP CRLF, which is not much of a problem.

A more fundamental problem is how to define CJK characters.


--------------------

One alternate suggestion is to add TAB CRLF whenever a soft line break is desired at a point where no SP exists. On reception, the TAB CRLF sequence is deleted.
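
A rough sketch of the TAB CRLF idea, under the same simplifications (CRLF already stripped, no quoting or space-stuffing). The function names soft_break_tab and unflow_tab are hypothetical.

```python
def soft_break_tab(text: str, pos: int) -> tuple[str, str]:
    """Generation: split text at pos, CRLF implied between the two parts."""
    left, right = text[:pos], text[pos:]
    if not left.endswith(" "):
        left += "\t"                 # no SP at the break point: mark it with TAB
    return left, right


def unflow_tab(lines: list[str]) -> str:
    """Reception: join soft-broken lines, "\n" marks hard breaks."""
    out = []
    for i, line in enumerate(lines):
        if i == len(lines) - 1:
            out.append(line)         # final line, nothing to join
        elif line.endswith("\t"):
            out.append(line[:-1])    # TAB CRLF is deleted entirely
        elif line.endswith(" "):
            out.append(line)         # ordinary F=F soft break: keep SP, drop CRLF
        else:
            out.append(line + "\n")  # hard break
    return "".join(out)
```

Note that unflow_tab(["word\t", "next"]) returns "wordnext": a line that merely happens to end in a TAB is joined to the next one with nothing in between.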

When sending to all older clients (both non-F=F and current F=F) some lines will end in TAB CRLF. This is worse than SP CRLF, since TAB is not always treated as white space. TABs may cause output to be double-spaced in some situations. I'm told that stray tabs in Japanese text are even worse than stray spaces.

A current F=F client could generate text with a TAB CRLF sequence at the end of a line, causing the words on either side to be run together if received by a new F=F client.

Some of these problems could be avoided by using a different Format value (e.g., "Format2") or an additional parameter (e.g., "Format=Flowed; Flowed-Version=2").
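
As an illustration only, a receiver could pick its decoding rules from the Content-Type parameters along these lines. The "Flowed-Version" parameter is just the example name floated above, not anything defined, and flowed_version is a made-up helper. The sketch uses the standard Python email package.

```python
from email import message_from_string


def flowed_version(raw_headers: str):
    """Return None if the message is not flowed, else the (hypothetical) version."""
    msg = message_from_string(raw_headers)
    fmt = msg.get_param("format", "")          # parameter of Content-Type
    if str(fmt).lower() != "flowed":
        return None                            # not F=F at all
    # Assumption: a missing Flowed-Version means the current spec (version 1).
    return int(msg.get_param("flowed-version", "1"))
```

An older client that ignores the extra parameter would still treat such a message as ordinary F=F, so this only helps a new receiver avoid misreading text from a current sender.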


--------------------

A third suggestion is to use SP SP CRLF whenever a soft line break is desired at a point where no SP exists. On reception, the entire SP SP CRLF sequence is deleted.
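
A rough sketch of the SP SP CRLF variant, with the same simplifications as before (CRLF already stripped, no quoting or space-stuffing). The names soft_break_sp2 and unflow_sp2 are hypothetical, and treating any line that ends in two spaces as the marker is an assumption; lines with three or more trailing spaces are not handled specially.

```python
def soft_break_sp2(text: str, pos: int) -> tuple[str, str]:
    """Generation: split text at pos, CRLF implied between the two parts."""
    left, right = text[:pos], text[pos:]
    if not left.endswith(" "):
        left += "  "                 # no SP at the break point: mark it with SP SP
    return left, right


def unflow_sp2(lines: list[str]) -> str:
    """Reception: join soft-broken lines, "\n" marks hard breaks."""
    out = []
    for i, line in enumerate(lines):
        if i == len(lines) - 1:
            out.append(line)         # final line, nothing to join
        elif line.endswith("  "):
            out.append(line[:-2])    # SP SP CRLF is deleted entirely
        elif line.endswith(" "):
            out.append(line)         # ordinary F=F soft break: keep SP, drop CRLF
        else:
            out.append(line + "\n")  # hard break
    return "".join(out)
```

Here unflow_sp2(["end.  ", "Next"]) returns "end.Next", so two spaces that the user actually typed at a break point vanish on reception.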

When sending to current F=F clients, two spaces will be left between characters, which looks bad where spaces are unnatural.

When generating F=F, clients must be careful not to insert a soft line break at a point where the user wants two spaces (for example, following a sentence-ending period), because the end result would be only one space or no spaces, depending on which sequence was sent.

In addition, if a current F=F client sends text which is received by a new F=F client, the sequence SP SP CRLF could occur between words, resulting in no spaces at all between the words.

Some of these problems could be avoided by using a different Format value (e.g., "Format2") or an additional parameter (e.g., "Format=Flowed; Flowed-Version=2").

There is some additional complexity in counting trailing spaces on a line.

The current spec says any number of spaces at the end of a line mean the same thing.
--
Randall Gellens
Opinions are personal;    facts are suspect;    I speak for myself only
-------------- Randomly-selected tag: ---------------
I love the smell of excessive quoting in the morning.
                                        --Steve Dorner
