Re: quoted-printable

Apart from the obvious size problem, hardly any Japanese users have
software that understands Keld's mnemonics. Most of the software
understands the 2022 subset. So Japanese encoded in Keld's mnemonics
would be extremely unreadable.


I do not know of any full mnemonic implementations, so I can understand
that there is no Japanese support. I am debugging my own full
implementation right now.

Anyway, I am on record as advocating 2022 techniques for Japanese
etc. I was the first to speak for that, and for about the whole
spring of 1991 the only one who spoke for Japanese 2022
specifications on this list. And I have been active in requesting
(nagging) the Japanese to write their 2022 spec. I have always said
that 2022 encoding was much preferable to my own mnemonics spec.

It is quite clear that the Japanese will not use Keld's mnemonics for
their usual email. So the question is: What would Keld's Japanese
mnemonics be used for? For use in other countries? Wouldn't this be a
rather minor usage, in terms of volume in characters per day? Also,
wouldn't it be less confusing if Japanese was encoded in one way (i.e.
the Japanese way) instead of two ways?


I see an advantage in using Japanese mnemonics in the following areas:

1. character set definition - so we have a precise definition
   of the Japanese character sets. This is also equivalent
   to how Japanese characters were specified in 10646.
2. character set conversion, e.g. between 0208 and 10646.
   This is intended for the new equipment speaking 10646
   and networking in 2022 - UA stuff.
3. for stupid foreigners like myself who occasionally
   see a Japanese name - we cannot read it but we can see that
   this is something Japanese. Probably also UA stuff.
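
To make points 1 and 2 concrete, here is a minimal sketch of a
mnemonic table used as a pivot between 0208 and 10646. The mnemonic
names (nichi, hon, go) are hypothetical and are not taken from
RFC-CHAR; only the row-cell positions and the code points are meant
to be the real values for these three kanji.

```python
# Each entry defines one character once, and every coded character set
# is described relative to that entry. The names are hypothetical.
TABLE = {
    "nichi": {"jisx0208": (38, 92), "ucs": 0x65E5},
    "hon": {"jisx0208": (43, 60), "ucs": 0x672C},
    "go": {"jisx0208": (24, 76), "ucs": 0x8A9E},
}

# Reverse indexes, built from the single table above.
BY_KUTEN = {entry["jisx0208"]: name for name, entry in TABLE.items()}
BY_UCS = {entry["ucs"]: name for name, entry in TABLE.items()}


def jisx0208_to_ucs(row, cell):
    """Convert a 0208 row-cell position to a 10646 code point."""
    return TABLE[BY_KUTEN[(row, cell)]]["ucs"]


def ucs_to_jisx0208(code_point):
    """Convert a 10646 code point to a 0208 row-cell position."""
    return TABLE[BY_UCS[code_point]]["jisx0208"]


print(hex(jisx0208_to_ucs(38, 92)))
print(ucs_to_jisx0208(0x8A9E))
```

The point of the pivot is that adding a third character set means
adding one column to the table, not a new conversion table for every
pair of sets.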

Mnemonic is always only a last-resort (fallback) notation.
It is always better to have "the real thing" on the screen.

In terms of user acceptance for something like Danish text
I would rate our existing fallback methods:

1. base64   - totally meaningless
2. quoted-printable - you can guess something, but many will
    give up on reading the stuff. This is based on Scandinavian
    discussion on bit-stripped mail, and quoted-printable is
    worse, with 3 unintelligible characters instead of one.
3. mnemonic with & as intro-char: you have a good idea what
   is in the text, but the & is disrupting your reading.
   Some people will drop the message.
4. mnemonic with some kind of "invisible" intro-char or
   without intro-char: quite readable, for almost-fluent
   reading. I tried it on French, and I liked it. Some may be
   distracted by the unique mnemonics, e.g. the Germans
   like ue instead of u: . I think most people will read through
   a message.
(5. Not in any current IETF spec: transliterated text.
   Local conventions like German ue for u:, Danish y for u:.
   This is even more readable. Only a few will drop reading it.
   There are actually some Swedes working on something like that for
   Swedish)
(6. X.408 - maybe in an IETF OSI spec, one-to-one char
   fallback: o: -> o, a: -> a, ae -> ? - people can guess
   what is in there, and it is not disturbed by funny characters.
   It is better than quoted-printable, but maybe about the same
   level of readability as mnemonic with &. For Danish it is
   bad - showing ? for some frequent Danish letters.)
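
To compare methods 1 to 4 on one Danish word, here is a small
sketch. It assumes the text is carried as Latin-1, and the
three-entry mnemonic table is only an illustrative assumption, not
the full table from the draft.

```python
import base64
import quopri

# Illustrative three-letter table only (an assumption for this sketch);
# a full mnemonic table covers far more characters.
MNEMONICS = {"å": "aa", "æ": "ae", "ø": "o/"}


def mnemonic(text, intro="&"):
    """Replace each listed letter by the intro-char plus its mnemonic."""
    return "".join(intro + MNEMONICS[c] if c in MNEMONICS else c for c in text)


word = "blåbærgrød"
raw = word.encode("latin-1")

as_base64 = base64.b64encode(raw).decode("ascii")
as_qp = quopri.encodestring(raw).decode("ascii")
with_intro = mnemonic(word)
without_intro = mnemonic(word, intro="")

for label, value in [
    ("1. base64", as_base64),
    ("2. quoted-printable", as_qp),
    ("3. mnemonic with &", with_intro),
    ("4. mnemonic, no intro-char", without_intro),
]:
    print(f"{label:28} {value}")
```

Note that the form without intro-char cannot in general be converted
back to the original, since "aa" may also be two plain letters. That
is the price of the better readability.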

Still, the best thing is to have the real character, and for Danish
I (IMHO) find it very embarrassing to read mail using o/ etc for
{|} (Danish 7-bit letters) - I feel they are damaging Danish culture.
So it is not just the Japanese who care about their characters.

If you have additional problems with RFC-CHAR I'd like to hear what
they are. But issues of scope are not a valid area of concern for the
Working Group, in my opinion.


You say (later) that you are reluctant to pursue two mnemonic
approaches at once. In much the same way, I am reluctant to pursue two
approaches for encoding Japanese at once. Since there is already an
established encoding for Japanese, the Japanese mnemonics should be
removed from RFC-CHAR.


Well, I would be unhappy about removing the specification of Japanese
character sets from RFC-CHAR. This is because I need it to specify
iso-2022-jp, and for that I need a quite precise definition of X0208.
I would be happy to include a statement for iso-2022-jp that
this is the preferred way of encoding Japanese text.
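
As an illustration of why the iso-2022-jp definition rests on a
precise X0208 table, here is a sketch using the iso2022_jp codec
that ships with Python. The sample text is my own choice. The two
bytes sent for each kanji are simply its 0208 row and cell, each
offset by 0x20, between two escape sequences.

```python
ESC = b"\x1b"
TO_JISX0208 = ESC + b"$B"  # switch to the two-byte 0208 set
TO_ASCII = ESC + b"(B"     # switch back to ASCII

text = "Keld 日本語"
wire = text.encode("iso2022_jp")

# Everything on the wire is 7-bit, so it survives bit-stripping mailers.
seven_bit = all(byte < 0x80 for byte in wire)

# Pull out the two-byte codes between the two escape sequences.
start = wire.index(TO_JISX0208) + len(TO_JISX0208)
end = wire.index(TO_ASCII, start)
kanji_bytes = wire[start:end]
row_cells = [
    (kanji_bytes[i] - 0x20, kanji_bytes[i + 1] - 0x20)
    for i in range(0, len(kanji_bytes), 2)
]

print(wire)
print(seven_bit)
print(row_cells)
```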

Also I would be unhappy about not being able to see
some kind of Japanese representation on my own equipment (European
PC equipment).

If you could work out your
differences with what Keld has proposed and come up with a unified
result I think we'd all be a lot happier. (I have found that Keld is
more than willing to listen to suggestions on how to modify RFC-CHAR
to make it a better specification.)


Well, I'm sorry to say that I have not found Keld at all willing to
make changes that I propose.


I think this wording is too negative for our relations. Actually
I cannot remember any technical proposal from you, Erik, that did not
make me change the draft to try to accommodate your viewpoints.
I think you are very skilled and have many constructive comments.
Also I think we have been much in line on this: you have been very
supportive of mnemonics, and I have been constantly advocating
2022 for Japanese etc.

The most important disagreement I see is that you want to remove
the Japanese mnemonics, and I do not like that, as
I need them to specify iso-2022-jp.

I also feel that this group has basically given Keld the go-ahead to
continue the development of RFC-CHAR, with the stated goal that it
will become a standard.


As far as I can tell, this group has not made any such decision. You
yourself were complaining about the lack of comment on RFC-CHAR a
little while ago. Silence does not mean agreement.


If IETF WG meetings mean anything, we have decided to make RFC-CHAR
an RFC a couple of times. I think there are minutes showing this.

I quite frankly
don't like what I see happening here -- I see a possibility that
RFC-CHAR will be abandoned, and I think this is a huge mistake.


I also don't want RFC-CHAR to be abandoned. I think that it might be
possible to reach consensus on the Latin-1 part quite quickly.

Latin-1 would be only a small fraction of the work, and 
unsatisfactory for the scope I have tried to address.

Keld