Re: Problems with RFC-MNEM & RFC-CHAR

1991-07-15 05:17:47
Randall wrote some comments on the RFC-CHAR and RFC-MNEM drafts,
and he claims that they are current. So I will address them accordingly.

1) I think that the current STATUS OF MEMO section in the draft
  for RFC-MNEM is not well phrased and could lead to undesirable
  confusion among the readers of the memo.  I propose the following
  changes to clarify the scope and purpose of the proposed RFC.

  "This supplements the set defined in [1], and in RFC-XXXX [2] where
the format specified in this memo is known as Text/Quoted-Readable."

  "This RFC specifies character set conversions between a User Agent
and its Mail Transfer Agent.  The reference to character sets here is
not intended to encourage the use of unrecommended character sets in
Internet mail.  RFC-XXXX defines which character sets are standard 
in Internet mail, the names used for those character sets in Internet
mail headers, and the definition of the headers and body for Internet
mail messages.  RFC-XXXX also makes the Internet Assigned Numbers
Authority responsible for any additions to the recommended character
sets and their names."

RFC-MNEM currently states:

     this memo specifies ways of doing character set conversions,
     it is strongly advised not to use other character sets than
     ASCII or its proper subset ICS for general Internet use with
     the specifications in this memo.

I would be happy to change "strongly advised not" to "not allowed".

I will add wording stating that the other RFCs specify what is allowed,
as described below.

I have told Randall that the RFC does not primarily specify 
conversions, but a message format for transport.

  "This supplements the set defined in [1], and specifies the format
referred to as Text/Quoted-Readable in RFC-XXXX.  In all cases where
this quoted-readable format is used in Internet mail, messages using
this format must be RFC-XXXX compliant."

I state that RFC-MNEM is "compatible" with RFC-XXXX; is that not enough?
If we are to work further along this line, I would rather
have any wording that is non-compliant with RFC-XXXX pointed out and
corrected, or simply have RFC-MNEM incorporated into RFC-XXXX. The technical
part of the text is less than 2 pages, and it is referenced in RFC-XXXX anyway.
I really do not see any conflict with RFC-XXXX.

Of course I want the reference in RFC-XXXX changed from Quoted-Readable
to Mnemonic, as Quoted-Readable has been confused with
quoted-printable in some instances.

2) RFC-CHAR refers to the ISO/ECMA registered version of ASCII which
  is reportedly not identical with the current definition of ASCII
  (ANSI X3.4).  The Internet standards require ANSI X3.4 rather than
  the ISO/ECMA version.  RFC-CHAR should be modified to use ANSI X3.4
  as the reference and conform to X3.4 in its table of characters.
  To do otherwise creates needless incompatibility with existing 
  conforming implementations.  RFC-XXXX has already settled this
  issue in favor of being 100% backwards compatible with existing
  conforming implementations.

I have asked Randall to identify the difference between the
information that I have and what he thinks it should be.
To my knowledge there is none. If there were, it would
cause major havoc in the character set world. See also John
Klensin's comments.

If there really were a difference between current ASCII and what I have
recorded, I would be happy to change the specification accordingly.

3) The phrase "communication character set" is unclear.
  It would be clearer if changed to read "mail transport character
  set".  This would also help make RFC-MNEM more consistent
  with current UA/MTA terminology.

I had changed that in most instances in the current draft, and
I have now changed the last two instances of the phrase for the
forthcoming draft.

4) The example header refers to "ISO_8859-1" while RFC-XXXX uses 
    "ISO-8859-1".  RFC-MNEM should change to use the format and
   syntax specified in RFC-XXXX.  

As John has pointed out, "ISO-8859-1" is not consistent with
ISO naming. I think we should harmonize the character set names
on some sound basis, as John has also pointed out.
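
To illustrate why the spelling needs to be settled: a receiver that
meets both "ISO_8859-1" and "ISO-8859-1" has to treat them as one
character set, typically through some alias table. A minimal sketch,
using the alias handling of Python's standard codecs module purely as
an illustration. The function name same_charset is hypothetical, and
nothing here is taken from RFC-MNEM, RFC-CHAR or RFC-XXXX.

```python
import codecs


def same_charset(name_a: str, name_b: str) -> bool:
    """Return True if both names resolve to the same codec in Python's registry."""
    try:
        return codecs.lookup(name_a).name == codecs.lookup(name_b).name
    except LookupError:
        # At least one of the names is unknown to this particular registry.
        return False


# Both spellings from the discussion resolve to the same codec here.
print(same_charset("ISO_8859-1", "ISO-8859-1"))
# Two different character sets do not.
print(same_charset("ISO_8859-1", "US-ASCII"))
```

An alias table like this only hides the inconsistency on the receiving
side; it does not replace agreeing on one spelling in the RFCs.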

5) RFC-CHAR should make very explicit that it is not specifying the
  character sets or character set names for use in Internet mail.
  The current wording will cause confusion amongst the readers.

I have added a phrase in the forthcoming RFC-CHAR draft:
RFC-CHAR does not indicate anything
about the validity of using these specifications in any Internet
standard, but you should consult each individual Internet standard
to see which character sets and names are allowed.

    The IETF working group has reached a consensus that the
  proliferation of character sets is undesirable and RFC-XXXX has
  specified a small well-chosen set of character sets to be used
  in standard Internet mail.  

True. I support that. Still, we have all these character sets,
and we need to know how to convert them into the specified
set that RFC-XXXX allows for general use.
I see no conflict here.

RFC-MNEM actually only allows ASCII and its true subset ICS for
mail transport.

    RFC-XXXX also requires that the existing standard "X-something" 
  format be used for those character sets defined by local agreement 
  and that the IANA is the sole authority for adding all other charset 
  names to Internet standard status.

I think this is well defined in RFC-XXXX, and that is the place to
define it. RFC-CHAR does not define which character sets are
allowable in RFC-XXXX. As noted above, there will be a general statement
on that in the new draft.

I would be unhappy about specifying names not allowed by RFC-XXXX
with an "X-", as RFC-CHAR could be very usable in other Internet
standards, e.g. Internet OSI specifications. It would be ugly
if the OSI RFC had to use names like "X-" for something
that would be perfectly valid in that RFC. I would rather advise RFC-XXXX
to advise users of the "X-" convention to use the names from RFC-CHAR
whenever possible, with a prepended "X-".
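
The naming rule I am suggesting could be sketched as follows. This is
only an illustration: the function name_for_mail is hypothetical, and
the set of standard names and the name "SOME-LOCAL-SET" are
placeholders, not the actual lists in RFC-XXXX or RFC-CHAR.

```python
def name_for_mail(rfc_char_name: str, standard_names: set[str]) -> str:
    """Return the name to use in mail for a character set named in RFC-CHAR.

    Names that the mail standard already allows are used unchanged; any
    other RFC-CHAR name is reused with "X-" prepended, so that RFC-CHAR
    itself never has to carry the "X-" prefix.
    """
    if rfc_char_name in standard_names:
        return rfc_char_name
    return "X-" + rfc_char_name


# Placeholder for whatever set RFC-XXXX ends up standardizing.
assumed_standard = {"US-ASCII", "ISO-8859-1"}

print(name_for_mail("ISO-8859-1", assumed_standard))      # used as is
print(name_for_mail("SOME-LOCAL-SET", assumed_standard))  # gets the "X-" prefix
```

The point of the design is that the prefix is added by the mail
specification, not stored in RFC-CHAR, so other RFCs can use the plain
names.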

6) It would be desirable for all references to "ASCII" in RFC-MNEM and
  RFC-CHAR be changed to "US ASCII" so that people outside the US who
  are accustomed to referring to ALL 7-bit character sets as "ascii" in
  common usage do not inadvertently misread the content of the RFCs.

  I think it is important for all RFCs to be clear and unambiguous and
  to actively try to prevent confusion from arriving.  This is one area
  where the IETF has historically done a good job (in contrast to other
  standards groups).  Referring to ISO standards for the sake of not 
  referring to the more correct ANSI standards is counter-productive.

I have told Randall that I would be happy to do that, provided that
US ASCII is written as one token, like "US-ASCII".
But then I also share John's view that ASCII is ASCII and that "US"
is superfluous. In any case, this requires the naming to be resolved,
as I would certainly prefer the name of each character set
to be one token.

7)  It isn't clear to me that "quoted printable" is useful for many
  non-European languages because many glyphs cannot be usefully
  represented using strings of US ASCII.  I'm not sure that this is
  fixable, but I'm concerned that we not end up being Euro-centric.
  We should in fact try to address the non-European concerns 
  (Chinese, etc.) as well.

I share your concerns about quoted-printable. I even think that
in Europe quoted-printable will be a nuisance; reading all these
hex values will not make life easy.  We have a discussion in some
news group on 8-bit *news*, and that conversation is partly carried
out in stripped 8-bit (which is like quoted-printable, but more
compact). I almost gave up reading the last stripped articles,
as they were too ugly to read. Other people made comments to the
same effect. Quoted-printable will be worse.
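
To show what I mean by "all these hex values", here is a small sketch
using the quopri module from Python's standard library. The sample
words and the function name to_quoted_printable are my own choices for
illustration, and the text is assumed to be in ISO 8859-1 before
encoding.

```python
import quopri


def to_quoted_printable(text: str) -> str:
    """Encode text as ISO 8859-1 bytes, then as quoted-printable."""
    raw = text.encode("iso-8859-1")
    return quopri.encodestring(raw).decode("ascii")


# Two words with the letters a-ring (E5), ae (E6) and o-slash (F8),
# written with escapes so this example itself stays in ASCII.
sample = "Bl\xe5b\xe6rgr\xf8d p\xe5"
print(to_quoted_printable(sample))  # prints: Bl=E5b=E6rgr=F8d p=E5
```

Every non-ASCII letter becomes three characters, so a text with many
such letters quickly becomes hard to read for a human.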

You did not mean "quoted-readable", did you?

