Re: The last structural shortcoming of MIME: how to remove it


Olle,

I am very pleased to see your proposal and strongly endorse it.  However,
I'd like to suggest an editorial change, motivated by a discussion I am
currently having about how to identify the character encoding used in
URLs in general.

I am currently attempting to find an acceptable method for specifying the
character encoding employed by the octet escape syntax used with URLs.  It
seems that your proposal may provide an answer:

Given the following syntax:

quoted-string-with-charspec : '"' %-text-with-charspec '"'
%-text-with-charspec        : charspec %-text
charspec                    : charspec-prefix '<' charset '>'
charspec-prefix             : "=?%"
charset                     : as in RFC 1522
%-text                      : %-octet | %-octet %-text
%-octet                     : unescaped-octet | escaped-octet
unescaped-octet             : octet whose value is the code value of
                              any printable ASCII character other than
                              SPACE or %-specials
escaped-octet               : '%' hex-digit hex-digit
%-specials                  : '"' | '{' | '}' | '|' | '\' | '^' | '~' |
                              '[' | ']' | '`' | '#' | '<' | '>' | '%'
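
To make the grammar above concrete, here is a minimal Python sketch of a
parser for quoted-string-with-charspec.  The function names are mine and
purely illustrative, and the charset pattern is a deliberate simplification
of the RFC 1522 token rule (an assumption, not part of the proposal).

```python
import re

# %-specials from the grammar: these may appear only in escaped form.
PCT_SPECIALS = set('"{}|\\^~[]`#<>%')
HEX_DIGITS = set("0123456789abcdefABCDEF")

# charspec : "=?%" '<' charset '>'
# Assumption: charset is approximated by a small safe character set;
# the real RFC 1522 token rule is somewhat broader.
CHARSPEC = re.compile(r"=\?%<([A-Za-z0-9_+-]+)>")


def parse_pct_text_with_charspec(s):
    """Return (charset, octets) for a %-text-with-charspec string."""
    m = CHARSPEC.match(s)
    if m is None:
        raise ValueError("missing or malformed charspec")
    charset, body = m.group(1), s[m.end():]
    if not body:
        raise ValueError("%-text needs at least one octet")
    octets = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "%":
            # escaped-octet : '%' hex-digit hex-digit
            pair = body[i + 1:i + 3]
            if len(pair) != 2 or not set(pair) <= HEX_DIGITS:
                raise ValueError("bad escape at offset %d" % i)
            octets.append(int(pair, 16))
            i += 3
        elif "!" <= ch <= "~" and ch not in PCT_SPECIALS:
            # unescaped-octet : printable ASCII, not SPACE, not a %-special
            octets.append(ord(ch))
            i += 1
        else:
            raise ValueError("character %r must be escaped" % ch)
    return charset, bytes(octets)


def parse_quoted_string_with_charspec(s):
    """Strip the surrounding double quotes, then parse the contents."""
    if len(s) < 2 or s[0] != '"' or s[-1] != '"':
        raise ValueError("value is not a quoted string")
    return parse_pct_text_with_charspec(s[1:-1])
```

For example, parsing "=?%<ISO-8859-1>na%EFve" (including the quotes) yields
the charset name ISO-8859-1 and the octets 6E 61 EF 76 65, which the
receiver can then interpret according to that charset.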

Then RFC 1521 could be updated to read:

value : token | quoted-string-with-charspec

And RFC 1738 could be updated to:

url-with-charspec           : charspec url
url                         : as specified by RFC 1738

Given these changes, url-with-charspec could be used as a parameter value
simply by adding quotes, since url-with-charspec satisfies the lexical syntax
of %-text-with-charspec.
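
As a rough illustration of that point, the following Python sketch checks
that a string is lexically a %-text-with-charspec and, if so, turns it into
a parameter value by doing nothing more than adding quotes.  The function
name and the example URL are hypothetical, the charset pattern is a
simplification of the RFC 1522 rule, and the check does not attempt to
validate RFC 1738 URL syntax itself.

```python
import re

# Lexical shape of %-text-with-charspec:
#   "=?%" '<' charset '>' followed by one or more %-octets.
# The character class lists printable ASCII minus SPACE and the %-specials.
PCT_TEXT_WITH_CHARSPEC = re.compile(
    r"=\?%<[A-Za-z0-9_+-]+>"                  # charspec (simplified charset)
    r"(?:%[0-9A-Fa-f]{2}|[!$&-;=?-Z_a-z])+"   # escaped or unescaped octets
)


def url_as_parameter_value(url_with_charspec):
    """Quote a url-with-charspec for use as a parameter value."""
    if PCT_TEXT_WITH_CHARSPEC.fullmatch(url_with_charspec) is None:
        raise ValueError("not a valid %-text-with-charspec")
    # No further escaping is applied: the quotes are the only addition.
    return '"' + url_with_charspec + '"'


# Hypothetical example value.
print(url_as_parameter_value("=?%<ISO-8859-1>http://example.com/na%EFve"))
```

The design choice worth noting is that the function never re-encodes its
input; the value inside the quotes is byte-for-byte the url-with-charspec.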

The change to your proposal would amount to adopting a more specialized
syntax for what you are calling "%-encoding applied to quoted-string".
Doing so would permit string values that are already expressible as a
charset-specified %-encoded string to be used as is, without being
subjected to a recursive application of the charset and %-encoding rules.

Regards,
Glenn Adams