
Re: MIME's "Content-Disposition" Header

1995-01-18 09:31:39
paf(_at_)bunyip(_dot_)com (Patrik Faltstrom) writes:

As we already have support for 1522 en-/decoding in the software, what
is the problem with having 1522 encoding of this field also? Isn't that
the easiest thing to implement?

As Keith Moore, the author of RFC 1522, has stated himself, in 

: Encoded-words were designed to solve a problem with a very narrow
: solution space -- encoding of the human readable text portions of rfc
: 822 message headers.  They are okay for that purpose, but I hate to
: see them crop up everywhere, for the following reasons:
: + They were also designed only for "free-form" text, not to
:   transparently convey arbitrary data.  For instance, there aren't good
:   rules for when an encoded value begins and ends.
: + There are complicated rules that say when certain characters can
:   appear unencoded in certain contexts (in a phrase, in *text, or in a
:   comment).
: + The RFC on encoded-words is written in terms of how encoded-words
:   are to be *displayed*, and not as a mapping between an unencoded
:   string of octets and an encoded one.  So there is inherent ambiguity,
:   for example, in how to treat white space between adjacent encoded-words
:   when they appear in a parameter.
: These rules have a large potential for being misunderstood or
: mis-implemented.  If this only affects how message headers are
: displayed, that's not a big deal [...]
: So I would prefer that the use of encoded-words be confined to
: free-form text fields.

As an illustration of the complexities of the RFC 1522 encoding,
consider these examples of equivalent strings in an RFC 822
parenthesized comment in a message header:

String 1                              String 2 Comment
--------                              -------- -------
(=?US-ASCII?Q?a?= b)                  (a b)    Significant SP must follow
                                               before ctext
(=?US-ASCII?Q?a?= =?US-ASCII?Q?b?=)   (ab)     SP between encoded-words
(=?US-ASCII?Q?a?=  =?US-ASCII?Q?b?=)  (a  b)   multiple SP's don't collapse
                                               to one SP in comments; first
                                               encoded-word followed by a
                                               ctext SP; next SP is ctext
                                               before second encoded-word
(=?US-ASCII?Q?a_b?=)                  (a b)    only way to encode a single
                                               SP between two encoded words
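
The rules in the table can be sketched in a few lines of Python. This
is only an illustration of the reading given above, not a complete
RFC 1522 decoder: it handles the Q encoding only, and the names
decode_q and decode_comment are hypothetical. It assumes that exactly
one SP between two encoded-words is a separator and is dropped, while
any other text between them is kept as ctext.

```python
import re

# One encoded-word: =?charset?Q?payload?=  (B encoding is left out of this sketch)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[Qq]\?([^?\s]*)\?=")

def decode_q(payload, charset):
    """Decode a Q-encoded payload: "_" means SP, "=XX" is a hex octet."""
    data = payload.replace("_", " ").encode("ascii")
    data = re.sub(rb"=([0-9A-Fa-f]{2})",
                  lambda m: bytes([int(m.group(1), 16)]), data)
    return data.decode(charset)

def decode_comment(text):
    """Decode the encoded-words in a comment, following the table's rules."""
    out = []
    pos = 0
    prev_was_encoded = False
    for m in ENCODED_WORD.finditer(text):
        gap = text[pos:m.start()]
        # Assumption: a lone SP between two encoded-words only separates
        # them and is dropped; anything else is kept unchanged.
        if not (prev_was_encoded and gap == " "):
            out.append(gap)
        out.append(decode_q(m.group(2), m.group(1)))
        pos = m.end()
        prev_was_encoded = True
    out.append(text[pos:])
    return "".join(out)

for s in ["(=?US-ASCII?Q?a?= b)",
          "(=?US-ASCII?Q?a?= =?US-ASCII?Q?b?=)",
          "(=?US-ASCII?Q?a?=  =?US-ASCII?Q?b?=)",
          "(=?US-ASCII?Q?a_b?=)"]:
    print(repr(s), "->", repr(decode_comment(s)))
```

Even this small sketch needs a special case for the gap between
adjacent encoded-words, which is the kind of rule that is easy to get
wrong in a structured field such as a parameter value.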

If RFC 1522 encoding is to be used in parameter values, a new
"profile" of the general encoding must be defined for this new
application. The specification must also be reworked and made
more precise; there is no full EBNF grammar in RFC 1522, for
instance.

Steve Dorner has stressed the urgency of finalizing the
Content-Disposition: document. I think it is better to use the
short time available for specifying decent support for
"un-American" character sets in the Content-Disposition: header
by designing a simpler and cleaner variant of Quoted-Printable
and Base64 to be used in parameter values, that is not afflicted
with all the intricacies of RFC 1522 encoding. (Stay tuned for
my suggestion for such an encoding.)

I also must say that I don't like the "portable filename" thing. My view
is that it's the receiving mailer which should filter the filename and
create a new one which suits the operating system it runs on.

I agree that the receiver's UA should always filter suggested
filenames according to local rules. I wouldn't expect good
results from automatic filename normalization functions in real
life, though.

Suppose the incoming message contains three body parts with the
filename parameters

An MS-DOS UA will probably transform these to

At least this is what MS-Kermit does in a similar situation.

These MS-DOS filenames are not very descriptive and, worse,
which filename corresponds to which of the body parts is wholly
dependent on the _order_ in which the body parts are saved.
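
The order dependence can be shown with a small Python sketch. The
normalization rules here are an assumption made for illustration
(uppercase, drop characters outside A-Z, 0-9, "_" and "-", truncate to
8.3, and number any collisions); they are not MS-Kermit's actual
algorithm. The function names and the three filenames are hypothetical
as well.

```python
import re

def to_dos_name(name, taken):
    """Map one filename to an unused 8.3-style name (assumed rules)."""
    base, dot, ext = name.rpartition(".")
    if not dot:                      # no "." at all: the whole name is the base
        base, ext = ext, ""

    def clean(part):
        return re.sub(r"[^A-Z0-9_-]", "", part.upper())

    base = clean(base)[:8] or "FILE"
    ext = clean(ext)[:3]
    tail = "." + ext if ext else ""
    candidate, n = base + tail, 1
    while candidate in taken:
        # On a collision, overwrite the end of the base with a counter.
        digits = str(n)
        candidate = base[:8 - len(digits)] + digits + tail
        n += 1
    taken.add(candidate)
    return candidate

def normalize_all(names):
    """Normalize the names in the order given, as if saving them one by one."""
    taken = set()
    return {name: to_dos_name(name, taken) for name in names}

# Hypothetical filenames, for illustration only.
names = ["Annual report 1994 (draft).txt",
         "Annual report 1994 (final).txt",
         "Annual report 1993.txt"]

print(normalize_all(names))
print(normalize_all(list(reversed(names))))
```

Saving the same three parts in reverse order gives each of them a
different MS-DOS name, so the result says little about the contents.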

A _careful_ sender could provide much better portable filenames,
such as

The Portable-Filename parameter would of course not be
mandatory. I'm also convinced that most users will never bother
about constructing portable filenames for their attachments; the
local filename in their own system will most likely be used in
the message too. But for conscientious senders, composing
important messages that they expect will be read by many
persons they don't know, using varying computer types and
operating systems, a Portable-Filename parameter would be
useful. I don't see how its inclusion in the
Content-Disposition: specification could hurt anyone.

I don't want to say this, but personally I think that if you cannot
have national characters in the headers, a la 1522 or whatever, that
software/functionality of MIME will not be accepted in Sweden...

I can substantiate this claim. Swedish can't be used freely
on the Subject: line if RFC 1522 isn't implemented. It's like
writing an English Subject: header if the letter "i" weren't
allowed and you had to choose between using "y" instead, or
perhaps "ye", or the "{" character (because of some obsolete
7-bit national character set for the English language, no longer
supported by modern software). This message would have a header
such as

Subject: Re: MYME's "Content-Dysposytyon" Header


Subject: Re: MYEME's "Content-Dyesposyetyeon" Header


Subject: Re: M[ME's "Content-D{spos{t{on" Header

Analogous problems and bad fixes exist in Swedish email today,
except that _three_ Swedish vowels are illegal in non-RFC 1522
headers, not just one.
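
For comparison, here is a minimal Python sketch of what encoded-words
do for a Subject: line containing these three vowels. It relies on the
standard email.header module, which implements RFC 2047, the successor
of RFC 1522; the helper names, the ISO-8859-1 charset, and the sample
subject are assumptions chosen for illustration.

```python
from email.header import Header, decode_header, make_header

def encode_subject(text, charset="iso-8859-1"):
    """Return a 7-bit-safe Subject: value made of encoded-words."""
    return Header(text, charset).encode()

def decode_subject(raw):
    """Turn the encoded-words in a Subject: value back into readable text."""
    return str(make_header(decode_header(raw)))

# "Re: R(a-umlaut)ksm(o-umlaut)rg(a-ring)s", written with escapes so
# that this example itself stays in plain ASCII.
subject = "Re: R\u00e4ksm\u00f6rg\u00e5s"

wire = encode_subject(subject)
print(wire)                    # ASCII only, safe in an RFC 822 header
print(decode_subject(wire))    # the original text again
```

The encoded form travels as plain ASCII, and a receiving UA that
implements the decoding shows the subject with the correct letters.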

For an overview of the more than 70 different ways these three
letters of the Swedish alphabet may be represented or
misrepresented in email and Internet news message bodies in
Sweden today, look at


Olle Jarnefors, Royal Institute of Technology, Stockholm