Re: MIME's "Content-Disposition" Header

I have the following comments on the substance of the
Content-Disposition Internet Draft.

Internet Draft: draft-dorner-content-header-00.txt
                                                  Rens Troost
                                                  Steve Dorner
                                                  August 1994

            Communicating Presentation Information in
                       Internet Messages:
                 The Content-Disposition Header

3.  The Content-Disposition Header Field

    Content-Disposition is an optional header; in its absence,
    presentation should default to `inline'.

    It is desirable to keep the set of possible disposition types
    small and well defined, to avoid needless complexity. Even
    so, evolving usage will likely require the definition of
    additional disposition types or parameters, so the set of
    disposition values is extensible; see below.

Disposition types defined in the future may be orthogonal to the
already defined types, although mutually exclusive among
themselves.  Therefore I think we should explicitly allow
multiple Content-Disposition: headers. If their disposition
types are incompatible, it may be prescribed that the first one
applies.  This would make it unnecessary to further increase the
complexity of the syntax of the Content-Disposition: header.


I agree with this 100%.

3.2  The Attachment Disposition Type

    Bodyparts can be designated `attachment' to indicate that
    they are separate from the main body of the mail message, and
    that their display should not be automatic, but contingent
    upon some further action of the user. The MUA might instead
    present the user of a bitmap terminal with an iconic
    representation of the attachments, or, on character
    terminals, with a list of attachments from which the user
    could select for viewing or storage.

Of the two implementations outlined here the one for the more
primitive equipment, character terminal, sounds the more useful
to me! Instead of a bunch of icons on the bitmap terminal --
probably with only the word GIF in them, in the best case with a
dot pattern perhaps hinting at the main features of the image --
I can on the character terminal expect a list of the attachments
offered for viewing, probably showing the first line of the
Content-Description header, possibly also the suggested filename
and the approximate file size. In the character terminal case I
can thus expect to get more information on which to base a
decision to view or save an attachment or leave it.

Perhaps this text need not be specific about these human factors
aspects of UA implementation?


I actually appreciate the existence of concrete examples here.

3.3  The Filename Parameter

    The sender may want to suggest a filename to be used if the
    entity is detached and stored in a separate file.

I suggest we here include words to the effect that the filename
specified must not be regarded as containing an initial part
specifying an absolute or relative path through a hierarchical
file system. If this is required, the following security
concerns may be dropped:

          o+ Creating or overwriting system files (e.g.,
            "/etc/passwd").

          o+ Placing executable files into any command search path
            (e.g., "~/bin/more").

          o+ Sending the file to a pipe (e.g., "| sh").


Just because we require this interpretation doesn't excuse us from
discussing the security issues that arise when the correct interpretation
isn't followed. However, I think it would be fine to include this
requirement, and cite these issues as reasons for having it.
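
A minimal sketch of the interpretation discussed here, assuming the
receiving MUA reduces the suggested filename to a bare name before
offering it to the user. The function name and the fallback name
"attachment" are hypothetical and do not come from the draft.

```python
import re

def sanitize_suggested_filename(suggested: str) -> str:
    """Reduce a sender-suggested filename to a bare name with no path part."""
    # Treat both "/" and "\" as separators and keep only the last component,
    # so "/etc/passwd" and "~/bin/more" lose their directory prefix.
    name = re.split(r"[/\\]", suggested)[-1]
    # Remove control characters (including backspace), "|" and "~", then
    # surrounding blanks, so "| sh" can no longer be read as a pipe.
    name = re.sub(r"[\x00-\x1f\x7f|~]", "", name).strip()
    # Refuse names that are empty or that only navigate the hierarchy.
    if name in ("", ".", ".."):
        return "attachment"
    return name
```

The result is still only a suggestion; the user should be able to
change it before anything is written.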

By the way, are backspaces (^H), as used in the Security
Considerations section, really allowed in the plain text form of
Internet Drafts and RFCs?


Internet Drafts can be practically anything. RFCs are much stricter, and
my guess is that this won't be allowed. Only the RFC editor knows for
sure, of course.

Back to section 3.3.

                                                      If the
    receiving MUA writes the entity to a file, the suggested
    filename should be used where possible.

Considering what's said later about the dangers of certain
filenames and the role of the receiver in choosing a filename,
this should be reworded to something like:

+     If the receiving MUA writes the entity to a file, the
+     filename specified in the Filename parameter can be
+     offered to the user as a suggested filename.

    It is important that the receiving MUA not simply blindly use
    the suggested filename.  The suggested filename should be
    checked (and possibly changed) to see that it conforms to
    local filesystem conventions and that it does not present a

                                  ^
                                  !
Add here: ", that it is not already used by an existing file"


I agree with this rewording. The risks associated with this have to be clearly
stated.

    security problem (see Security Considerations below).
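
The "not already used by an existing file" check could look roughly
like this. It is a sketch only: the helper name and the "-1", "-2"
numbering scheme are assumptions, not something the draft specifies.

```python
import os

def unused_name(directory: str, name: str) -> str:
    """Return `name`, or a numbered variant of it, that is not yet in use."""
    stem, ext = os.path.splitext(name)
    candidate = name
    counter = 1
    # Keep trying "stem-1.ext", "stem-2.ext", ... until nothing is in the way.
    while os.path.exists(os.path.join(directory, candidate)):
        candidate = f"{stem}-{counter}{ext}"
        counter += 1
    return candidate
```

Another process could create the file between the check and the write,
so an MUA would probably also want to open the file in exclusive-create
mode (mode "x" in Python) rather than rely on the check alone.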

In my opinion it would also be useful to include at this point
in the draft some advice for the _sending_ UA regarding the
portability of filenames. Something along these lines:

+     In email communication it is often the case that the
+     sender has little or no knowledge about the
+     capabilities of the receiver's system or even which
+     operating system the receiver uses. In many cases the
+     sender should therefore choose a _portable_ filename.
+     The UA can help the user by pointing out portability risks
+     in a chosen filename.
+
+     Two levels of filename portability are relevant in
+     an international context:
+
+     1) "Conservative" portability requirements: The filename
+        consists of 1-8 characters, possibly followed by "."
+        and 1-3 characters. The first character of both of the
+        filename parts is a letter. The characters are chosen
+        from this subset of US-ASCII:
+
+        ABCDEFGHIJKLMNOPQRSTUVWXYZ
+        0123456789
+
+     Filenames satisfying these requirements can in general be
+     used without problems in the operating systems MS-DOS,
+     Apple Macintosh, Unix, VMS, VM, MVS, OS/2, Microsoft
+     Windows NT. Warning: In the MS-DOS operating system a few
+     initial filename parts are unusable, since they are
+     reserved for devices. Among them are AUX, COM1, COM2,
+     COM3, COM4, CON, LPT1, LPT2, LPT3, NUL, PRN.
+
+     2) "Optimistic" portability requirements: The filename
+        consists of 1-31 characters. The characters are chosen
+        from this subset of US-ASCII:
+
+        ABCDEFGHIJKLMNOPQRSTUVWXYZ
+        abcdefghijklmnopqrstuvwxyz
+        0123456789
+        !#$%&+-.=@[]^_`{}~
+
+        The filenames are regarded as case-insensitive, though,
+        if filenames are provided for several different body parts.
+
+     Of the US-ASCII characters, control characters, SP and
+     these are excluded:
+
+        "'()*,/:;<>?\|
+
+     Filenames satisfying these requirements can in general be
+     used without problems in modern operating systems allowing
+     "long" filenames, such as Apple Macintosh, modern variants
+     of Unix, OS/2, Microsoft Windows NT.


I think this is an excellent idea.
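
As an illustration of how a sending UA might point out portability
risks, here is a sketch of a checker for the two proposed levels. It
assumes that "." and "@" belong to the "optimistic" character set (they
are not in the excluded list), and the function and constant names are
hypothetical.

```python
import re

# Device names the proposal lists as unusable under MS-DOS.
DOS_RESERVED = {"AUX", "COM1", "COM2", "COM3", "COM4", "CON",
                "LPT1", "LPT2", "LPT3", "NUL", "PRN"}

# Level 1: 1-8 characters, optionally "." and 1-3 more, each part starting
# with a letter, upper-case letters and digits only.
CONSERVATIVE = re.compile(r"[A-Z][A-Z0-9]{0,7}(\.[A-Z][A-Z0-9]{0,2})?")

# Level 2: 1-31 characters from the proposed "optimistic" subset.
OPTIMISTIC = re.compile(r"[A-Za-z0-9!#$%&+\-.=@\[\]^_`{}~]{1,31}")

def portability_level(name: str) -> str:
    """Classify a filename as 'conservative', 'optimistic' or 'not portable'."""
    if CONSERVATIVE.fullmatch(name) and name.split(".")[0] not in DOS_RESERVED:
        return "conservative"
    if OPTIMISTIC.fullmatch(name):
        return "optimistic"
    return "not portable"
```

A name such as CON.TXT fits the level 1 syntax but is demoted to
"optimistic" here because of the MS-DOS device name warning. The sketch
does not cover the case-insensitivity rule for several body parts.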

Considering that the conservative approach #1 makes
"self-explaining" filenames impossible in most cases (mostly
because of the length restriction) I propose that a new
parameter, Portable-Filename, is introduced. Its values would
have a restricted syntax. It could be used together with the
Filename parameter like in this example:

Content-Type: image/gif
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
 filename=Futhark-A-fonttable-300.gif;
 portable-filename=FUTHARKA.GIF
Content-Description: A table showing all glyphs of the main
 runic font Futhark A, in 300 dpi resolution.


Interesting idea -- I think it's fine unless it proves to be controversial.
If it does we can always define it in a separate document.
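
For what it's worth, a receiver could pick up both names with an
ordinary parameter parser. The sketch below uses Python's standard
email package on the example headers above; Portable-Filename is only
the parameter proposed in this thread, not a registered one, and the
function name is made up for illustration.

```python
from email import message_from_string

RAW = (
    "Content-Type: image/gif\n"
    "Content-Transfer-Encoding: base64\n"
    "Content-Disposition: attachment;\n"
    " filename=Futhark-A-fonttable-300.gif;\n"
    " portable-filename=FUTHARKA.GIF\n"
    "\n"
)

def suggested_names(raw_entity: str):
    """Return (filename, portable-filename); either may be None if absent."""
    part = message_from_string(raw_entity)
    long_name = part.get_param("filename", header="content-disposition")
    short_name = part.get_param("portable-filename",
                                header="content-disposition")
    return long_name, short_name
```

A receiver on a system with short filenames would presumably offer the
second value and fall back to the first when it is missing.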

    The value of the filename parameter must be in US-ASCII.
    However, it is possible to use arbitrary characters in the
    filename by using the "quoted-phrase" construct and
    [RFC 1522] encoding.  There is an ambiguity between
    quoted-string and quoted-phrase.  It should be resolved in
    favor of the quoted-phrase when possible; a filename fitting
    the syntax of a series of encoded-words and atoms should be
    treated as such.

I would prefer a new encoding method, something similar to what
Keith Moore has proposed recently, in
<199501132117.QAA17022@wilma.cs.utk.edu>.


Agreed.

It's important that the charset of the filename itself can be
indicated, but in view of possible future parameters with purely
binary values the charset indication should be optional. I
suppose this isn't possible if an unmodified RFC 1522 encoding
is used; spurious charset values would have to be used.


I hate this, but you're right -- the character set does have to be
included.

3.4  Future Extensions and Unrecognized Disposition Types

    In the likely event that new parameters or types are needed,
    they should be registered with the IANA, in the manner
    specified in [RFC 1521], appendix E.

An adapted version of registration form E.2 should be included
in this draft.


Registration procedures in technical specifications have proved to be
problematic. I'm working on a single RFC that defines the registration
procedures for MIME-related things now, and I can include the procedure for
content-disposition in it if the group approves.

The logic here is to have a nontechnical document that describes the
procedures that can be updated in the event of procedural changes without
messing with the protocol definition documents.

An important aspect of saving a body part to a file, that is not
covered by the present draft, is which sections of the body part
should be saved: the body of the body part, the headers of the
body part, or both. It might also be appropriate to save both
sections, but in different files.


Yuck! I have always assumed that the headers are not saved by default, and
let the user override if need be, but I suppose a parameter to control
the default is not unreasonable...

One approach to this problem is to indicate a "reasonable
behaviour" for the different content-types defined in
draft-ietf-822ext-mime-imb-01.txt. I think this can be said:

   Text/Plain: Save body. Reverse transport-encoding and convert to
   the local character set (if needed and possible).

   Image/GIF: Save body. Reverse transport-encoding. Choose a
   filename ending in ".gif".

   Image/JPEG: Save body. Reverse transport-encoding. Choose a
   filename ending in ".jpeg".

   Audio/Basic: Save body. Reverse transport-encoding. Choose a
   filename ending in ".au" (or whatever is dominant in current
   practice).

   Video/MPEG: Save body. Reverse transport-encoding. Choose a
   filename ending in ".mpeg".

   Application/Octet-Stream: Save body. Reverse transport-encoding.

   Application/PostScript: Save body. Reverse transport-encoding.
   Choose a filename ending in ".ps".

   Multipart/Mixed and Multipart/Digest: Ask the user if the body
   parts should be saved in individual files. If not: Save both
   headers and body. Choose a filename ending in ".msg".

   Multipart/Alternative: Find out if the user wants only the best
   alternative to be saved. If not: Save both headers and body.
   Choose a filename ending in ".msg".

   Multipart/Parallel: ?

Wasn't this content-type to be scrapped? If not, what are the
semantics of Content-Disposition: attachment with this
content-type?

   
Not as far as I know. There was talk about defining some more sophisticated
subtypes, but there was no intention of scrapping parallel that I recall.

   Message/RFC822: Save body (which, however, itself consists of
   both headers and body). Choose a filename ending in ".msg".

   Message/Partial: Try to find all the message fragments and
   reassemble them. Save the reassembled headers and body. Choose a
   filename ending in ".msg".

   Message/External-Body: Find out if the user wants the body data
   to be fetched now or later. In the first case, do that and save
   the reconstructed body (which, however, itself consists of
   both headers and body). Choose a filename ending in ".msg".


I think the idea of defining defaults for types registered in the base
MIME specification is a good one, but it probably should be done in a
separate (little) document. As for new types, this needs to be made part
of the registration procedure.

I have here proposed the filename extension ".msg" in all
cases where the resulting file has the format of an RFC 822
message. If the headers and the body are saved in two separate
files, I would recommend that the name of the headers file is
formed by adding ".M" (for "meta information") to the name of
the body file. (This presupposes a file system that allows long
filenames.)
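
The proposed defaults can be written down as a small lookup table. This
is only a sketch of the suggestions in this message: the names are
hypothetical, the multipart types are left out because they depend on
asking the user, and ".au" is the tentative choice mentioned above.

```python
# Suggested extension for the body file, per content type.
# None means the proposal names no extension.
DEFAULT_EXTENSION = {
    "text/plain": None,
    "image/gif": ".gif",
    "image/jpeg": ".jpeg",
    "audio/basic": ".au",
    "video/mpeg": ".mpeg",
    "application/octet-stream": None,
    "application/postscript": ".ps",
    "message/rfc822": ".msg",
    "message/partial": ".msg",
    "message/external-body": ".msg",
}

def body_filename(stem: str, content_type: str) -> str:
    """Append the suggested extension for `content_type`, if there is one."""
    ext = DEFAULT_EXTENSION.get(content_type.lower())
    return stem + ext if ext else stem

def headers_filename(body_name: str) -> str:
    """Name of the separate headers file: the body file name plus '.M'."""
    return body_name + ".M"
```

For example, body_filename("runes", "Image/GIF") gives "runes.gif", and
headers_filename("runes.gif") gives "runes.gif.M".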

These pragmatic guidelines should not always be followed, of
course. For example, the headers of an Image/GIF body part may
contain useful information that is not captured by the filename,
such as a Content-Description: or a Content-ID: header. A
Text/Plain body part may include useful Content-Language:
information.


Not to mention what needs to happen on systems that don't use filenames
to provide type information...

The sender may be able to specify the best behavior. I would
therefore like to propose yet another new parameter,
File-Coverage, taking one of the values: entity, headers, body.
It should apply to the nearest preceding Filename parameter and
Portable-Filename parameter. This means that filenames for both
a headers file and a body file can be specified by multiple
Filename and Portable-Filename parameters. The order of the
parameters in a Content-Disposition: header will then be
significant.
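
To make the order dependence concrete, here is a sketch of how a
receiver might group the parameters. File-Coverage and
Portable-Filename are the parameters proposed in this thread; the
function name is hypothetical, and what happens to filenames with no
following File-Coverage is an assumption (they are reported with
coverage None).

```python
from email.message import Message

def files_by_coverage(disposition_value: str) -> list:
    """Pair each File-Coverage value with the filename parameters before it."""
    part = Message()
    part["Content-Disposition"] = disposition_value
    result = []
    pending = {}
    # get_params() keeps header order; the first item is the disposition type.
    for name, value in part.get_params(header="content-disposition")[1:]:
        if name in ("filename", "portable-filename"):
            pending[name] = value
        elif name == "file-coverage":
            # Applies to the nearest preceding filename parameters.
            result.append((value.lower(), pending))
            pending = {}
    if pending:
        # Filenames with no File-Coverage after them: coverage unspecified.
        result.append((None, pending))
    return result
```

Given "attachment; filename=runes.gif; file-coverage=body;
filename=runes.gif.M; file-coverage=headers", this yields one entry for
the body file and one for the headers file, in that order.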


Hmm. This is getting complicated. I think this is best deferred to another
specification.

                                Ned