ietf-xml-mime

Re: Registration of media type image/svg+xml

2010-11-18 15:52:50


On 18.11.2010 21:01, Chris Lilley wrote:
> On Thursday, November 18, 2010, 8:19:20 PM, Julian wrote:
>
> JR>  On 18.11.2010 19:02, Chris Lilley wrote:
>>> ...
>>> Security considerations:
>>> ...
>>>       SVG documents may be transmitted in compressed form using gzip
>>>       compression. For systems which employ MIME-like mechanisms, such
>>>       as HTTP, this is indicated by the Content-Encoding or
>>>       Transfer-Encoding header, as appropriate; for systems which do
>>>       not, such as direct filesystem access, this is indicated by the
>>>       filename extension and by the Macintosh File Type Codes. In
>>>       addition, gzip compressed content is readily recognised by the
>>>       initial byte sequence as described in [RFC1952] section 2.3.1.
>>> ...
>
> JR>  1) What does this have to do with "Security Considerations"?
>
> Please read BCP 13, RFC 4288 section 4.6 "Security requirements" where you
> will find
>
>        A media type that employs compression may provide an opportunity
>        for sending a small amount of data that, when received and
>        evaluated, expands enormously to consume all of the recipient's
>        resources.  All media types SHOULD state whether or not they
>        employ compression, and if they do they should discuss
>        what steps need to be taken to avoid such attacks.

Read the section again. It is clearly talking about media types that employ
compression *internally*, not compression done at other layers.

Any media type can be, and often is, compressed at other layers. Discussion
of such actions has no business being in any particular media type
registration.

Agreed.

But then it would need to be clearly stated, that, you know, the content
can be gzipped and still be image/svg+xml.

Can it?

This is indeed the question: If I have a static object of type image/svg+xml,
is it inside a gzip container or not? If the answer to this question is
yes, then:

(a) The security consideration section is appropriate, but needs to be
   clarified to state specifically that the media type employs compression
   internally, and

(b) The +xml on the type name MUST be removed, because the type cannot simply
   be processed directly as XML, which is what +xml means.

The same actions apply if the answer is "sometimes".
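
To make point (b) concrete, here is a minimal Python sketch. It is not part of
the registration; the sample document and the helper name parses_as_xml are
made up for illustration. It assumes a generic +xml processor simply hands the
labelled octets to an XML parser, and shows that gzipped octets are rejected
until the compression has been undone.

```python
import gzip
import xml.etree.ElementTree as ET

# Hypothetical one-element SVG document used only for this illustration.
SVG = b'<svg xmlns="http://www.w3.org/2000/svg" width="1" height="1"/>'


def parses_as_xml(octets: bytes) -> bool:
    """Return True if the octets are well-formed XML as they stand."""
    try:
        ET.fromstring(octets)
        return True
    except ET.ParseError:
        return False


compressed = gzip.compress(SVG)

print(parses_as_xml(SVG))                          # the plain SVG octets
print(parses_as_xml(compressed))                   # the gzipped octets
print(parses_as_xml(gzip.decompress(compressed)))  # after undoing gzip
```

Only the first and third inputs are something a generic XML processor can
consume directly, which is the property the +xml suffix promises.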

If, however, the answer is never - and I'm pretty sure it is - then all mention
of compression needs to be dropped from this registration, as it is doing
nothing useful and is just making things unclear. At most you might want a note
about it in the encoding considerations section saying external compression is
often used with this type. Again, lots of media types are compressed at other
layers; this has nothing to do with the image/svg+xml media type specifically.

Because otherwise if you're talking about compression on the transport
layer, this doesn't need to be stated here. It confuses layers.

Exactly.

> JR>  2) I find the whole paragraph misleading; I'd like to see a clear
> JR>  statement about whether the stream of octets resulting from gzipping SVG
> JR>  can be labeled as "image/svg+xml" or not
>
> Not by itself, no. In a MIME context, it must be labelled as Content-type:
> image/svg+xml **AND** Transfer-Encoding: gzip. Please note the AND.

Sorry, you cannot make that a requirement of a media type. At most you can
suggest that a compressed encoding may be helpful. But general purpose media
types like this travel over all sorts of different transports all the time -
including ones that lack support for particular compression mechanisms - so the
notion that they're constrained to a particular transport is simply a fantasy.

So why do we have the paragraph above in the first place?

Exactly.

*Any* media type can be used with Content-Encoding: gzip over HTTP.

> This is not the same thing as Content-type: application/octet-stream and
> Transfer-Encoding: gzip - because that conveys the encoding, but omits the
> content type.

Nobody said that.

> In other words the encoding label ADDS TO the media type; it does not
> remove the type.

"The Content-Encoding entity-header field is used as a modifier to the
media-type. When present, its value indicates what additional content
codings have been applied to the entity-body, and thus what decoding
mechanisms must be applied in order to obtain the media-type referenced
by the Content-Type header field. Content-Encoding is primarily used to
allow a document to be compressed without losing the identity of its
underlying media type." --
<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.14.11>

So once you apply the Content-Encoding you have to undo it to get back
the payload specified by the Content-Type. It's orthogonal. It doesn't
make the payload an instance of the media type *until* you undo the
encoding.

Correct.
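
The orthogonality described above can be sketched in Python. This is only an
illustration under stated assumptions: the function name decode_entity is
hypothetical, headers are modelled as a plain dict with exact-case keys, and
only the gzip, x-gzip and identity codings are handled. The point is that the
label says what to undo, and the result of undoing it is what Content-Type
describes.

```python
import gzip


def decode_entity(headers: dict, body: bytes) -> bytes:
    """Undo the listed content codings and return the labelled payload.

    Codings are listed in the order they were applied, so they are
    removed in reverse order.
    """
    raw = headers.get("Content-Encoding", "")
    codings = [c.strip().lower() for c in raw.split(",") if c.strip()]
    for coding in reversed(codings):
        if coding in ("gzip", "x-gzip"):
            body = gzip.decompress(body)
        elif coding == "identity":
            continue  # no transformation was applied
        else:
            raise ValueError("unsupported content coding: " + coding)
    return body


# Hypothetical sample document and headers.
svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
headers = {"Content-Type": "image/svg+xml", "Content-Encoding": "gzip"}

payload = decode_entity(headers, gzip.compress(svg))
print(payload == svg)
```

Nothing in the function depends on the value of Content-Type, which is why
the same statement holds for any media type.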

> Indeed, this is why separate labelling of encoding was added. Back in the
> early days people would use gzipped VRML or gzipped PostScript, and attempted
> to register application/gzip; but since they were using the Media Type to hold
> the encoding information they had lost important information, so VRML viewers
> were sent PostScript and so on.  Some people said this was okay, unzip and then
> look at the filename extension. But a much better way was to add the encoding
> headers.
>
> JR>  (please consider transports
> JR>  other than HTTP, such as a file system that actually supports typing by
> JR>  Internet media types).
>
> Please feel free to file a bug report for the BeOS filesystem saying that
> it should support labelling of encodings in addition to media types.

How a particular OS elects to label files, and the restrictions it imposes
through its choice of labelling, are 100% irrelevant to the matter at hand.

> Speaking as a former BeOS user myself, I still consider modern SVG
> implementations (of which there are many) to be a rather more numerous and
> relevant consideration than a promising, but obsolete and abandoned, operating
> system from 15 years ago.

Frankly, it would not matter if you were talking about all versions of Windows
here. Many if not most operating systems have botched media type handling in
some way or other; the solution to this problem isn't to break existing media
type semantics.

I really honestly (!) have no idea what you're referring to.

For the media type registration what's relevant is what kind of octet
sequences you can label with the type you register.

So, I hear you saying: "it can be gzipped when used in a MIME context if
and only if you label it with 'content-encoding: gzip'".

That's true, and nobody disagrees with it. It's true for *any* media
type. It doesn't require any additional statements.

Quite right.

> JR>  If yes, that's a violation of "+xml" (and the last sentence points into
> JR>  this direction). If not, please remove the paragraph above.
>
> JR>  3) If the intent is to say that "svgz" acts as file extension for
> JR>  gzipped SVG, and *that* content can be served over HTTP as-is with
>
> JR>          Content-Type: image/svg+xml
> JR>          Content-Encoding: gzip
>
> That is exactly what it says, yes
>
> JR>  then this is obviously ok
>
> I'm glad it's obviously OK.

But the way it's stated is totally misleading.

Please keep in mind that I only joined this discussion after other
people complained (I stumbled into it during a conversation at the IETF
meeting in Maastricht).

> JR>  because it follows from RFC 2616, and has
> JR>  *nothing* to do with the media type (except for the extension
> JR>  recommendation).
>
> So you oppose reminding people how to detect such gzipped content?
>
> Why would you want to do that?

Because it makes it sound like detecting gzipped content by inspecting
the header is an acceptable way to handle this media type.

That's exactly what it does, and that along with the other confusion really
is not OK.

                                Ned