On 18.11.2010 21:01, Chris Lilley wrote:
On Thursday, November 18, 2010, 8:19:20 PM, Julian wrote:
JR> On 18.11.2010 19:02, Chris Lilley wrote:
...
Security considerations:
...
SVG documents may be transmitted in compressed form using gzip
compression. For systems which employ MIME-like mechanisms, such
as HTTP, this is indicated by the Content-Encoding or
Transfer-Encoding header, as appropriate; for systems which do
not, such as direct filesystem access, this is indicated by the
filename extension and by the Macintosh File Type Codes. In
addition, gzip compressed content is readily recognised by the
initial byte sequence as described in [RFC1952] section 2.3.1.
...
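For illustration, the byte-sequence check that the quoted text refers to could be sketched in Python roughly as follows. The function name is hypothetical; the constants are the ID1, ID2 and CM values from [RFC1952] section 2.3.1. A check like this only says that the octets look like gzip; it says nothing about what the decompressed payload is.

```python
import gzip

# ID1 and ID2 identification bytes from RFC 1952 section 2.3.1.
GZIP_MAGIC = b"\x1f\x8b"
# CM = 8 denotes the "deflate" compression method.
CM_DEFLATE = 8


def looks_like_gzip(data: bytes) -> bool:
    """Return True if data starts with a gzip member header using deflate.

    Hypothetical helper for illustration only.
    """
    return len(data) >= 3 and data[:2] == GZIP_MAGIC and data[2] == CM_DEFLATE


if __name__ == "__main__":
    plain = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
    print(looks_like_gzip(plain))
    print(looks_like_gzip(gzip.compress(plain)))
```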
JR> 1) What does this have to do with "Security Considerations"?
Please read BCP 13, RFC 4288 section 4.6 "Security requirements" where you will
find
A media type that employs compression may provide an opportunity
for sending a small amount of data that, when received and
evaluated, expands enormously to consume all of the recipient's
resources. All media types SHOULD state whether or not they
employ compression, and if they do they should discuss
what steps need to be taken to avoid such attacks.
Agreed.
But then it would need to be clearly stated that, you know, the content
can be gzipped and still be image/svg+xml.
Can it?
Because otherwise if you're talking about compression on the transport
layer, this doesn't need to be stated here. It confuses layers.
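As a rough sketch of the kind of step RFC 4288 section 4.6 asks registrations to discuss, a recipient could cap how much it is willing to inflate. The function name, exception class, and 10 MiB default below are assumptions made for illustration, not taken from any specification.

```python
import zlib


class DecompressionLimitExceeded(Exception):
    """Raised when inflated output would exceed the caller's limit."""


def gunzip_bounded(data: bytes, max_output: int = 10 * 1024 * 1024) -> bytes:
    """Inflate a single gzip member, refusing to produce more than max_output bytes.

    Hypothetical helper: it does not handle multi-member streams and does
    not check for truncated input.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
    # Ask for one byte more than the limit so that overflow is detectable.
    out = inflater.decompress(data, max_output + 1)
    if len(out) > max_output:
        raise DecompressionLimitExceeded(
            f"output exceeds limit of {max_output} bytes"
        )
    return out
```

Because the limit is enforced while inflating, a small input that expands enormously is rejected without the full expansion ever being held in memory.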
JR> 2) I find the whole paragraph misleading; I'd like to see a clear
JR> statement about whether the stream of octets resulting from gzipping SVG
JR> can be labeled as "image/svg+xml" or not
Not by itself, no. In a MIME context, it must be labelled as Content-type:
image/svg+xml **AND** Transfer-Encoding: gzip. Please note the AND.
So why do we have the paragraph above in the first place?
*Any* media type can be used with Content-Encoding: gzip over HTTP.
This is not the same thing as Content-type: application/octet-stream and
Transfer-Encoding: gzip - because that conveys the encoding, but omits the
content type.
Nobody said that.
In other words the encoding label ADDS TO the media type; it does not remove
the type.
"The Content-Encoding entity-header field is used as a modifier to the
media-type. When present, its value indicates what additional content
codings have been applied to the entity-body, and thus what decoding
mechanisms must be applied in order to obtain the media-type referenced
by the Content-Type header field. Content-Encoding is primarily used to
allow a document to be compressed without losing the identity of its
underlying media type." --
<http://greenbytes.de/tech/webdav/rfc2616.html#rfc.section.14.11>
So once you apply the Content-Encoding you have to undo it to get back
the payload specified by the Content-Type. It's orthogonal. It doesn't
make the payload an instance of the media type *until* you undo the
encoding.
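To make the layering concrete, here is a minimal sketch of a recipient that first undoes the declared content coding and only then treats the result as an instance of the media type. The function name and the plain-dict header representation are assumptions for illustration; a real HTTP client would match header names case-insensitively and would bound the decompression.

```python
import gzip


def decode_entity(headers: dict[str, str], body: bytes) -> tuple[str, bytes]:
    """Return (media_type, payload) after undoing the declared content coding.

    Hypothetical helper: header lookup is case-sensitive and the gzip
    decompression is unbounded, both simplifications for the example.
    """
    media_type = headers.get("Content-Type", "application/octet-stream")
    coding = headers.get("Content-Encoding", "identity").strip().lower()
    if coding in ("gzip", "x-gzip"):
        # Only after this step are the octets an instance of media_type.
        body = gzip.decompress(body)
    elif coding != "identity":
        raise ValueError(f"unsupported content coding: {coding}")
    return media_type, body


if __name__ == "__main__":
    svg = b'<svg xmlns="http://www.w3.org/2000/svg"/>'
    headers = {"Content-Type": "image/svg+xml", "Content-Encoding": "gzip"}
    media_type, payload = decode_entity(headers, gzip.compress(svg))
    print(media_type, payload == svg)
```

Note that the decision to decompress is driven by the label, not by inspecting the first bytes of the body.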
Indeed, this is why separate labelling of encoding was added. Back in the early
days people would use gzipped VRML or gzipped PostScript, and attempted to
register application/gzip; but since they were using the Media Type to hold the
encoding information they had lost important information, so VRML viewers were
sent PostScript and so on. Some people said this was okay, unzip and then look
at the filename extension. But a much better way was to add the encoding
headers.
JR> (please consider transports
JR> other than HTTP, such as a file system that actually supports typing by
JR> Internet media types).
Please feel free to file a bug report for the BeOS filesystem saying that it
should support labelling of encodings in addition to media types.
Speaking as a former BeOS user myself, I still consider modern SVG
implementations (of which there are many) to be a rather more numerous and
relevant consideration than a promising, but obsolete and abandoned, operating
system from 15 years ago.
I really honestly (!) have no idea what you're referring to.
For the media type registration what's relevant is what kind of octet
sequences you can label with the type you register.
So, I hear you saying: "it can be gzipped when used in a MIME context if
and only if you label it with 'Content-Encoding: gzip'".
That's true, and nobody disagrees with it. It's true for *any* media
type. It doesn't require any additional statements.
JR> If yes, that's a violation of "+xml" (and the last sentence points into
JR> this direction). If not, please remove the paragraph above.
JR> 3) If the intent is to say that "svgz" acts as file extension for
JR> gzipped SVG, and *that* content can be served over HTTP as-is with
JR> Content-Type: image/svg+xml
JR> Content-Encoding: gzip
That is exactly what it says, yes
JR> then this is obviously ok
I'm glad it's obviously OK.
But the way it's stated is totally misleading.
Please keep in mind that I only joined this discussion after other
people complained (I stumbled into it during a conversation at the IETF
meeting in Maastricht).
JR> because it follows from RFC 2616, and has
JR> *nothing* to do with the media type (except for the extension
JR> recommendation).
So you oppose reminding people how to detect such gzipped content?
Why would you want to do that?
Because it makes it sound like detecting gzipped content by inspecting
the header is an acceptable way to handle this media type.
Best regards, Julian