Re: Using CONNEG instead of MIME types for compound types & references



Murray Altheim wrote:

The second problem, as Larry points out, is the file with
embedded xml from different namespaces. [...]
It would be nice from the browsers
point of view if it had prior knowledge of the namespaces before
downloading the document, because then it could either


This is one of the (pardon my French) stupidities of embedding what
is essentially prolog information deep within an instance.


I believe that the intention was to allow scoping.

In
traditional SGML systems the benefit of having a prolog is that the
engine can learn all about the document instance *before* it begins
processing. This is simply poor design on the part of XML Namespaces,


No, because the WG originally came up with an unscoped design with
everything in a prolog; but feedback was that this design was
insufficient.

probably brought about by design constraints or 'market pressure'.


It was indeed brought about by design constraints; specifically, the
need to be useful. You might not like the specific choices that were
made, but it is a misrepresentation to call it "stupid" and "poor design".
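
To make the scoping point concrete, here is a minimal sketch using only the
Python standard library. The sample document and the function name
declared_namespaces are made up for illustration. It lists namespace
declarations in the order a parser meets them, which shows that a declaration
scoped to an inner element is only discovered once parsing reaches that
element.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample: XHTML with a MathML fragment whose namespace
# is declared on the inner element, not on the root.
SAMPLE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p>Value: <m:math xmlns:m="http://www.w3.org/1998/Math/MathML"><m:mi>x</m:mi></m:math></p>
  </body>
</html>"""


def declared_namespaces(text):
    """Return (prefix, uri) pairs in the order the parser encounters them."""
    source = io.BytesIO(text.encode("utf-8"))
    # "start-ns" events carry a (prefix, uri) tuple instead of an element.
    events = ET.iterparse(source, events=("start-ns",))
    return [ns for _event, ns in events]


if __name__ == "__main__":
    for prefix, uri in declared_namespaces(SAMPLE):
        print(repr(prefix), uri)
```

The default namespace is reported with an empty prefix. Nothing in the root
element announces the second namespace, which is the cost of scoping that
the quoted text objects to.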

Others such as Murray (I hope I am not mis-interpreting him!)
have pointed out that they would like a more general solution,
because they feel that an XHTML mime type will keep XHTML off
in its own landscape and not encourage it to join the more
general XML solution.


Well, yes. And since there is a strong requirement that a solution be
found for XML, if a stop-gap solution is arrived at for XHTML it will
probably be *different* than XML, which would further push XHTML off
into its own unique landscape. I doubt vendors would want to do both.


Actually, if the choices are to label it as text/html or text/xml, the
worry is that text/html will be chosen which, indeed, keeps HTML off in
its own landscape; it means that processors have no way to distinguish
between well-formed xml in the xhtml namespace, and totally unspecified
"real world" html with all its myriad undocumented parsing tricks for
alleged backwards compatibility.

So, being able to say "this uses HTML semantics, but is well formed xml"
is very valuable. A specific MIME type is the obvious way to achieve
that.

As Rick has pointed out, XHTML is 'application-specific', but no
more so than MathML or any other XML application that requires
specialized processing not provided in a stylesheet. It also happens
to be potentially the most widely used XML markup language, and a
framework for creation of many others. So we all need to work together
to create a solution that works for both XHTML and XML in general.


If an instance is well formed xml, and uses just the xhtml namespace,
and thus can be validated, label it as text/xhtml.

If an instance is well formed xml, and uses multiple namespaces (and
thus, until schemas arrive, cannot be validated) call it text/xml if you
don't have a better and more precise name.

If an instance is, well, stuff (not well formed, could be anything, like
real-world html) then call it text/html. The battle to have text/html
bear any resemblance to its defining spec was lost long ago. Processors
that accept text/html have to be prepared to process just about
anything; if it happens to be well formed xml, well, such processors

a) will not even notice
b) will not parse it according to the xml spec in any case

As a result, anything labelled as text/html is a dead loss in terms of
consistent parse trees between implementations; with expected results on
CSS, XSL, the DOM, and anything else that depends on a clean parse tree.
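
The three labelling rules above can be sketched roughly as follows. This is
an illustration under stated assumptions, not a definitive implementation:
the function name suggest_media_type is made up, only element namespaces are
inspected (attributes are ignored), no validation is performed, and
text/xhtml is simply the label proposed in this message.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def namespace_of(tag):
    """Extract the namespace URI from an ElementTree '{uri}local' tag."""
    if isinstance(tag, str) and tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return ""  # element is in no namespace


def suggest_media_type(text):
    """Pick a label following the three rules in the message (hypothetical helper)."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        # Not well-formed XML: "stuff", so text/html.
        return "text/html"
    namespaces = {namespace_of(el.tag) for el in root.iter()}
    if namespaces == {XHTML_NS}:
        # Well formed and uses just the xhtml namespace.
        return "text/xhtml"
    # Well formed, but multiple (or other) namespaces.
    return "text/xml"


if __name__ == "__main__":
    print(suggest_media_type(
        '<html xmlns="http://www.w3.org/1999/xhtml"><body><p>hi</p></body></html>'
    ))
    print(suggest_media_type("<p>tag soup<br>never closed"))
```

Note that the check only establishes well-formedness and namespace usage; a
real processor would still need a DTD or schema to decide validity.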

--
Chris
