Re: [xsl] Understanding why <tag></tag> is the way it is (was Re: [xsl] IE Client side transformation issue)
2007-08-03 04:51:24
Greetings.
On Thu, 02 Aug 2007 19:14:32 -0600, Abel Braaksma
<abel(_dot_)online(_at_)xs4all(_dot_)nl> wrote:
> About the spec thing, isn't it something from SGML heritage? I
> mean, didn't XML introduce the shortcut <br /> for <br></br> thus
> disallowing the SGML <br> on itself (without closing tag)? And
> wasn't it also SGML heritage that allowed <option selected> and XML
> forced more strict rules and made it <option selected="selected">?
SGML was designed at a time, and in a context, which assumed that
document markup would be entered by hand; it therefore included a
large number of short forms, and ways to minimise typing.
These included omitting redundant end-tags (so that in an HTML-like
DTD, "<h1>title<p>para1<p>para2" would be OK, since an h1 element
can't contain a p, and a p can't contain another one, so that the
presence of the implicit end-tags, closing the h1 and p elements,
could be inferred). There were various attribute-defaulting and
tag-minimising tricks as well, so that <p>text</> was valid, with the </>
construct closing the most recently opened element. And so on and so on.
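To make the end-tag inference concrete, here is a minimal sketch in Python. It is not how a real SGML parser works: the content model is a hand-written stand-in for a DTD, and the names ALLOWED_CHILDREN and infer_end_tags are made up for this illustration.

```python
# Toy content model standing in for a DTD (an assumption for illustration):
# for each element, the set of child elements it may contain.
ALLOWED_CHILDREN = {
    "body": {"h1", "p"},
    "h1": set(),   # an h1 can't contain a p
    "p": set(),    # a p can't contain another p
}


def infer_end_tags(tokens, root="body"):
    """Return the markup with every omitted end-tag written out.

    tokens is a list of strings: '<name>' start-tags, '</name>' end-tags,
    or plain text. The root element itself is implied and not emitted.
    """
    stack = [root]
    out = []
    for tok in tokens:
        if tok.startswith("</"):
            name = tok[2:-1]
            if name not in stack[1:]:
                raise ValueError(f"unexpected end-tag {tok}")
            # Close open elements up to and including the named one.
            while True:
                top = stack.pop()
                out.append(f"</{top}>")
                if top == name:
                    break
        elif tok.startswith("<"):
            name = tok[1:-1]
            # If the new element isn't allowed here, the current element
            # must have ended: emit its implicit end-tag and try again.
            while name not in ALLOWED_CHILDREN[stack[-1]]:
                if len(stack) == 1:
                    raise ValueError(f"element {name!r} not allowed")
                out.append(f"</{stack.pop()}>")
            stack.append(name)
            out.append(tok)
        else:
            out.append(tok)
    # End of input closes whatever is still open (except the implied root).
    while len(stack) > 1:
        out.append(f"</{stack.pop()}>")
    return "".join(out)


print(infer_end_tags(["<h1>", "title", "<p>", "para1", "<p>", "para2"]))
# prints <h1>title</h1><p>para1</p><p>para2</p>
```

The point of the sketch is only that the parser needs the content model to do this at all, which is part of why a DTD-less parse of minimised SGML isn't possible.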
The even cleverer thing about SGML (and one of the various things
that made it complicated to write an SGML system) was that the syntax
of the SGML lexer was specifiable on the fly. Starting tags with the
'<' character, starting the end-tag with '</', having quotes marked
with '"', using the ASCII character set, using letters as element
names, were the default, but were all optional.
That brought about the "NET-hack". You could specify that the
null-end-tag (NET) start string was '/' rather than '</', thus bringing
about the sequence of transformations:
1. <p></p> (fully normalised form)
2. -> <p</p> (you didn't have to close tags if you were starting a
   new one immediately)
3. -> <p</> (use the null end tag </> to close the most recently
   started element)
4. -> <p/> (if you had redefined the NET string from '</' to '/').
...and <p/> was deemed to look adequately pretty (I might be
misremembering this slightly, but it was something very like that).
Although it didn't end up specified quite like that, XML was
initially viewed (by some) as a specific set of settings for the SGML
lexer, which turned off all the options and minimisations. Because
the end result had no contractions and no options, it was massively
easier to write parsers for. That is, XML is SGML-- (ahem!).
Pace Andrew Welch, HTML usually isn't parsed with an `SGML parser',
but with a special-purpose never-fail make-it-up-when-necessary
HTML-specific parser. John Cowan's tagsoup parser is one of a couple of
SAX parsers which will accept HTML tag soup and always emit a valid
SAX stream.
Andrew remarks:
> My point was that if it made no difference to the XML parser (but a
> big difference in the Real World) then why not?
Ian Hickson makes some relevant remarks at
<http://www.hixie.ch/advocacy/xhtml> suggesting, in some detail, that
sending out XHTML with a text/html content type can potentially cause
you problems.
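The "no difference to the XML parser" half of Andrew's remark is easy to check. The following is a small sketch using Python's standard xml.etree.ElementTree; the helper name same_element is made up here, and the comparison is deliberately shallow (tag, attributes, text, number of children). It says nothing about how a text/html tag-soup parser treats the two forms, which is where the Real World difference comes in.

```python
import xml.etree.ElementTree as ET


def same_element(xml_a, xml_b):
    """Shallow comparison of the root elements of two XML documents."""
    a = ET.fromstring(xml_a)
    b = ET.fromstring(xml_b)

    def shape(e):
        # Tag name, attributes, leading text, and number of children.
        return (e.tag, e.attrib, e.text, len(e))

    return shape(a) == shape(b)


# The empty-element tag and the start-tag/end-tag pair parse identically.
print(same_element("<p/>", "<p></p>"))    # prints True
print(same_element("<p/>", "<p>x</p>"))   # prints False
```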
The other _really_ good thing about SGML was that it had DSSSL as a
transformation language. By the look of things, XSL2 is expanding
towards being a small subset of DSSSL (but that's another
hobbyhorse). DSSSL can of course be used to process XML, but it's a
bit of a minority interest, these days.
All the best,
Norman
[drifting down memory lane]
--
------------------------------------------------------------
Norman Gray : http://nxg.me.uk
eurovotech.org : University of Leicester, UK