XML2RFC must die, was: Re: Two different threads - IETF Document Format

2009-07-05 09:25:26
My apologies for the subject line. I'm very disappointed that the silent majority of draft authors isn't speaking up. I can't imagine that the vast majority of draft authors has absolutely no problems with XML2RFC. So I'm assuming they've been ignoring the thread; hopefully the new subject line will get some of them to chime in. If that doesn't happen, I'll shut up and try to figure out why I have so much trouble with something that nobody else finds difficult.

On 4 jul 2009, at 13:27, John Levine wrote:

I think it's reasonable to assume that going forward the vast majority
of users who read online documents will be able to use software that
can reformat them in various ways.  This tells me that although the
publication form has to be readable in a pinch as plain text, it's
more important that it's amenable to mechanical processing.  Tidily
formatted xml2rfc would be a reasonable candidate

No, it's not. The problem with XML2RFC-formatted drafts and RFCs is that you can't display them reasonably without using XML2RFC, and although XML2RFC can run on many systems in theory, in practice it's very difficult to install and run successfully because it's written in Tcl and many XML2RFC files depend on the local availability of references. When those aren't present, the conversion fails.

The philosophy behind XML2RFC is to encode meaning in the XML wherever possible, rather than simply display text. There are several problems with that:

1. It makes it hard to write source files, because now rather than type "Experimental" at the top of the file, I have to know what XML2RFC looks for to determine the draft's status. Same thing with boilerplate, references, etc.

2. It makes it hard to read source files for the same reason. You can't read an XML2RFC-formatted XML file without prior knowledge and get all the information that would be displayed in the final draft/RFC format.

3. It gets it wrong. XML2RFC "knows" that you create a name from an initial, a period, a space and a last name. So initial "I" and last name "Van Beijnum" becomes "I. Van Beijnum". However, XML2RFC doesn't know that in Dutch, certain last name prefixes are capitalized if they appear at the beginning of the name (Van Beijnum) but not if they're in the middle because there are first names or initials: "I. van Beijnum".

This means that the makers of XML2RFC spent a lot of time making the tool require the authors to spend a lot of time to create something that is sometimes incorrect, with no means to correct the problem. An all-around waste of time.
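
As a rough illustration of the capitalization rule in point 3, here is a minimal Python sketch. It is not how XML2RFC works; the function name format_author and the prefix list are hypothetical, the list is far from complete, and the rule shown is only the convention described above (prefix lowercased after initials, capitalized when the surname stands alone).

```python
# Hypothetical, incomplete list of Dutch surname prefixes (an assumption
# for illustration only).
DUTCH_PREFIXES = {"van", "de", "der", "den", "ter", "ten", "te"}


def format_author(initials: str, surname: str) -> str:
    """Join initials and surname using the rule from point 3."""
    words = surname.split()
    if not initials:
        # Surname on its own: the first word starts with a capital.
        return surname[:1].upper() + surname[1:]
    out = []
    in_prefix = True
    for i, word in enumerate(words):
        # Lowercase leading prefix words only; the last word is always
        # treated as the main name and left untouched.
        if in_prefix and i < len(words) - 1 and word.lower() in DUTCH_PREFIXES:
            out.append(word.lower())
        else:
            in_prefix = False
            out.append(word)
    return f"{initials} {' '.join(out)}"


print(format_author("I.", "Van Beijnum"))  # I. van Beijnum
print(format_author("", "van Beijnum"))    # Van Beijnum
```

Even this small sketch needs a language-specific word list, which shows how quickly a tool that tries to "know" names runs into cases it was not written for.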

Then there is the problem with XML in general. Now apparently there are XML editors that can make sure you create syntactically correct XML without having to take care of all the details manually. But as someone who otherwise has no need to write XML, I'm not familiar with those tools. So I write my XML2RFC source by hand. The result is that I invariably get error messages that the <section> and </section> tags don't match properly. This is a problem that is extremely hard to debug manually, especially as just grepping for "section" isn't enough: there could be a <!--, -->, </middle>, etc. somewhere between a <section> and </section> that breaks everything.
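
For hand-written source, a small stack-based checker can at least point at the line where an unclosed tag was opened, which plain grep cannot do. The following is a minimal sketch in Python, not part of XML2RFC; the function name find_tag_mismatches is hypothetical, and the scanner deliberately ignores CDATA sections, unterminated comments, and ">" characters inside attribute values.

```python
import re

# Matches a complete XML comment, or an opening, closing, or
# self-closing tag. Processing instructions and <!DOCTYPE> are skipped
# because they do not start with a name or "/".
TOKEN = re.compile(
    r"<!--.*?-->|<(/?)([A-Za-z_][\w:.\-]*)[^<>]*?(/?)>", re.DOTALL
)


def find_tag_mismatches(xml_text):
    """Return a list of problem descriptions with line numbers."""
    problems = []
    stack = []  # (tag name, line where the tag was opened)
    for m in TOKEN.finditer(xml_text):
        if m.group(0).startswith("<!--"):
            continue  # tags inside comments do not count
        line = xml_text.count("\n", 0, m.start()) + 1
        closing, name, self_closing = m.group(1), m.group(2), m.group(3)
        if self_closing:
            continue
        if not closing:
            stack.append((name, line))
        elif name in [n for n, _ in stack]:
            # Report every tag still open inside the element being closed.
            while stack[-1][0] != name:
                n, opened = stack.pop()
                problems.append(
                    f"line {opened}: <{n}> not closed before </{name}> on line {line}"
                )
            stack.pop()
        else:
            problems.append(f"line {line}: </{name}> has no matching opening tag")
    problems.extend(f"line {opened}: <{n}> is never closed" for n, opened in stack)
    return problems


sample = "<middle>\n<section title='a'>\n<t>hi\n<!-- <section> -->\n</section>\n</middle>\n"
for problem in find_tag_mismatches(sample):
    print(problem)  # line 3: <t> not closed before </section> on line 5
```

The stack records where each tag was opened, so the report names the line that needs fixing rather than the line where the parser finally gave up.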

First writing a source file and then compiling it into an output file is no longer something that is familiar to most people. When I write anything other than a draft, I can simply select "header level 2" and I know that everything will be taken care of. I don't have to explicitly tell my word processor where the text following a header level 2 ends, because the presence of another header makes that clear both to me and to the software.
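
The point that the next header implicitly ends the previous section is easy to mechanize. Here is a minimal sketch in Python, assuming the styled text has already been extracted as a list of (style, text) pairs; the function name styled_to_xml, the style labels "h1"/"h2"/"p", and the use of a title attribute and <t> for paragraphs are assumptions made for illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def styled_to_xml(blocks):
    """Turn flat (style, text) pairs into nested <section> elements."""
    out = []
    open_levels = []  # heading levels of the sections currently open
    for style, text in blocks:
        if style.startswith("h"):
            level = int(style[1:])
            # A new heading ends every open section at the same or a
            # deeper level; the author never has to say so explicitly.
            while open_levels and open_levels[-1] >= level:
                open_levels.pop()
                out.append("  " * len(open_levels) + "</section>")
            out.append("  " * len(open_levels) + f"<section title={quoteattr(text)}>")
            open_levels.append(level)
        else:
            out.append("  " * len(open_levels) + f"<t>{escape(text)}</t>")
    # Close whatever is still open at the end of the document.
    while open_levels:
        open_levels.pop()
        out.append("  " * len(open_levels) + "</section>")
    return "\n".join(out)


draft = [
    ("h1", "Introduction"),
    ("p", "Hello."),
    ("h2", "Terminology"),
    ("p", "Words."),
    ("h1", "Security Considerations"),
]
print(styled_to_xml(draft))
```

Every closing tag in the output is generated from the heading levels alone, so the author never types one.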

What we need is the ability to write drafts with a standard issue word processor. I'm sure that sentence conjured up nightmares of Word documents with insane formatting being mailed around clueless bureaucracies, but that's not what I mean. Word processors use styles to tag headings, text, quotes, lists and so on: the exact same stuff that you can do in XML but rather than having to think about it (especially closing all tags correctly) it happens easily, automatically and without getting in the way. (I can even change the style for an entire paragraph with a single menu selection or function key without having to find the beginnings and ends of that paragraph.)

Formatting is then based on the style tags, with all explicit formatting applied by the word processor removed. This is standard operating procedure in 99% of publishing. (The other 1% being scientific/engineering books where the authors send in LaTeX.)

All the stuff that can't be handled by styles should just be copied and pasted from the boilerplate, without the need for tools to know about the structure of these things. (At least not in the draft stage, perhaps this can be useful in the final stages of RFC editing.)
_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www.ietf.org/mailman/listinfo/ietf
