RE: Comments on draft-ietf-marid-core-01 xml use


On Thu, 2004-06-03 at 13:39, Jim Lyon wrote:

In his comments, Doug makes several points, which I summarize and
respond to as follows:  (If I misinterpreted any of them, I'm sure Doug
will chime in.)


1.  [Doug] The method for publishing definitions for XML using a DNS TXT
record can be extremely verbose.  Removal of any size considerations in
this draft such as keeping the query/response under 512 bytes is
unfortunate.

I believe that it will be easy for publishers to keep their DNS
responses under 512 bytes.  I would be happy to add a RECOMMENDATION to
the document that publishers do so.


This should be much stronger than a recommendation.  Something like:

Currently there's an ingrained limit of 512 octets for UDP replies, and
DNS servers may utilize TCP transport to accommodate larger replies.
RFC 2671 expands the UDP limit if employed, but doing so could create a
DoS exploit.  Replies requiring more than 512 octets create UDP
fragmentation and, depending on the connection and handling, may cause
partial replies where clients often fail-over to TCP.  TCP connections
require maintaining state for 5 packets of setup and tear down in
addition to the data packets, thus adding a delay of at least 3 round
trips plus one for the original UDP query.  Furthermore, delivery and
resolver handling of truncated and partial responses varies, leading to
additional delays.  Should TCP be employed for a common reply, the
ability to sustain DNS service would be reduced by higher overhead with
a greater potential for DoS.  Domain administrators are strongly advised
to keep DNS replies below 512 octets for these reasons.
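
As a rough illustration of the 512-octet budget, here is a minimal Python
sketch that estimates the size of a reply carrying one TXT record.  It
assumes a reply with a single question and a single answer whose owner
name is a 2-octet compression pointer, with no authority or additional
records; real replies are often larger.  The function names and the
example.com name are hypothetical.

```python
def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels plus the root octet."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def txt_rdata(text: bytes) -> bytes:
    """Split TXT data into character-strings of at most 255 octets each,
    every one preceded by its own length octet."""
    chunks = [text[i:i + 255] for i in range(0, len(text), 255)] or [b""]
    return b"".join(bytes([len(c)]) + c for c in chunks)


def estimated_reply_size(qname: str, text: bytes) -> int:
    """Estimate the octets in a reply holding exactly one TXT answer."""
    header = 12                                  # fixed DNS header
    question = len(encode_name(qname)) + 4       # QNAME + QTYPE + QCLASS
    # Answer: name pointer (2) + TYPE (2) + CLASS (2) + TTL (4) + RDLENGTH (2)
    answer = 2 + 2 + 2 + 4 + 2 + len(txt_rdata(text))
    return header + question + answer


def fits_in_udp(qname: str, text: bytes, limit: int = 512) -> bool:
    """True if the estimated reply stays within the classic UDP limit."""
    return estimated_reply_size(qname, text) <= limit
```

Under these assumptions the fixed overhead for example.com is 41 octets,
plus one length octet per 255 octets of TXT data, which leaves well under
512 octets for the XML itself.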

2.  [Doug]  The document should be standalone (in the sense of <?xml
standalone="yes"?>).

I'm not sure how the standalone declaration interacts with XML schemas.
I'll find out.  (It pre-dates the whole schema effort by a few years,
and specifies that all of the XML DTD info is present in the document.
Given that none of the MARID docs can have DTDs, this is vacuously
true.)


Extensible Markup Language (XML) 1.0 (Third Edition)
W3C Recommendation 04 February 2004
http://www.w3.org/TR/2004/REC-xml-20040204/#NT-XMLDecl
http://www.w3.org/TR/2004/REC-xml-20040204/#sec-rmd
2.9 Standalone Document Declaration
...
"Markup declarations can affect the content of the document, as passed
from an XML processor to an application; examples are attribute defaults
and entity declarations. The standalone document declaration, which MAY
appear as a component of the XML declaration, signals whether or not
there are such declarations which appear external to the document entity
or in parameter entities."
...
"Any XML document for which standalone="no" holds can be converted
algorithmically to a standalone document, which may be desirable for
some network delivery applications."

It would seem standalone was conceived to handle this situation.
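
For anyone who wants to see how a parser reports the declaration, the
following is a small sketch using Python's standard expat binding.  The
function name standalone_flag is made up for illustration; the handler
receives 1 for standalone="yes", 0 for standalone="no", and -1 when the
clause is omitted.

```python
import xml.parsers.expat


def standalone_flag(document: bytes) -> int:
    """Return 1, 0, or -1 according to the standalone clause of the
    XML declaration (-1 also when there is no XML declaration at all)."""
    result = {"standalone": -1}

    def on_xml_decl(version, encoding, standalone):
        # expat passes the standalone clause as 1, 0, or -1 (omitted)
        result["standalone"] = standalone

    parser = xml.parsers.expat.ParserCreate()
    parser.XmlDeclHandler = on_xml_decl
    parser.Parse(document, True)
    return result["standalone"]
```

A record wrapped with <?xml version="1.0" standalone="yes"?> would report
1 here, so a receiver could check the flag before doing anything else.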

3.  [Doug] Other words about standaloneness, implying (I think) that
implementations shouldn't be required to load arbitrary schemas to
understand a document.

I completely agree with the goal.  I would expect an implementation to
have those schemas that it understands (initially just ...:marid-1)
hard-coded into it.  By explicit words in the spec, an implementation is
required to ignore elements and attributes whose schema it doesn't
understand.  It need not search to find a schema document for other
namespaces; it can just ignore the elements.
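
The ignore-what-you-don't-understand rule described here can be sketched
as follows.  This is only an illustration in Python, not text from the
draft: the set of known namespaces is hard-coded as suggested above, the
helper names are invented, and the choice to keep unqualified attributes
is an assumption.

```python
import xml.etree.ElementTree as ET

# Namespaces this hypothetical implementation has hard-coded knowledge of.
KNOWN_NAMESPACES = {"urn:ietf:params:xml:schema:marid-1"}


def namespace_of(tag: str) -> str:
    """Extract the namespace URI from an ElementTree '{uri}local' name."""
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def prune_unknown(element: ET.Element) -> ET.Element:
    """Drop child elements and qualified attributes whose namespace is
    not known.  No schema lookup is attempted for unknown namespaces."""
    for child in list(element):
        if namespace_of(child.tag) not in KNOWN_NAMESPACES:
            element.remove(child)
        else:
            prune_unknown(child)
    for name in list(element.attrib):
        ns = namespace_of(name)
        # Assumption: unqualified attributes belong to their element
        # and are kept; only foreign-namespace attributes are dropped.
        if ns and ns not in KNOWN_NAMESPACES:
            del element.attrib[name]
    return element
```

With this approach a record that mixes marid-1 elements with elements
from some later extension namespace still parses; the extension parts
simply vanish before interpretation.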


By including standalone='yes', the processing of the information is both
better defined and easier to process.

4.  [Doug] We should require that a conforming document never have any
references to a namespace other than ...:marid-1 (or possibly
...:marid-*).

Allowing references to as-yet-undefined namespaces is an important part
of the extensibility.  As extensions are defined, it allows publishers
to write documents using the new extensions, yet remain compliant with
the current version.  Without this, it would be suicidal for a publisher
to take advantage of a new extension -- his document would be seen as
invalid by any implementation that hadn't yet been updated to understand
the extension.


Define the record within the specification and register the document
with IANA.  Add to this 'standalone' document
 <! TXT rr starts here>
 <! TXT rr ends here>

Declare in the draft that this singular document encompasses the entire set
of definitions and there should be some wording barring any attempts to
add further definitions within the TXT record.  Call this document
MARID-1.  If there is ever to be a MARID-2 it should fill-in the
definitions or use a different label to introduce a different record.

As far as extensibility, a marid version beyond that which is known
should remain compatible with those declarations defined previously.  
Multiple namespace declarations preceding their definition can be done
within a single document in the same manner.  I refer you back to the
XML specifications.

5.  [Doug] The wrapping added to the DNS record to get the XML document
should include the <ep> and </ep> tags.

I intentionally didn't do this (at the cost of 9 bytes), to allow for
the possibility of adding attributes to the <ep> tag.  For example, the
possibility of <ep testing="true">.  If we don't want this possibility,
I would just as soon make the <ep> tag go away completely.


Perhaps I did not understand your explanation as to how the prefix and
suffix were to be coupled with the TXT rr.  From my reading,

From Section 5.3.1, an XML TXT Record:

3. Prefix the resulting string with the following: 
      <?xml version="1.0" encoding="UTF-8"?> 
      <root xmlns="urn:ietf:params:xml:schema:marid-1" 
            xmlns:m2="urn:ietf:params:xml:schema:marid-2" 
            xmlns:m3="urn:ietf:params:xml:schema:marid-3" 
            xmlns:m4="urn:ietf:params:xml:schema:marid-4" 
            xmlns:m5="urn:ietf:params:xml:schema:marid-5" 
            xmlns:xsi="http://www.w3.org/2001/XMLSchema" 
            xmlns:ds="http://www.w3.org/2000/09/xmldsig#">

This should be removed and replaced with a document split as a prefix and
suffix identified with a single token _MARID-1 at the beginning of the
TXT rr.
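
To make the suggestion concrete, here is a rough Python sketch of a
receiver expanding such a record.  Everything beyond the _MARID-1 token
and the marid-1 namespace URN is an assumption for illustration: the
exact prefix and suffix text, the whitespace after the token, and the
function names are not taken from the draft.

```python
TOKEN = "_MARID-1"

# Hypothetical fixed wrapper that the specification would define once.
PREFIX = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    '<root xmlns="urn:ietf:params:xml:schema:marid-1">'
)
SUFFIX = "</root>"


def is_marid1(txt: str) -> bool:
    """True if the TXT data starts with the version token."""
    return txt.startswith(TOKEN)


def expand_record(txt: str) -> str:
    """Strip the token and wrap the remainder in the fixed prefix and
    suffix, producing a complete XML document."""
    if not is_marid1(txt):
        raise ValueError("not a MARID-1 record")
    body = txt[len(TOKEN):].lstrip()
    return PREFIX + body + SUFFIX
```

The record itself then carries only the token and the element content,
and the namespace declarations never travel over the wire.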

6.  [Doug] Why does the document say "UTF-8" instead of "US-ASCII"?

By specifying UTF-8, the spec says exactly what should be done with
characters whose high bit is set.  However, given that some DNS
resolvers may (incorrectly) treat records containing non-ASCII
characters as malformed, I would be happy to add a RECOMMENDATION to the
document that publishers should avoid non-ASCII characters in their
documents. (Note that if you include only ASCII characters, it's
automatically valid UTF-8.)
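
The parenthetical point can be checked mechanically.  A small Python
sketch, with a made-up helper name and made-up sample records:

```python
def is_ascii_only(record: bytes) -> bool:
    """True if no octet in the record has its high bit set."""
    return all(octet < 0x80 for octet in record)


sample = b"<ep><out/></ep>"            # hypothetical record content
non_ascii = "<ep>caf\u00e9</ep>".encode("utf-8")

# An ASCII-only record decodes to the same text under either encoding.
same_text = sample.decode("ascii") == sample.decode("utf-8")
```

A publisher could run a check like is_ascii_only before publishing to
stay clear of resolvers that mishandle high-bit octets.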


Again, why the departure?  What is lost by simply declaring the encoding
ASCII?

-Doug