
Re: Last Call: <draft-ietf-dnsop-dns-terminology-03.txt> (DNS Terminology) to Best Current Practice

2015-08-05 19:49:43
On 3 Aug 2015, at 14:08, John C Klensin wrote:

(1) _Status observation_:

Sure, we can go to Informational for this document. Our intention for the -bis RFC is to actually update the RFCs with definitions that we are changing, so that should be a BCP. Off-list, the authors agreed with your logic, so the next draft will say "Informational".

(2) _Usability: Finding terms_

Because this document is long and organized by categories that
might not be obvious to the reader, it would benefit
significantly from an alphabetical index of terms defined and
either numbering those terms (with the term number in the
index) or at least using the index to localize terms to the
sections in which they appear.  Without such an index or
similar organizational mechanism, I fear the document will be
nearly useless as a reference and that few people will read it
from beginning to end to determine which terms to use and
remember all of the definitions.  Documents like this are, at
least IMO, much more often used to find out what a term means
or determine whether it is being used appropriately; either of
those uses would strongly benefit from an index.

What you say is true only for the few people reading printed-out copies of the RFC. These days, most people read RFCs on their computers. People who are looking for a particular term can quickly skim the left column or use the search function of their favorite text editor, web browser, or PDF reader. (For historical reference, John and I disagreed about this when he and I co-authored RFC 6365.) Once we have the new RFC format and HTML representations of RFCs, indexes will become much more useful.

(3) _Procedural Observation_

For the terminology itself, I have not been following the
progress of this document in DNSOP so, if specific issues raised
below have been thoroughly discussed there and I don't raise new
issues about them, I'm happy to see things go ahead on the basis
of the WG's decisions.  However, note that, if any assertion of
prior discussions defies belief, I may ask for a pointer to
specific discussions in mailing list archives or meeting
minutes.  Many of the remarks below are quibbles but, IMO,
terminology documents are one of the places where we really
should quibble enough to get things right, lest they add to the
confusion, rather than reducing it.

The WG quibbled quite extensively on this document, and there are many places where there is not wide agreement.

(4) _The dot-foo problem (and a missing definition for
"dotless")_

Meta-definition: Despite the popularity of its usage, I believe
this document should not encourage the use of ".foo" to refer
to the "foo" TLD (as the document points out, if "foo" is an
FQDN, it is more properly written "foo.").  If we encourage
"dot-foo" (or ".foo"), at least without a lot more explanation,
we are only a step from "dot-label1.label2" (or to be precise
and difficult ".label1.label2." where "label1.label2." is the
name of any node that contains either delegation records or
non-delegated subdomains).  While its focus was somewhat
different and it never got any attention in DNSOP or elsewhere,
the expired
https://datatracker.ietf.org/doc/draft-klensin-dotless-terminology-harmful/
discusses another aspect of this problem.  One of its
implications for the present I-D is that, if organizations like
the IAB are going to use the term "dotless domain" as if it
meant something, "dotless" or "dotless domain" should appear in
the present I-D, even if to denounce its use.

No one in the WG expressed any concern about using ".foo" to describe the "foo" TLD. The term "dotless" does not appear in this document, and no one has asked for it, so it would be impossible to predict if adding it would include a denunciation. You may be one of the very few people who associate the common use of ".foo" with the term "dotless domain".

"Host Name" and "Domain Name": It is probably worth noting that
not only is "host name" sometimes used to denote the first
label in an FQDN (as the document indicates) but that many
vendors have established a practice of setting up, e.g.,
configuration tables or options that distinguish between "host
name" with that definition and "domain" as the rest of the FQDN
with that label excluded.  Guilty parties include Windows (at
least through version 8.1), FreeBSD, Cisco (at least some
products), a few ICANN-accredited registrars, and probably many
more, so this cannot be dismissed as a fringe issue.

Saying that "some vendors" do something like that does not seem valuable to this document since it does not further the definition already in the document. "Some vendors" do lots of good/bad/weird things. This document is meant as a terminology list, not a full description of the deployed DNS. (John will understand why I said that last bit; he and I were faced with the same issues when we were doing RFC 6365. If you think DNS deployment is inconsistent, wander over to i18n world sometime...)

"Label": Binding "label" to "node" is a fine plan, but leaves
the fairly frequent practice of using a string with a dot
embedded in it as the owner of a node.  That is a practice that
makes the first substring of the label indistinguishable
lexically from a [delegated] subdomain (see below) of the
second substring.  This obviously interacts with the meaning of
the term "subdomain".  That term is used a few times in the
document but not defined.  At least because of this issue with
labels and the related question of whether a "subdomain" exists
only with a zone break, it seems imperative to me to define and
discuss it... or to discourage its use and otherwise get it out
of the document.

I'm unclear on what you mean by "as the owner of a node".

Good catch on the lack of definition of "subdomain", which is used twice in quoting other RFCs and once bare. Proposal:

Subdomain: a label or group of contiguous labels that are a child of a given domain. For example, in the host name "nnn.mmm.example.com", both "mmm" and "nnn.mmm" are subdomains of "example.com".
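As a rough illustration of that proposed wording (a sketch only, not text from the draft), the label groups it would call subdomains can be enumerated as below. The function name `subdomains_of` is hypothetical, and the comparison assumes plain ASCII, case-insensitive labels with no escaped dots.

```python
from __future__ import annotations


def subdomains_of(name: str, domain: str) -> list[str]:
    """List the label groups of `name` that sit below `domain`, nearest first.

    Assumption: names are dot-separated ASCII labels; a trailing dot is ignored.
    """
    name_labels = name.rstrip(".").lower().split(".")
    domain_labels = domain.rstrip(".").lower().split(".")
    n = len(domain_labels)
    # `name` must be strictly longer than `domain` and end with its labels.
    if len(name_labels) <= n or name_labels[-n:] != domain_labels:
        return []
    prefix = name_labels[:-n]
    # Each contiguous run of labels that ends directly above `domain`.
    return [".".join(prefix[i:]) for i in range(len(prefix) - 1, -1, -1)]


print(subdomains_of("nnn.mmm.example.com", "example.com"))
```

For the example in the proposed definition this yields "mmm" and "nnn.mmm", and it yields nothing when the name is not below the given domain.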

"Host Name" (again): Another popular definition of this term
associates it only with domain names that actually identify
hosts, i.e., it is the owner of a node that contains address
records and not only, e.g., delegation records, service
indicator records (MX, SRV, NAPTR, URI, etc.) or aliases (CNAME
or DNAME records).   It would be consistent with the rest of the
I-D to mention that usage, even if only to discourage it.

I'm not seeing that in any RFCs, but I could be missing it. If you can send one example of that usage, it could certainly be added as a now-discouraged practice.

"IDN": I think it would be wise to be careful about two things
in this definition.  Because of them, it is probably not right.

First, while IDNA2008 is quite specific about referring to
labels, "IDN" has been popularly used in at least three
other (and inconsistent) ways: (i) to describe FQDNs with at
least one label that requires IDNA handling; (ii) for FQDNs
and/or labels containing non-ASCII characters (including
domain names that use UTF-8 or ISO 8859-x (typically 8859-1)
directly in the DNS); and (iii) only those FQDNs all of whose
labels (other than the root indicator) contain
representations of non-ASCII characters.  Note that RFC 6055
claims to be about "Internationalized Domain Names" and,
among other things, discusses the second case.  I recommend
the document explore those alternate meanings and then
deprecate the use of "IDNs" in protocol-like discussions of
the DNS (or elsewhere where precision is required) and
focus on, e.g., "IDNA labels" or "IDNA2008 labels".

Given that RFC 6055 did indeed mix those definitions together, it might be better to simply refer to it for more description and leave the current cleaner definition alone.

Second, there is the matter of whether a putative label that
contains characters that must be processed into an A-label
or U-label for use with the DNS is reasonably referred to as
an "IDN" or "IDN label" (a subject on which UTR 46 and
various web-oriented documents contribute to the confusion).
I believe this document needs to identify and discuss those
distinctions, not just treat "IDN" as a synonym for
something IDNA-ish.

I doubt that this document needs to talk about what is and is not a "putative label", given that the term is not at all common. The documents that are referred to in this document do that.

While it is not (at least IMO) a
sufficient solution for any of those issues, it would be
reasonable to include an informative reference to RFC 6365
in that definition for additional information.

Sure.

"Alias": See the discussion of "subdomain" under "label" above.
Partially as a result, I don't know what this definition means.
I also suggest that ordinary and common usage of the term
"alias" means that the (fully-qualified) owner of a DNAME
record is an alias as well, and that "subdomains of the owner"
may additionally be aliases.

This is a definition quoted from a standards-track RFC. If it is wrong or misleading, it would be good fodder for the -bis of this document or as an update to that RFC.


"Canonical name": the first sentence of this definition is
circular and "canonical name" is not actually defined.

And yet that is the definition directly from STD 13.

I've
heard uses of "canonical name" to refer to fully-qualified
targets of fully-qualified owners of DNAMEs as well.  If the WG
doesn't like that usage, it would be appropriate to say
something.  I've never liked it, and RFC 5321 deliberately
avoids it, but I've also heard "canonical name" used to describe
the domain that appears in the RDATA of MX RRs. "Canonical
form", and sometimes "canonical name" have also been popular in
some communities to refer to the A-label or U-label form of an
IDN label (sic) after alias/mapping processing via RFC 5895,
UTR 46, or other specifications.  It may be that we should start
moving away from the term in the context defined in the I-D and
start using "alias target" or something to that effect.

"Public suffix": the discussion of the "UK." (and, by the way,
"US.") zone would be clearer and the source of controversy more
obvious if the I-D explicitly pointed out that policy changes
have resulted in _both_, e.g., "ac.uk." and "uk." (and
"NH.US." and "US.") meeting the criteria for public
suffixes at the same time.

Because of that confusion, we have changed the example from ".uk" to ".au" for the next draft.



_Section 3_

Nit about apparently-missing definite articles: in the first
sentence, s/is first 12/is the first 12/ (or rewrite to use a
word like "comprise").  Similarly, the first sentence of the
next paragraph.  See also "...fields in the RDATA an SOA
resource..." later in the document, where the omission actually
makes the sentence hard to understand.

Good catches. Will fix.


"Referrals": "zone cut" is used here in a way that is critical
to the explanation but defined only in Section 6.  A
cross-reference or forward pointer would be in order.

Yep.


_Section 4_

"RRset": This definition is consistent with the usage in
DNSSEC, i.e., that resource records with the same owner and
class but different types are not part of the same RRset.  That
is fine, but I've seen it used to encompass all of the records
that might be returned in response to a query for a particular
owner node with QTYPE=ANY, so, consistent with other sections
of the I-D, that other case should probably be mentioned (and,
obviously, deprecated).

Can you point to an RFC or draft (even an antique one) that uses it in that sense? If not, we should probably leave it out, given that it is clearly wrong. If so, we should add it in the "deprecated usage" category.
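To make the distinction concrete, here is a minimal sketch of grouping records into RRsets keyed by owner, class, and type, which is the sense the draft uses. The tuple layout and the name `group_rrsets` are assumptions made for illustration, and the addresses are documentation-range examples.

```python
from collections import defaultdict


def group_rrsets(records):
    """Group (owner, class, type, rdata) tuples into RRsets.

    An RRset here is every record sharing the same owner, class, and type.
    """
    rrsets = defaultdict(list)
    for owner, rrclass, rrtype, rdata in records:
        key = (owner.lower(), rrclass.upper(), rrtype.upper())
        rrsets[key].append(rdata)
    return dict(rrsets)


records = [
    ("www.example.com.", "IN", "A", "192.0.2.1"),
    ("www.example.com.", "IN", "A", "192.0.2.2"),
    ("www.example.com.", "IN", "AAAA", "2001:db8::1"),
]
rrsets = group_rrsets(records)
print(len(rrsets))
```

The two A records land in one RRset and the AAAA record in another, even though all three share an owner; the looser "everything at this owner" usage would lump them together.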

Is there an approved term for the
collection of resource records with the same owner and, if so,
where is it in this document?  This definition would be much
stronger if it could say "This is an RRset; those other
categories of things that are its supersets are called X and
Y."

None that I know of, but others should chime in here if they do. If it exists, it would be a good term to include.

"Owner" and "SOA Field Names": One reason why "Owner name" (see
"Owner" paragraph) is often used is that the RNAME has been
referred to as "Owner" in dozens of documents and probably
uncountable numbers of  comments in zone files.  "Responsible
party" is also used, e.g., in RFC 1034.  It might be useful to
explicitly note that the I-D prefers the names given to pieces
of field formats in RFC 1035, just as it elsewhere appears to
prefer names used in association with query responses rather
than those that describe the fields themselves (see comments
early in this review about status and audience).

Yep, good addition.

"TTL": Errata, even "IESG approved" errata, don't change
standards-track documents at least unless the errors are
glaringly evident typographical errors (but still confusing
enough to justify errata).  Section 8 of RFC 2181 is extremely
clear about this and does update 1035, so it would be more
appropriate to replace "it is fixed in an erratum" with
something like "this is clarified in RFC 2181" or, "consistent
with an erratum, this is clarified in RFC 2181".  Given that
change and for editorial reasons, it would be better to
consolidate the two consecutive parenthetical remarks or to
incorporate the second one into the text.  See also comments
above about the tutorial, rather than terminology, nature of
much of this subsection, especially the next-to-last paragraph
("The reason that...").

Sure. (Without agreeing with you about errata never being able to change standards-track documents; that discussion is for the IESG / IAB / RFC Editor, not us.)

"Class independent": While this definition seems ok to me, at
least given my limited knowledge, recent discussions on the IETF
list and elsewhere have been very negative about the actual
usability of CLASSes, at least on the public Internet. There
seem to be multiple issues, of which one example is associated with
what a zone (or at least a master file) that contains records
from more than one CLASS means and how it is accessed and
another is associated with CLASS independence - a rather
sweeping claim for any RRTYPE that is not associated with the
structure of the DNS itself because it seriously constrains
later behavior and decisions.  Given those issues and especially
given some of the tutorial material elsewhere in this document,
it seems strange and possibly inappropriate to define Class
independence this way and to do so without any further
explanation.

Why? The recent discussions on the IETF list would be less confusing if people read the definition in this draft.

See the discussion of "Priming" and other terms
below for additional examples of the dimensions of this rat
hole. Recommendation: unless the authors and WG have the energy
to discuss CLASSes in detail, drop this definition and put a
sentence or two into the introductory material that indicates
that CLASS use and terminology raise additional issues and are
deferred to other documents.  That gets them out of the
immediate critical path and, should the WG decide to deprecate
them, saves a lot of energy getting the terminology right.

The fact that some people don't notice that some RRTYPEs are not class independent is not a reason for us to leave the term undefined.



_Section 5_

"Full resolver": if the intent is to deprecate the use of this
term because no one knows what it means and other, adequate,
terminology exists, that should be explicit.  Otherwise, this
section should either define what the term means or drop it:
the history of the term is, by itself, really not helpful or
consistent with the apparent scope of the document or "BCP"
assertions.

Disagree. It appears in an STD document without definition, and there was no later agreement on a definition. That fact is worth bringing up so no one thinks that they know what the STD means.

"Priming": Does the I-D want/need to mention CLASS here?  If a
resolver "performs queries for a name, type, and class" (see
"Resolver" definition), then this description of priming,
especially its relationship to _the_ (emphasis added) DNS root
zone, is not adequate.  Also see note on "Root zone" below.

The term "priming" applies to any class, so the current definition is fine as-is. There really isn't a good reason to bring up "but there might be other classes" in every definition that commonly is talking about IN.


"Secondary server" and "Primary server": Consistent with the
general documentation approach in this I-D, it may be worth
mentioning that "primary", "secondary", and "master" all appear
in RFC 1034.  For better or worse, "slave" does not.

Sure.

I was surprised to not see "lame delegation" or "split horizon"
in this section, the latter possibly as part of "View".

No one in the WG asked for them. If we are going to add them now, we would need to get WG consensus on the wording, which could be problematic in both cases.

_Section 6_

"Zone": See the comments about CLASS above.  As a terminology
document, this should be clear about whether "Zone with records
associated with different CLASSes" is even meaningful and, if
so, what it means.  Note that the old issue of the meaning of
QCLASS=ANY interacts with that question.  A different way to
state the issue is as a question: If a master file for a zone (a
term used, with small variations, several times in RFC 1034 but
not mentioned in this I-D) contains records that include several
different CLASSes, are those records all part of the same zone
or do they define different zones (noting that different
comments in RFC 1034/1035 can be read to give different
answers)?  If the I-D is going to touch CLASSes, I don't see how
to avoid these issues.

I do. You are asking for definitions for things that have not been defined before. If the RFCs are not clear on whether or not a master file can have X, it is better to update those RFCs, not make up a new term with a definition that is clearer.

"Child" and "Parent": With all respect to RFC 7344, the
sentence quoted under "Child" is unlikely to be comprehensible
to anyone who doesn't already know what the term means.  The
one in "Parent" isn't much better and is almost circular.  Note
that "parent zone" and "children zones" (sic) are described and
used in RFC 1034 and the usage there seems more clear than
these definitions.  Also note that 1034 uses "subzone" which is
also used in this document but, like, "subdomain" is not
defined in the I-D.  Because, as hinted above, the question of
whether "subdomain" is actually a synonym of "subzone" can be
important, defining the terms and making any important
distinctions also seems important.  Also see "Empty
non-terminals" below.

The WG discussed these definitions and came to this. If you have specific better definitions, by all means suggest them and we can see if the WG likes them better.


"Origin" and elsewhere, particularly "Apex": The document needs
to clarify that terms like "top" and "apex" are related to the
hierarchical structure of the DNS, not other ways in which DNS
information can be organized (e.g., master files or DNSSEC
canonicalization).  If the document is to be used normatively,
i.e., describes Best Current Practice, it isn't acceptable to
say 'These days, this sense of "origin" and "apex" (defined
below) are often used interchangeably' without making a
recommendation as to whether that is a good idea.

We agreed above that this doc is no longer BCP. When we do the -bis as BCP, it would be good to hear your guidelines for what a BCP definitions document must say for definitions like this. [tag1]


"Apex": If "top" is really a synonym, that should be said here.
If not, "top" should be deprecated or identified as historical,
informal, and/or non-preferred terminology.  Note that the
definition of "Authoritative data" uses "top node" (from
1034).

"Top" is not a synonym for "apex", but "top node of the zone" is. That's what the draft says.

"Glue": The last sentence of this subsection strongly implies
that the terminology introduced in RFC 2181 is not preferred
for general use.  If that is the intent, please be explicit
about it; otherwise this document specifies two inconsistent
definitions for the same term.  I believe the latter is
inconsistent with publication as a BCP.

See [tag1] above.


"In-bailiwick": I have to admit this term (and its opposite)
are new to me.  I haven't figured out from the definition why
adding it adds clarity rather than more confusion.  See the
comments above about two, inconsistent, definitions and
publication as a BCP, unless the "best" current practice is
inconsistent usage.

See [tag1] above.

"Authoritative data": If this is as ambiguous as the definition
/ statement suggests, is the Best Current Practice to not use
the term?  If so, say so.  If not, say why and where its use
should be considered acceptable.

See [tag1] above.

"Empty non-terminals": this definition strongly suggests that
"subdomain" is a property of owner name structure and not of
zone relationships.  If that is the intent, it is _really_
important to define and clarify "subdomain".

Yep; see the proposed definition above.

"Occluded name": I don't think "in a limbo" is helpful as part
of this definition.  The nodes are just hidden and inaccessible
for resolution (i.e., inaccessible except as part of zone
transfers or to the authoritative servers themselves) until and
unless the occluding RRs are removed.  Perhaps "in purgatory" (a
state that might be easier to escape if enough virtuous acts
occur with respect to the zone), but I really can't recommend
going there.

We would prefer to quote standards track documents directly here.

"Fast flux DNS": Is "domain is bound in DNS" really intended
here or was that a typographical error for "... is found...".

"found". Will fix with a note that the quoted text has a typo.

If
it was intentional, as I realize it was, it is confusing and a
definition of "bound in DNS" is required.  Hmm. I just read the
relevant section of RFC 6561 and now know what was intended.
This is still a horrible sentence that I'm astonished got by the
RFC Editor.  Perhaps it would be a lot more clear if the I-D
said something like "This occurs when a domain is bound to
multiple address records (and hence multiple IP addresses), each
of which has a very short Time-to-Live (TTL) value associated
with it".  It could, however, be simplified even further, and
made more consistent with the terminology used elsewhere in this
I-D, if it talked about owners and RRsets, noting that "one
RRset, one TTL value" rule implies that the TTLs cannot be
different very short values as the sentence of 6561 implies.
IMO, the I-D doesn't need to quote 6561 when it is clearly
confusing (and as the last sentence indicates, fairly clearly
too limiting); it should be entirely reasonable to paraphrase it
in a way that is obviously consistent but more clear.

        (That sentence of 6561 probably deserves an erratum
        suggesting a clarified form.  I hope someone will file
        it.)

I'll do so.

_Section 7_

"Registry": This definition treats the Registry as both the
operational entity and the related policy entity.  As one goes
further down the tree, that is often the case, but, nearer the
apex of the DNS tree, the policy entity ("whoever decides what
data goes into a zone") may be very separate from the entity who
is often described as the "Registry operator" (a term that
probably belongs in this section or may, at the risk of more
confusion, be a synonym for "DNS Operator" (see below)).  There
may also be policy entities separate from the Registry who may
impose significant constraints on the names that can be placed
in the zone (including, btw, the IETF, especially if we continue
on the Special Names path).  One can avoid making those
distinctions by indicating that the definitions in the document
are written from a DNS point of view, but then a definition of
"Registrar" (or EPP, etc.) are probably not needed.

Agree on adding the "from a DNS point of view" for "registry". Disagree about "registrar" not being needed: it is a commonly-used term.

Note that
the document's definition of "DNS operator" interacts with this
distinction and reinforces the need for it.  It could even say
that, for some zones, the registry function actually consists of
a DNS operator and one or more relevant policy authorities who
decide what names the DNS operator may add, maintain, etc. See
also below.

Good addition.


"WHOIS": The I-D might take the description a step further and
explicitly say that the term "Whois data" or "Whois database"
is used as a synonym for "registry database" or "registrant
database" (regardless of whether accessed via the Whois
protocol or not), but that usage should be discouraged because
it causes confusion now and will cause more if, as the IETF
hopes, Whois is eventually replaced for Registry Data and
similar applications.

Good point.

Also note that, while the original
application of Whois has largely fallen out of use, the
protocol is commonly used to access address registry databases,
databases that are not usually considered DNS databases (the
contents of the reverse-mapping zones subsidiary to ARPA.
notwithstanding).

Yep.

"DNS Operator": This definition is a little convoluted.  A
Registry Operator is almost certainly associated with a
Registry and hence operates (or contracts out the operation of)
servers that are authoritative for at least one zone.  But a
DNS Operator may be operating recursive servers and not operate
any servers that are authoritative for anything.  I think the
readers should understand that before getting to the end of
this definition and don't believe they will, especially because
"registry operator" is not defined and this definition starts
out by talking about authoritative servers.

This was discussed in the WG (with me agreeing with your desire about recursive operators), but the rough consensus was with what is here now.

_Sections 8 and 9_

I hope that the number of issues identified in this review will
stimulate someone much more expert on DNSSEC than I am to
carefully review these sections; I have not attempted to do so.
However, I note that, in line with some of the comments above,
this section appears to contain several normative, updating,
clarifications of the various specifications which may not be
appropriate for a BCP terminology document.

See [tag1] above. Also, discussions about DNSSEC are as prone to definition creep as discussions about DNS.

Section 9 presents two sets of definitions that it says "differ
a bit", but makes no recommendations.  This would be entirely
reasonable in an Informational document whose purpose was to
warn readers that either definition might be in use and they
needed to exercise care.  It would be equally appropriate as an
informational explanation intended to lay the foundation for a
WG to clean up the mess by identifying one definition from each
pair as authoritative.  It is less reasonable in a BCP, at least
without a careful explanation about how inconsistent definitions
can both be "best practice".  In any of those cases, it would be
more useful if the definitions were formatted to allow the
reader to efficiently discover the differences: side by side,
one term and both definitions before the next term, or something
else that does not put the two sets of definitions on separate
pages (or worse as we move to alternate RFC presentation forms).

See [tag1] above.

Despite my sense that this draft still needs a lot of work, a
serious effort to get DNS terminology clarified and under
control is long overdue and I thank the authors and WG for
taking it on.

When Andrew and I started this, I hilariously, naïvely said that this would be easier than the i18n terminology document.

--Paul Hoffman
