
Re: Last Call: An IETF URN Sub-namespace for Registered Protocol Parameters to BCP

2002-07-03 19:00:00
Keith,

we could bat these arguments back and forth without making any further progress. I think I have real use-cases for this proposal for which the concerns you raise just don't seem to be a problem. On the other hand, I can recognize a legitimate concern with:

[[
But neither do I want to give official blessing to folks to re-cast
traditional IETF protocols into new syntactic forms.  And a lot of
the interest I've seen in having URI equivalents for IETF protocol
parameter names was from people who wanted to do just that - often
with the explicit intent of producing variant implementations in order
to disrupt the installed base.
]]

I think a more constructive way forward would be to work on an appropriate form of words indicating the concerns and purposes for which these IETF URNs SHOULD NOT be used.

I perceive that your concerns relate mostly to attempts to transplant entire protocol structures from one framework to another, and in those cases I agree that it is difficult to transfer all the semantics faithfully. But this proposal is aimed more at carrying information about protocol elements (the kind of detail that is described in a registry entry) into a different application environment.

Examples:

IETF CONNEG and W3C CC/PP have quite different overall structure and semantics, but they can still usefully employ common media feature definitions. This is clearly not an attempt to recast CONNEG into XML, but it seems highly desirable not to end up with two almost-parallel sets of media features when just one set would do.

My other example, which may seem closer to your concern but which really is not, is to do with storing and processing message metadata. Here, I want to be able to use URIs to identify the various header fields that appear in a message. Again, the goal is not to re-cast an IETF protocol in a different form, but to take information from an IETF protocol for use in a different application.

#g
--

PS: as far as I can tell, according to http://www.ietf.org/internet-drafts/draft-sun-handle-system-def-05.txt (for which I saw the announcement after sending my previous message), the handle system does operate a federation of naming authorities:

        <Handle>          = <NamingAuthority> "/" <LocalName>

        <NamingAuthority> = *(<NamingAuthority>  ".") <NAsegment>

        <NAsegment>       = 1*(%x00-2D  /  %x30-3F / %x41-FF )
                          ; any octets that map to UTF-8 encoded
                          ; Unicode 2.0 characters except
                          ; octets '0x2E' and '0x2F' (which
                          ; correspond to the ASCII characters '.',
                          ; and '/').

        <LocalName>       = *(%x00-FF)
                          ; any octets that map to UTF-8 encoded
                          ; Unicode 2.0 characters

and:

    Naming authorities are defined in a hierarchical fashion resembling
    a tree structure. Each node and leaf of the tree are given a label
    that corresponds to a naming authority segment (<NAsegment>). The
    parent node represents the parent naming authority. Naming
    authorities are constructed left to right, concatenating the labels
    from the root of the tree to the node that represents the naming
    authority. Each label (or its <NAsegment>) is separated by the
    character '.' (octet 0x2E). For example, the naming authority for
    the Digital Object Identifier (DOI) project is  "10". It is a root-
    level naming authority as it has no parent naming authority for
    itself. It can, however, have many child naming authorities, e.g.,
    "10.1045" which is used as a naming authority for D-Lib Magazine.

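As an illustrative aside, the split implied by this grammar is easy to mechanize. Here is a minimal sketch (my own, not from the draft; the local name is made up for illustration):

```python
def parse_handle(handle: str) -> tuple[list[str], str]:
    """Split a handle into its naming-authority labels and local name.

    Per the grammar above, the first '/' separates <NamingAuthority>
    from <LocalName>, and '.' separates <NAsegment> labels, with the
    root-level authority leftmost.
    """
    authority, sep, local_name = handle.partition("/")
    if not sep or not authority:
        raise ValueError("expected <NamingAuthority> '/' <LocalName>")
    labels = authority.split(".")
    if any(not label for label in labels):
        raise ValueError("empty <NAsegment> label")
    return labels, local_name

# "10" is the DOI root authority and "10.1045" its child for D-Lib
# Magazine, per the draft; the local name here is hypothetical:
labels, local = parse_handle("10.1045/some-article")
# labels is ["10", "1045"]; local is "some-article"
```
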



At 05:02 PM 7/3/02 -0400, Keith Moore wrote:
> > > Spurred by XML and related technologies (which I assert are far more than
> > > mere "fashion") we are seeing URIs used for a wide range of purposes which
> > > are not constrained by a requirement for dereferencing. The use of URIs
> > > for identifying arbitrary things is now a fact of life, and in some
> > > technical domains is proving to be extremely useful. You claim "harm",
> > > but I recognize no such harm.
> >
> >Clarification: I claim "harm" for the proposed use of *URNs* because
> >URNs were designed to be long-term stable names for (at least potentially)
> >network-accessible resources, whereas the proposal is to use them as a
> >way of generating globally unique strings like UUIDs or OIDs.
>
> I still don't see the "harm" here.

basically, it's trivializing them.  they're overkill for this purpose, and
using URNs for this purpose makes them seem less useful than they really
are.

that, and I think there'll be a very strong demand for them to be human-readable (i.e. to have visible structure) and syntactically derivable from the canonical
name for the protocol elements (for those that have such names).

> >I'm all for reuse of data models where it makes sense, but if the goal
> >is really to "lock the various syntactic forms to a common semantic
> >definition" (presumably one which is compatible with XML) then I take
> >strong issue with that, as the XML model is quite dysfunctional for
> >many purposes.  (as are the others, it's just that XML is the current
> >bandwagon)
>
> I'm puzzled -- you appear to be arguing my point. Yes, different syntactic
> frameworks will (in isolation) tend to yield differing semantics.  Yes,
> different syntactic frameworks are better suited for different
> purposes.  But it seems to me that referring different uses to the same
> original definition would help to inhibit that -- and if factors like
> ordering or grouping are significant, then the definition will (hopefully)
> capture that and place constraints on the syntactic contexts for re-use.

I just don't happen to share your faith in this as a mechanism to inhibit
or discourage semantic drift.  In every example I can think of where one
data model is exported into a different context there has been semantic
drift, even when the same names and official definitions were retained.
(maybe there's less drift this way, maybe not - but it certainly doesn't
inhibit drift)

> >Using URIs for the names of the data elements won't stop that kind of drift.
>
> But not trying to re-use existing definitions seems to be a recipe for
> Balkanization.

I don't know how to avoid Balkanization.  Sometimes it seems better to
let data models fork rather than to try to reconcile various differences -
I'd cite RFC822, usenet, HTTP, and SIP as good examples of things that
we shouldn't pretend have the same protocol elements even though
we recognize that they share a common ancestry.

> Maybe it won't work for all applications, but I think there are a
> substantial number of cases where re-use of existing definitions is a
> reasonable and desirable goal.

I don't claim that re-use of a data model is not potentially useful.
If nothing else, an existing data model can serve as a useful starting
point for a new data model when the requirements or syntactic structures
dictate not using the old one.

But neither do I want to give official blessing to folks to re-cast
traditional IETF protocols into new syntactic forms.  And a lot of
the interest I've seen in having URI equivalents for IETF protocol
parameter names was from people who wanted to do just that - often
with the explicit intent of producing variant implementations in order
to disrupt the installed base.

> >But neither do we have to endorse it just so they will use our stuff.
> >Especially when their using our stuff dilutes the utility of our stuff
> >by not requiring widespread agreement on the media features used.
>
> Come again?  That seems to me to be entirely non-sequitur.  How can other
> people using our stuff dilute its utility?  It is precisely in the nature
> of this proposal that using these URIs would be assenting to the IETF
> definition of their meaning.

no it's not, because of the semantic drift that will occur.

Someone once tried to demonstrate to me that it was perfectly reasonable
to express iCalendar events in XML - but her demonstration used XML's
date representation which didn't have a proper concept of timezones.
Interpretation of dates in iCalendar was dependent on a separate timezone
element, whereas the XML tool wanted to treat those dates as standalone.
So the "obvious" conversion of iCalendar to XML - even though the elements
mapped one-to-one - caused semantic drift and a loss of important
functionality.
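To make that failure mode concrete, here is a small sketch (mine, using Python's zoneinfo as a stand-in for iCalendar's TZID/VTIMEZONE mechanism; the values are illustrative):

```python
from datetime import datetime
from zoneinfo import ZoneInfo  # Python 3.9+

# An iCalendar DTSTART value is a local time; its meaning depends on a
# separate TZID parameter backed by a VTIMEZONE component:
dtstart = datetime.strptime("20020703T190000", "%Y%m%dT%H%M%S")

# Interpreted against its timezone element, as iCalendar requires:
new_york = dtstart.replace(tzinfo=ZoneInfo("America/New_York"))
tokyo = dtstart.replace(tzinfo=ZoneInfo("Asia/Tokyo"))
assert new_york != tokyo   # same text, different instants

# A converter that maps DTSTART one-to-one onto a standalone date value
# and drops the timezone element silently changes the meaning:
standalone = dtstart       # no timezone: information has been lost
assert standalone.tzinfo is None
```
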

> > > This URN namespace proposal will provide a way to incorporate
> > > the IETF feature registry directly into the W3C work, in a way which is
> > > traceable through IETF specifications. Without this, I predict that the
> > > parties who are looking to use the W3C work (notably, mobile phone
> > > companies) will simply go away and invent their own set of media features,
> > > without any kind of clear relationship to the IETF features.
> >
> >The w3c approach is encouraging them to do this anyway, by having
> >all media features be URIs that anyone can create/assign without any
> >agreement from anyone else.
>
> So we should roll over and play dead, and pretend that interoperability
> doesn't matter?

It's not clear that doing things the w3c way helps interoperability.

> Actually, that's a misrepresentation of the W3C position, which is that
> vocabularies gain currency through use -- the more people who use them, the
> more useful, and more widely used they become.

That's true to a point, but it also seems to be the case that controlled
vocabularies that need to have consistent meaning across large groups
need very careful definition and, well, "control".  Natural languages,
by contrast, tend to drift continuously.  Sometimes that's useful, but
perhaps not as useful for computer protocols as for humans that can
intuitively accommodate a certain amount of semantic skew.

> >The likely consequence of what is being proposed is for the URIs that we
> >define to mean nearly, but not quite, the same thing as an IETF protocol
> >parameter - but we have to try to pretend that they mean the same thing.
> >And it will degrade interoperability.
>
> Er, no:  we *define* them to mean the *same* thing.  If implementations
> play fast and loose with the defined meaning, that's nothing new.

At the same time, by explicitly exporting them we are encouraging
semantic drift.

> >The very temptation to treat URNs as if they were as malleable as other
> >URIs is part of what makes this proposal dangerous.  Since I think that
> >URNs *will* be widely misused if they are used for protocol elements,
> >I'd far rather have IANA assign ordinary URIs for this - then we will
> >still get semantic drift but at least it won't dilute the value of URNs.
>
> In what sense are URNs not ordinary URIs?  They have particular
> requirements for persistence that are not shared by all URI schemes.

In order to make URNs persistent you really need to make them opaque
(or mostly so) to humans.   It's really too bad that we even allowed
URN namespace IDs to be human-meaningful, but that's water under the bridge.

> > > (i) have a framework for assigning identifier values, in such a way that it
> > > is possible by some means for a human to locate its defining
> > > specification.  I can't see how to do this without exploiting a visible
> > > syntactic structure in the name.
> >
> >ISBNs do not have a visible syntactic structure, at least, not an
> >obvious one. But they're quite frequently used to look up book information.
>
> I understand that ISBNs aren't persistent -- they get reused.

They're not supposed to be, but it does happen in some countries -
particularly those with less ISBN space allocated to them.
So we have a NAT-like problem for ISBNs ...

> Anyway, ISBN's *do* have an internal syntactic structure.

I didn't say they didn't have one, I just said it was not obvious.

>
> > > (ii) have a framework for actually using the identifier in an
> > > application:  in this case, I agree that the identifier should
> > > generally be treated as opaque.
> > >
> > > Also, I think (d) contradicts your goal (a):  I cannot conceive any
> > > scalable resolution mechanism that does not in some sense depend on
> > > syntactic decomposition of the name.
> >
> >You should really read up on the CNRI handle system then.  There are a lot
> >of things I don't like about it but it really was designed to have exactly
> >this property.
>
> Based on a December 2001 article
> (http://www.dlib.org/dlib/december01/blanchi/12blanchi.html), it seems to
> me that Handles too depend on some syntactic structure to partition the
> search space -- based on dynamic content types and metadata schema.

Handles have evolved a bit since first envisioned - as I understand it the
problem wasn't the inability of the non-partitioned search service to scale
up to the number of queries but rather the difficulties associated with
everybody trusting a centrally maintained flat search service.

Someone from cnri might be able to fill in more detail.

> Ah yes, and according to the internet draft on handles:
>    http://www.ietf.org/internet-drafts/draft-sun-handle-system-09.txt
> there *is* a clear syntactic structure:

Yes, but the searching isn't (didn't used to be) federated according to that
structure.  The scalability of the searching didn't depend on it -
federating actually slowed things down unless you happened to consult the
right server first.  (locality does affect search speed)

> But I think the general idea still holds here -- if you
> want to reliably and quickly dereference an identifier with Internet scope,
> it cannot be completely opaque.)

Hashing is faster than tree searching, especially if the tree is distributed.
You federate the lookup because of trust issues (which are a kind of scaling
issue, but not in terms of bandwidth or cpu cycles) and ease-of-cost-recovery
issues, not to make the lookup more efficient or cheaper.
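A toy sketch of that trade-off (the server names are hypothetical, and round trips are just counted, not performed):

```python
# Flat (hash-style) resolution: one lookup, no name decomposition needed.
flat_index = {"10.1045/some-article": "server-a.example"}

def resolve_flat(handle: str) -> tuple[str, int]:
    return flat_index[handle], 1            # a single (simulated) round trip

# Federated resolution: a server per naming-authority level, so the
# name's visible structure determines how many hops the lookup takes.
authority_tree = {"10": {"_server": "root-na.example",
                         "1045": {"_server": "dlib-na.example"}}}

def resolve_federated(handle: str) -> tuple[str, int]:
    authority, _, _local = handle.partition("/")
    node, hops = authority_tree, 0
    for label in authority.split("."):
        node = node[label]
        hops += 1                           # one (simulated) round trip per level
    return node["_server"], hops

# resolve_flat costs one hop regardless of depth; resolve_federated costs
# one hop per authority segment unless you already know the right server.
```
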

Keith

-------------------
Graham Klyne
<GK(_at_)NineByNine(_dot_)org>