Re: Last Call: An IETF URN Sub-namespace for Registered Protocol Parameters to BCP

At 10:57 AM 7/3/02 -0400, Keith Moore wrote:

> Spurred by XML and related technologies (which I assert are far more than
> mere "fashion") we are seeing URIs used for a wide range of purposes which
> are not constrained by a requirement for dereferencing.   The use of URIs
> for identifying arbitrary things is now a fact of life, and in some
> technical domains is proving to be extremely useful.  You claim "harm",
> but I recognize no such harm.

Clarification: I claim "harm" for the proposed use of *URNs* because
URNs were designed to be long-term stable names for (at least potentially)
network-accessible resources, whereas the proposal is to use them as a
way of generating globally unique strings like UUIDs or OIDs.


I still don't see the "harm" here.

Another way to look at this might be: they all have potentially network-retrievable representations, but not all uses depend on being able to perform the retrieval.

> Having different syntactic contexts in which names are used will inevitably
> lead to different syntactic name forms.  I submit that the real challenge
> here is not to prevent the use of varying syntax, but to lock the various
> syntactic forms to a common semantic definition

Oddly enough, having different syntactic contexts also tends to cause
differences in semantic definition.  In one syntactic context order
of elements can be significant whereas it's not in the other.  One syntactic
context is designed to allow individual components to be accessed independently
of the others while another expects the entire resource description to
be available to the consumer.  One makes it easy to group related
items; another doesn't have a way of representing relationships between
items.  The semantic definitions tend to be influenced by these factors.

I'm all for reuse of data models where it makes sense, but if the goal
is really to "lock the various syntactic forms to a common semantic
definition" (presumably one which is compatible with XML) then I take
strong issue with that, as the XML model is quite dysfunctional for
many purposes.  (as are the others, it's just that XML is the current
bandwagon)

I'm puzzled -- you appear to be arguing my point. Yes, different syntactic frameworks will (in isolation) tend to yield differing semantics. Yes, different syntactic frameworks are better suited for different purposes. But it seems to me that referring different uses to the same original definition would help to inhibit that -- and if factors like ordering or grouping are significant, then the definition will (hopefully) capture that and place constraints on the syntactic contexts for re-use.

> -- in this case, providing
> a way to create syntactic URI forms that can be bound to protocol semantics
> in a way that inhibits semantic drift between the different forms.

But such drift is almost inevitable.  You can't recast some existing
data structure in XML and use it widely and expect the meanings of the
protocol elements to stay the same.  And in essentially every example I've
seen of an attempt to do this, the meanings of the protocol elements are
changed subtly from the very beginning, usually by trying to use XML
structure to represent relationships that aren't explicit in the original
data model.  More generally, an XML representation of a data model will get
used differently than the original representation, and the semantics of the
individual protocol elements will almost certainly drift as a result.

(Actually this happens even when you use the same representation.
RFC 822 headers had subtly different meanings on BITNET than on
the Internet, because there were enough differences in the two user
communities and the mail reading programs used by those communities.
Similarly, casting a data model into XML means that a different set
of tools will be used to access/manipulate that data - indeed that
is the entire point of doing so - but this *will* cause semantic drift
in the data model between the two environments)

Using URIs for the names of the data elements won't stop that kind of drift.

But not trying to re-use existing definitions seems to be a recipe for Balkanization.

Maybe it won't work for all applications, but I think there are a substantial number of cases where re-use of existing definitions is a reasonable and desirable goal. I have two ongoing projects for which I would really like to see this URN namespace proposal approved:


(a) Distributed storage and analysis of email and other message metadata.

(b) Common feature descriptions for IETF/W3C content negotiation efforts.

> One of the motivating factors in this work (for me, at least, and I think
> for others) has been to draw together some of the divergent strands of
> thinking that are taking place in the IETF and W3C.  W3C are fundamentally
> set on a course of using URIs as a generic space of identifiers. IETF have
> a number of well-established protocols that use registries to allocate
> names. Neither of these is going to change in the foreseeable future. So
> do we accept a Balkanization of Internet standards efforts, or do we try to
> draw them together?

Some things don't mix very well, even if they are quite useful individually.
The traditional examples are oil and water.

That seems like a non-argument for opposing this proposal. Even emulsions have their uses.

> A particular case in point is content negotiation.  The IETF have prepared
> a specification for describing media features that uses a traditional form
> of IANA registry to bind names to features.  In parallel with this, W3C
> have prepared a specification which has some similar goals, but which uses
> URIs to represent media features, and relies on the normal URI allocation
> framework to ensure the minting of unique names as and when needed.  (I
> have some reservations about this, but that can't change what is actually
> happening.)

But neither do we have to endorse it just so they will use our stuff.
Especially when their using our stuff dilutes the utility of our stuff
by not requiring widespread agreement on the media features used.

Come again? That seems to me to be entirely a non sequitur. How can other people using our stuff dilute its utility? It is precisely in the nature of this proposal that using these URIs would be assenting to the IETF definition of their meaning.

> This URN namespace proposal will provide a way to incorporate
> the IETF feature registry directly into the W3C work, in a way which is
> traceable through IETF specifications.   Without this, I predict that the
> parties who are looking to use the W3C work (notably, mobile phone
> companies) will simply go away and invent their own set of media features,
> without any kind of clear relationship to the IETF features.

The W3C approach is encouraging them to do this anyway, by having
all media features be URIs that anyone can create/assign without any
agreement from anyone else.

So we should roll over and play dead, and pretend that interoperability doesn't matter?

Actually, that's a misrepresentation of the W3C position, which is that vocabularies gain currency through use -- the more people who use them, the more useful, and the more widely used, they become. (Sure, that's a generalization.) This approach seems to be very much in the spirit of the IETF I've been participating in over the past few years -- it's not our role to decide what will and will not work, but to provide an environment in which new technologies can evolve and find currency, and promote interoperability wherever we can.

> In summary: URIs *will* be used to identify protocol parameters. The IETF
> cannot prevent that.  What the IETF can do by supporting a particular form
> of such use is to try and ensure that such use remains bound by a clear,
> authoritative chain of specifications to the IETF specification of what
> such parameters mean. The harm that comes from not doing this, in my view,
> is that we end up with a multiplicity of URIs that mean nearly, but not
> quite, the same thing as an IETF protocol parameter.  That outcome, I
> submit, cannot be good for longer term interoperability between IETF and
> other organizations' specifications.

The likely consequence of what is being proposed is for the URIs that we
define to mean nearly, but not quite, the same thing as an IETF protocol
parameter - but we have to try to pretend that they mean the same thing.
And it will degrade interoperability.

Er, no: we *define* them to mean the *same* thing. If implementations play fast and loose with the defined meaning, that's nothing new.

> >d) embed NO visible structure in the URNs - just assign each
> >    parameter value a sequence number.  people who want to use
> >    those URNs in XML or whatever would need to look them up at IANA's
> >    web site.
>
> I disagree. This requirement actively works against one of the motivations
> for using URIs in application data formats;  that there be a scalable
> framework for different organizations and persons to mint their own
> identifiers.

The fact that people want to use URIs in this way does not mean that it's
appropriate to use URNs in this way.  If people want to mint their own URNs,
then they have to follow the rules for URNs.  Those rules *do not*
permit arbitrary organizations and persons to mint their own identifiers
without explicit delegation from a URN namespace, for very good reasons
which are consistent with URNs' purposes.

Ah, that's a misunderstanding. One of the reasons I favour using URNs in this way (and contrary to the often touted W3C position) is that it provides a form of URI that is clearly *not* minted by any Tom, Dick or Harry working in isolation. The definition of any urn:ietf:... URI is subject to the IETF consensus process, so can be expected to have undergone some level of community review. My point here was that, because they conform to a common URI syntactic framework, they can be used interchangeably in some contexts with experimental and private-use identifiers. (In a sense, this might be viewed as a converse of the X-header approach: arbitrary URIs may be treated as experimental or private use, unless they are allocated within a URI namespace controlled by a recognized authority in the area of their application.)
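
To make the "converse of the X-header approach" concrete, here is a minimal sketch in Python. It assumes only the generic URN layout "urn:" <NID> ":" <NSS>, with the "urn" prefix and the namespace identifier compared case-insensitively. The function names, the two classification labels, and the sample identifiers are hypothetical and are not taken from the proposal.

```python
from typing import Optional, Tuple


def parse_urn(uri: str) -> Optional[Tuple[str, str]]:
    """Return (nid, nss) for a URN, or None if the string is not a URN.

    Assumes the generic layout "urn:" <NID> ":" <NSS>; the "urn" prefix
    and the NID are treated as case-insensitive.
    """
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() != "urn":
        return None
    nid, sep, nss = rest.partition(":")
    if not sep or not nid or not nss:
        return None
    return nid.lower(), nss


def classify(uri: str) -> str:
    """Hypothetical policy: anything outside urn:ietf: is experimental/private."""
    parsed = parse_urn(uri)
    if parsed is not None and parsed[0] == "ietf":
        return "ietf-assigned"
    return "experimental-or-private"


# Sample identifiers below are made up for illustration only.
print(classify("urn:ietf:params:example-class:example-id"))
print(classify("http://example.org/my-private-feature"))
```

The application never needs to look inside the name-specific part; it only checks which namespace the identifier was allocated in.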

The very temptation to treat URNs as if they were as malleable as other
URIs is part of what makes this proposal dangerous.  Since I think that
URNs *will* be widely misused if they are used for protocol elements,
I'd far rather have IANA assign ordinary URIs for this - then we will
still get semantic drift but at least it won't dilute the value of URNs.

In what sense are URNs not ordinary URIs? They have particular requirements for persistence that are not shared by all URI schemes. And there is a requirement for "location independence", but what that means isn't always clear.

But mainly, the goal of this proposal is emphatically *not* to make URNs "malleable" (in the sense of, say, http: URIs which can be reassigned at will by domain owners), but to allow the introduction of some URIs that can clearly be seen to be stable and persistent.

I'd be happy for IANA to assign "ordinary URIs", assuming that by this you mean something like http://www.ietf.org/..., as long as there was a clear organizational commitment that such a URI, once allocated, would never be reallocated for any other purpose. It's the particular properties of URNs that are desired here, not any sense that they are somehow a "special" form of URIs.

> To use an identifier, one must:
>
> (i) have a framework for assigning identifier values, in such a way that it
> is possible by some means for a human to locate its defining
> specification.  I can't see how to do this without exploiting a visible
> syntactic structure in the name.

ISBNs do not have a visible syntactic structure, at least, not an
obvious one.  But they're quite frequently used to look up book information.

I understand that ISBNs aren't persistent -- they get reused. How many books are "in print" at any time? I don't think this is quite Internet scale.

Anyway, ISBNs *do* have an internal syntactic structure. From http://www.isbn.org/standards/home/isbn/us/isbnqa.asp#Q4:


[[
Does the ISBN have any meaning imbedded in the numbers?

The four parts of an ISBN are as follows:

Group or country identifier which identifies a national or geographic grouping of publishers;
Publisher identifier which identifies a particular publisher within a group;
Title identifier which identifies a particular title or edition of a title;
Check digit is the single digit at the end of the ISBN which validates the ISBN.

]]
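
As a rough illustration of that four-part structure (not part of the text quoted above), the sketch below splits a hyphenated ten-digit ISBN into its parts and verifies the check digit. It assumes the ten-digit form, where the digits are weighted 10 down to 1, the weighted sum must be divisible by 11, and a final "X" stands for 10. The function names are my own.

```python
def split_isbn10(isbn: str) -> dict:
    """Split a hyphenated ISBN-10 into group, publisher, title, check digit."""
    parts = isbn.strip().split("-")
    if len(parts) != 4:
        raise ValueError("expected four hyphen-separated parts")
    group, publisher, title, check = parts
    return {"group": group, "publisher": publisher,
            "title": title, "check": check}


def isbn10_is_valid(isbn: str) -> bool:
    """Verify the ISBN-10 check digit (weights 10..1, sum divisible by 11)."""
    chars = isbn.replace("-", "").upper()
    if len(chars) != 10:
        return False
    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10          # "X" is only allowed as the check digit
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


print(split_isbn10("0-306-40615-2"))
print(isbn10_is_valid("0-306-40615-2"))
```

Note that the part boundaries are visible here only because of the hyphens; the group and publisher fields vary in length, so an unhyphenated ISBN needs the agency's range tables to be decomposed.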

> (ii) have a framework for actually using the identifier in an
> application: in this case, I agree that the identifier should generally be
> treated as opaque.
>
> Also, I think (d) contradicts your goal (a):  I cannot conceive any
> scalable resolution mechanism that does not in some sense depend on
> syntactic decomposition of the name.

You should really read up on the CNRI handle system then.  There are a lot
of things I don't like about it but it really was designed to have exactly
this property.

Based on a December 2001 article (http://www.dlib.org/dlib/december01/blanchi/12blanchi.html), it seems to me that Handles too depend on some syntactic structure to partition the search space -- based on dynamic content types and metadata schema. (I should be clear that I'm using the term syntactic structure in an abstract sense, a la McCarthy (http://www-formal.stanford.edu/jmc/towards/node12.html#SECTION000120000000000000000), rather than in the sense of a specific arrangement of characters.)


Ah yes, and according to the internet draft on handles:
  http://www.ietf.org/internet-drafts/draft-sun-handle-system-09.txt
there *is* a clear syntactic structure:
[[
 2. Handle Namespace

    Every handle consists of two parts: its naming authority, otherwise
    known as its prefix, and a unique local name under the naming
    authority, otherwise known as its suffix. The naming authority and
    local name are separated by the ASCII character "/". A handle may
    thus be defined as:

      <Handle> ::= <Handle Naming Authority> "/" <Handle Local Name>
 ]]
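
The grammar quoted above can be sketched in a few lines of Python. This is only an illustration under one assumption that the excerpt does not state: that the split happens at the *first* "/", so that the local name may itself contain further "/" characters. The `Handle` type, the `parse_handle` name, and the sample handle are hypothetical.

```python
from typing import NamedTuple


class Handle(NamedTuple):
    naming_authority: str   # the prefix
    local_name: str         # the suffix, unique under that authority


def parse_handle(handle: str) -> Handle:
    """Split a handle at the first "/" (assumed) into authority and local name."""
    authority, sep, local = handle.partition("/")
    if not sep or not authority or not local:
        raise ValueError("handle must look like <authority>/<local name>")
    return Handle(authority, local)


# Made-up handle, for illustration only.
h = parse_handle("1234.5/example/name")
print(h.naming_authority)
print(h.local_name)
```

A resolver can use the naming authority to pick which service to ask, and pass the local name through without interpreting it.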

How each naming authority deals with scaling within its domain of authority doesn't seem to be specified.

(Actually, after I wrote the above, I realized that I misspoke slightly, because some systems work in constrained contexts -- I was referring to systems operating at global Internet scale without further contextualization. But I think the general idea still holds here -- if you want to reliably and quickly dereference an identifier with Internet scope, it cannot be completely opaque.)


#g


-------------------
Graham Klyne
<GK(_at_)NineByNine(_dot_)org>

Re: Last Call: An IETF URN Sub-namespace for Registered Protocol Parameters to BCP