Re: [art] Artart last call review of draft-ietf-core-links-json-07

2017-04-25 16:27:34
RFC 6690 says:

 In order to convert an HTTP Link Header field to this link format,
 first the "Link:" HTTP header is removed, any linear whitespace (LWS)
 is removed, the header value is converted to UTF-8, and any
 percent-encodings are decoded.

Well, that's broken.

OK, let me start typing that errata report then.

coap://example.com?stupid%3Dkey=4711

is not distinguishable from

coap://example.com?stupid=key=4711

(The typical reaction of an implementer is “then don’t do that!” [1,2].)
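
To make the collision concrete, here is a minimal Python sketch. The helper name decode_whole_uri is hypothetical, and urllib.parse.unquote only stands in for the "any percent-encodings are decoded" step; this is an illustration of the effect, not the RFC 6690 procedure itself.

```python
from urllib.parse import unquote

def decode_whole_uri(uri: str) -> str:
    """Hypothetical helper: percent-decode the entire string before any parsing."""
    return unquote(uri)

encoded = "coap://example.com?stupid%3Dkey=4711"
literal = "coap://example.com?stupid=key=4711"

# The two references differ as written ...
assert encoded != literal
# ... but collapse into the same string once everything is decoded up front.
assert decode_whole_uri(encoded) == decode_whole_uri(literal)
```

After decoding, a receiver can no longer tell whether the first "=" was data or a delimiter.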

That isn't a "limitation".

For RFC 6690 users, it pretty much is, because certain URIs don’t work.
Implementers tend to design their URIs so that they do work, probably more
because those designs come naturally to them than because they are fully
aware of the limitation.

It's a bug to decode pct-encoded octets in
a URI before decomposing the reference into its parts.  

Well, percent-encoding plays two roles in RFC 3986: hiding characters 
within syntactic elements from their delimiter roles, and encoding non-ASCII 
(and C0 etc.) characters.
The passage I cited from RFC 6690 nicely got rid of the latter, and broke the 
former (*).
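
For contrast, a sketch of the order that keeps the delimiter-hiding role intact: split first, decode afterwards. The function name query_pairs is hypothetical, and urllib.parse.parse_qsl applies form-style conventions ("&" separators, "+" as space), which is an assumption about the query layout made for this example, not something RFC 3986 or CoAP mandates.

```python
from urllib.parse import urlsplit, parse_qsl

def query_pairs(uri: str) -> list[tuple[str, str]]:
    """Hypothetical helper: split the URI into components first,
    then percent-decode each key and value separately."""
    return parse_qsl(urlsplit(uri).query)

# "%3D" is decoded only after the key/value split, so it stays part of the key.
assert query_pairs("coap://example.com?stupid%3Dkey=4711") == [("stupid=key", "4711")]
# The literal "=" acts as a delimiter, giving a different key and value.
assert query_pairs("coap://example.com?stupid=key=4711") == [("stupid", "key=4711")]
```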

ASCII is already
in UTF-8.  Decoding a pct-encoding doesn't make it "more UTF-8"; it just
means the string is no longer a URI reference.  That's broken.  So utterly
broken that it obviously wasn't reviewed by the right people.

So what should I write into the errata report?

Or more generally speaking, how should we fix RFC 6690, without creating a need 
for constrained nodes to do full URI processing?

Maybe it is sufficient to document the limitation in the errata, for now?

And, more to the point of the subject line, how should we handle this on the 
JSON/CBOR level?

There will definitely be a round-tripping problem with RFC 6690 if the URIs 
collide with the above limitation of RFC 6690.  But that’s OK, because that 
limitation defines the subset that can be round-tripped.

To be more general, not doing any percent-decoding of URIs when creating 
JSON/CBOR from scratch is probably the easy way, but it means that when we want 
to phase out RFC 6690 on the constrained level by replacing it with JSON/CBOR, 
there is additional complexity.  Horribile dictu, but maybe IRIs are the right 
thing to do here.

Grüße, Carsten

(*) It may be worth pointing out that the amount of breakage here is much 
larger than for CoAP itself, which does the percent-decoding only after 
decomposing a URI into what CoAP considers to be its components, so the URI 
parsing works properly — coap://example.com/foo%2fbar has one path segment, 
“foo/bar”.
But the application semantics of hiding application delimiters, which my 
example above breaks, are not supported in CoAP either.
Some people think that URIs should be carried around in that decomposed form 
throughout the constrained space, and I can’t blame them.
I don’t have data on how many URI libraries in active use in the 
non-constrained space get this particular detail right, either.
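
The path-segment case from the footnote can be sketched the same way. Both function names are hypothetical, and this only mimics the ordering (decompose, then decode); it is not CoAP's actual Uri-Path option processing.

```python
from urllib.parse import urlsplit, unquote

def segments_decode_last(uri: str) -> list[str]:
    """Split the path on "/" first, then percent-decode each segment."""
    path = urlsplit(uri).path
    return [unquote(seg) for seg in path.split("/")[1:]]

def segments_decode_first(uri: str) -> list[str]:
    """Percent-decode the whole URI first, then split the path (the lossy order)."""
    path = urlsplit(unquote(uri)).path
    return path.split("/")[1:]

uri = "coap://example.com/foo%2fbar"
assert segments_decode_last(uri) == ["foo/bar"]       # one segment containing "/"
assert segments_decode_first(uri) == ["foo", "bar"]   # the "/" became a delimiter
```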