Re: What is the right way to do Web Services discovery?


In message <CAMm+LwgtJuLdL_RKJNSVNGODGj8D25nfj0jkhnBLFS=aaXG+rA(_at_)mail(_dot_)gmail(_dot_)com>, Phillip Hallam-Baker writes:


I am asking here as there seems to be a disagreement in HTTP land and DNS
land.

Here are the constraints as I see them:

0) For any discovery mechanism to be viable, it must work in 100% of
cases. That includes IPv4, IPv6, and either one behind NAT.

1) Attempting to introduce new DNS records is a slow process. For practical
purposes, any discovery mechanism that requires more than SRV + TXT is not
going to be widely used.


Absolute total garbage.

Introducing a new DNS record isn't slow.  It takes a couple of weeks.
Really.  That's how long it takes to allocate a code point.

RFC 1034 compliant recursive servers and resolver libraries should
handle it the moment you start to use it.  RFC 1034 bans compression
pointers in types that are not well known, and records have a length
field so that they can be treated as opaque objects by recursive
servers and resolver libraries.  If your vendor does not ship an
RFC 1034 compliant recursive server or resolver library, file a bug
report and/or move to a platform that is compliant and/or find an
alternate resolver library and use it.  There are plenty of open
source resolver libraries out there.  Some of them are 2+ decades
old now and do this right.

Authoritative servers can serve the new record immediately if they
support unknown record types, a feature that is now over a decade
old (2003).

If you want to be able to use the presentation format there are
authoritative servers that are designed to make it easy to add new
record types.  This was the only step that was slow originally once
the code point was allocated.

If your DNS hosting service doesn't support unknown record types
file a bug report and find one that does or host the DNS service
yourself.
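
A minimal sketch of what "unknown record type" support looks like in
practice, assuming the generic presentation syntax from RFC 3597
(TYPEnnn for the type, "\# <length> <hex>" for the RDATA).  The function
names and the private-use type code 65280 are illustrative choices, not
taken from any particular server:

```python
def to_generic_rdata(rdata: bytes) -> str:
    # Opaque RDATA is written as a backslash-hash marker, the length in
    # octets, and then the octets in hex. Empty RDATA has no hex part.
    if not rdata:
        return "\\# 0"
    return f"\\# {len(rdata)} {rdata.hex()}"


def to_generic_rr(owner: str, ttl: int, type_code: int, rdata: bytes) -> str:
    # Build a zone-file line for a type the server has no mnemonic for.
    return f"{owner} {ttl} IN TYPE{type_code} {to_generic_rdata(rdata)}"


if __name__ == "__main__":
    # 65280 is in the private-use range; the four octets are arbitrary.
    print(to_generic_rr("example.com.", 3600, 65280, bytes([10, 0, 0, 1])))
```

A server that accepts this syntax never needs to understand the payload:
the length field is enough to store the record and hand it back unchanged.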

2) Apps area seems to have settled on a combination of SRV+TXT as the basis
for discovery. But right now the way these are used is left to individual
protocol designers to decide. Which is another way of saying 'we don't have
a standard'.

3) The DNS query architecture as deployed works best if the server can
anticipate the further requests. So a system that uses only SRV+TXT allows
for a lot more optimization than one using a large number of records.

4) There are not enough TCP ports to support all the services one would
want. Further, keeping ports open incurs costs. Pretty much the only
functionality from HTTP that Web Services make use of is the use of the URL
stem to effectively create more ports. A hundred different Web services can
all share port 80.

5) The SRV record does not specify the URL stem though. Which means that
either it has to be specified in some other DNS record (URI or TXT path) or
it has to follow a convention (i.e. .well-known).

6) Sometimes SRV records don't get through and so any robust service has to
have a strategy for dealing with that situation.

7) If we are going to get to a robust defense against traffic analysis, it
has to be possible to secure the initial TLS handshake, i.e. before SNI is
performed. This in turn means that it must be possible to pull information
out of that exchange and into the DNS. Right now we don't know what that
information is but this was not a use case considered by DANE.

8) We are probably going to want to transition Web Services to 'something
like QUIC' in the near future. Web Services really don't need a lot more
than a TCP stream. Most of HTTP just gets in the way. But the multiplexing
features in QUIC could be very useful.




Right now we have different ideas on how this should work in the HTTP space
and DNS space. And this appears to be fine with the two groups as they
don't need to talk to each other. But it really isn't possible to build
real systems unless you offend the purists in at least one camp. I think we
should do better and offend both.

So here is my proposal for discovery of a service with IANA protocol label
'fred'


First the service description records. This is a TXT record setting policy
for all instances of the fred service and a set of SRV service
advertisements:

_fred._tcp.example.com TXT "minv=1.2 maxv=3"
_fred._tcp.example.com SRV 0 100 80 host1.example.com
_fred._tcp.example.com SRV 0 100 80 host2.example.com

There is also a set of round-robin A records for systems behind legacy NAT.
You could do AAAA as well, but these probably aren't needed as it is
unlikely that a router blocking SRV will pass AAAA.

fred.example.com A 10.0.0.1
fred.example.com A 10.0.0.2
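
The fallback described above could be sketched roughly as follows. This
is only an illustration: `discover` and the injected `lookup_srv` callable
are hypothetical names, the real SRV query is left to whatever resolver
library is in use, and the default port of 80 is assumed from the SRV
records in the example.

```python
from typing import Callable, List, Tuple

Endpoint = Tuple[str, int]  # (host name, port)


def discover(service: str, domain: str,
             lookup_srv: Callable[[str], List[Endpoint]],
             default_port: int = 80) -> List[Endpoint]:
    """Try SRV first; fall back to the plain service name if SRV is blocked."""
    srv_name = f"_{service}._tcp.{domain}"
    try:
        endpoints = lookup_srv(srv_name)
    except OSError:
        # Assumption: the resolver reports a blocked or failed query this way.
        endpoints = []
    if endpoints:
        return endpoints
    # Legacy path: rely on the round-robin A records at <service>.<domain>.
    return [(f"{service}.{domain}", default_port)]


if __name__ == "__main__":
    def blocked(name: str) -> List[Endpoint]:
        raise OSError("SRV query did not get through")

    print(discover("fred", "example.com", blocked))
```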

And finally, we have the host description entries

host1.example.com A 10.0.0.1
_fred._tcp.host1.example.com TXT "minv=1.2 maxv=2 tls=1.2 path=/fred12"
host2.example.com A 10.0.0.2
_fred._tcp.host2.example.com TXT "tls=1.3"

So here we have some host-level service description tags, which override
the ones specified at the service level, with the proviso that a client
might well abort if the service-level description suggests there is no
acceptable host. The path descriptor allows the use of the well-known
service to be avoided on host1. It defaults on host2.
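
As a rough sketch of the override rule, assuming the TXT payload is a
space-separated list of key=value tags as in the records above. The
function names are hypothetical, and comparing versions as dotted integer
tuples is an assumption (so "2.5" would exceed maxv=2):

```python
from typing import Dict, Tuple


def parse_tags(txt: str) -> Dict[str, str]:
    """Split 'minv=1.2 maxv=3' into {'minv': '1.2', 'maxv': '3'}."""
    tags = {}
    for item in txt.split():
        key, sep, value = item.partition("=")
        if sep:  # ignore tokens that are not key=value
            tags[key] = value
    return tags


def effective_policy(service_txt: str, host_txt: str) -> Dict[str, str]:
    """Host-level tags override service-level tags with the same key."""
    policy = parse_tags(service_txt)
    policy.update(parse_tags(host_txt))
    return policy


def version_key(version: str) -> Tuple[int, ...]:
    return tuple(int(part) for part in version.split("."))


def accepts(policy: Dict[str, str], client_version: str) -> bool:
    """True if client_version lies within [minv, maxv] where those are set."""
    v = version_key(client_version)
    if "minv" in policy and v < version_key(policy["minv"]):
        return False
    if "maxv" in policy and v > version_key(policy["maxv"]):
        return False
    return True


if __name__ == "__main__":
    service = "minv=1.2 maxv=3"
    print(effective_policy(service, "minv=1.2 maxv=2 tls=1.2 path=/fred12"))
    print(effective_policy(service, "tls=1.3"))
```

A client speaking version 3 would skip host1 under this reading and use
host2, which inherits maxv=3 from the service-level record.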

In the normal run of things, a DNS server would recognize that a request
for _fred._tcp.example.com SRV was likely the start of a request chain and
send all the records describing the service in a single bundle. This should
usually fit in a single UDP response.

This approach gives us two levers for setting policy for the service:
we can define policy for all service instances, or granular per-host
information.


The bit that I have not got nailed down is what the HTTP URL should be
after the service discovery is performed. My view is that they should be
these:

http://host1.example.com/fred12
http://host2.example.com/.well-known/fred

These URLs work nicely with existing code, but not for TLS operations.
We will either need certs for host1.example.com and host2.example.com or
have to override the TLS stack to accept certs for example.com.
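
For the first form, turning a discovered host plus its tags into a URL
might look like the sketch below. The helper name `service_url` and the
rule of omitting the port when it is the scheme default are assumptions
made for illustration; the /.well-known/<service> default follows the
host2 example above.

```python
from typing import Dict


def service_url(host: str, service: str, policy: Dict[str, str],
                port: int = 80, scheme: str = "http") -> str:
    """Build the URL for one discovered host.

    The 'path' tag wins if present; otherwise fall back to the
    /.well-known/<service> convention.
    """
    path = policy.get("path", f"/.well-known/{service}")
    default_port = 443 if scheme == "https" else 80
    netloc = host if port == default_port else f"{host}:{port}"
    return f"{scheme}://{netloc}{path}"


if __name__ == "__main__":
    print(service_url("host1.example.com", "fred", {"path": "/fred12"}))
    print(service_url("host2.example.com", "fred", {"tls": "1.3"}))
```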

The problem becomes even more apparent if the redirects are to
host1.cloudly.com and host2.cloudly.com where cloudly is a cloud service
provider. So the alternative is to do this:


http://example.com/fred12
http://example.com/.well-known/fred

The problem is that this does not work well with the existing HTTP
clients built into scripting languages. Instead of just writing a module
that does the SRV lookup and spits out the URLs and attributes, now we
need to rewrite our client so it will hit the right DNS address.


Given that most libraries seem to have hooks to allow a client to make its
own TLS certificate path matching choices, I am very strongly in favor of
the first approach. But I am willing to be persuaded otherwise.
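
One such hook, sketched with Python's standard ssl module: connect to the
host named in the SRV record but have the certificate checked against the
service domain. The function name is hypothetical. Note that in this API
`server_hostname` also sets the SNI value, so the server has to be
prepared to present the example.com certificate for that name.

```python
import socket
import ssl


def connect_with_cert_name(target_host: str, port: int, cert_name: str,
                           timeout: float = 10.0) -> ssl.SSLSocket:
    """Open TCP to target_host, but verify the certificate for cert_name."""
    context = ssl.create_default_context()  # hostname checking stays enabled
    raw = socket.create_connection((target_host, port), timeout=timeout)
    try:
        # server_hostname is the name used for SNI and for the certificate
        # match; it deliberately differs from the host we connected to.
        return context.wrap_socket(raw, server_hostname=cert_name)
    except Exception:
        raw.close()
        raise


# Example (needs a reachable server, so it is not run here):
# tls = connect_with_cert_name("host1.example.com", 443, "example.com")
```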

Comments?


-- 
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: marka(_at_)isc(_dot_)org