ietf

RE: Last Call: draft-klensin-net-utf8 (Unicode Format for Network Interchange) to Proposed Standard

2008-01-15 04:10:17

      (1) We have a collection of protocols in the IETF that
      use text in lines and data transmission models that are
      both very simple and very dependent on a clear and
      precise definition of "line".  That definition has
      traditionally involved a CRLF sequence and that alone.

Now the net-utf8 work is targeted exclusively at the first case.
Even more specifically, it has been clear since the request
to get this written up and standardized came out of an
Applications Area meeting a few years ago that it would need
to be designed so that all of the sensible and plausible uses of
"NVT" would be valid for it, without changes.

In that case, this is *NOT* truly net-utf8 as one would understand
it in normal English. Instead it is the Unicode version of NVT.

It seems to me that there is room for a standard Net-UTF8
for future new protocols, that sticks closely to the Unicode
standards and tries to be transparent to arbitrary UTF-8
streams. This newer UTF8 standard would take its line-ending
cues from the Unicode regular expression rules, i.e. a
command line ending could potentially be detected using
a Unicode-compatible RE engine. Unicode whitespace can be used
to delimit words and arguments in command lines. One might even
leverage the existence of PS (PARAGRAPH SEPARATOR, U+2029) to mark
the end of the command lines' paragraph. For instance, an SMTP-like
protocol would not need a DATA keyword, because a PS could be used
to mark the beginning of data.
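
To make the idea concrete, here is a rough Python sketch of how
such a parser might look. Everything in it is an assumption for
illustration only: the function name parse_message, the choice of
line-break characters (taken from the line boundary list in
Unicode TS #18), and the SMTP-like commands are hypothetical and
are not part of the draft or of any existing protocol. Note that
Python's str.split() is only an approximation of the Unicode
White_Space property.

```python
import re

# Line boundaries in the spirit of Unicode TS #18: CRLF is listed first
# so that it counts as a single break rather than two.
LINE_BREAK = re.compile("\r\n|[\n\v\f\r\x85\u2028]")

# PS, the character proposed above as the command/data boundary.
PARAGRAPH_SEPARATOR = "\u2029"


def parse_message(text):
    """Split a hypothetical SMTP-like message into commands and data.

    Returns (commands, data): commands is a list of argument lists,
    data is everything after the first PS, or None if no PS occurs.
    """
    head, sep, data = text.partition(PARAGRAPH_SEPARATOR)
    commands = []
    for line in LINE_BREAK.split(head):
        # str.split() with no arguments splits on runs of whitespace,
        # including non-ASCII spaces such as U+3000.
        args = line.split()
        if args:  # skip empty lines
            commands.append(args)
    return commands, (data if sep else None)


if __name__ == "__main__":
    msg = "MAIL FROM:<a@example.org>\u2028RCPT\u3000TO:<b@example.org>\u2029Hello\nworld"
    print(parse_message(msg))
```

The data after the PS is returned untouched, so line breaks inside
it are not interpreted, which is the transparency property a
DATA-less protocol would rely on.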

I think it is a good idea to have an update to NVT that allows
for Unicode, but I think that the backward-compatibility
kludges, which make it incompatible with the Unicode standard,
mean that this document should not masquerade as the
definitive format for Unicode on the wire.

If the nomenclature is changed a bit to make it clear that
this is an update of NVT, then I think many of the arguments
against it fall away. Of course this means that there will
be another document at some point with a different form of
UTF8 on the wire, but it doesn't hurt to give future protocol
designers a clear choice.

Is it better to stick with the installed base and make incremental
improvements, or should we break with the past and start afresh
using the hard-earned knowledge of building that installed base?
This draft should not have to wait for that decision to be made;
rather, it should go to RFC status sooner rather than later, so
that attention can be focussed on the second option, and what
comes after that.

--Michael Dillon


_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf
