Re: Last Call: 'Domain Name System (DNS) Case Insensitivity Clarification' to Proposed Standard

2005-02-03 10:01:01


On Thursday, February 03, 2005 12:54:29 AM -0500 stanislav shalunov <shalunov(_at_)internet2(_dot_)edu> wrote:

Jeffrey Hutzelman <jhutz(_at_)cmu(_dot_)edu> writes:

[..T]he _common_ convention is to use a backslash followed by the
value of the octet as an unsigned integer represented by exactly
three _octal_ digits.  This is the syntax used by programming
languages like C and perl.  For example, ASCII ESC (0x1b) is
represented as \033, not \027.

Actually, the convention used in C and Perl is to use \0, followed by
zero, one, or two octal digits (leaving some values of octets without
representation).

I personally think it's a poor convention as it uses varying number of
digits, so it becomes difficult to represent, say, the NUL character
followed by the digit "1".  (I still use the convention in cases when
it is familiar to most from documentation, e.g., "\015\012" in Perl.)


ISO/IEC 9899:1990 section 6.1.3.4 has this to say:

   octal-escape-sequence:
     \ octal-digit
     \ octal-digit octal-digit
     \ octal-digit octal-digit octal-digit

   hexadecimal-escape-sequence:
     \x hexadecimal-digit
     hexadecimal-escape-sequence hexadecimal-digit

And...

   Each octal or hexadecimal escape sequence is the longest sequence
   of characters that can constitute the escape sequence.

So, octal escape sequences consist of one, two, or three digits, but never more than three, and the first digit is not required to be zero. Every 8-bit character can be represented, and every string of 8-bit characters can be represented. The sequence of the NUL character followed by the digit "1" can be written as "\0001", or as any of these:

 \0\061 \00\061 \000\061 \0\61 \00\61 \000\61
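
As a hedged illustration (not part of the standard's text), the spellings above can be checked with a small C sketch. The array and function names are made up for this example, and it assumes an ASCII execution character set, so that the digit "1" is octet 0x31 (octal 061).

```c
#include <assert.h>
#include <stddef.h>

/* Each literal should encode the same two octets, NUL then '1',
 * plus the terminating NUL the compiler appends. */
static const char nul_one_a[] = "\0001";   /* \000 uses three digits; '1' is literal */
static const char nul_one_b[] = "\0\061";  /* one-digit escape, then octal 061 */
static const char nul_one_c[] = "\00\61";  /* two-digit escape, then octal 61 */
static const char nul_one_d[] = "\000\61"; /* three-digit escape, then octal 61 */

/* Returns 1 if the array holds exactly NUL, '1', and the terminator. */
static int is_nul_then_one(const char *s, size_t size)
{
    return size == 3 && s[0] == '\0' && s[1] == '1' && s[2] == '\0';
}

/* Checks every spelling; aborts if any of them encodes something else. */
static void check_nul_one(void)
{
    assert(is_nul_then_one(nul_one_a, sizeof nul_one_a));
    assert(is_nul_then_one(nul_one_b, sizeof nul_one_b));
    assert(is_nul_then_one(nul_one_c, sizeof nul_one_c));
    assert(is_nul_then_one(nul_one_d, sizeof nul_one_d));
}
```

Calling check_nul_one() from a test program should return without aborting on any conforming compiler that uses ASCII.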

Interestingly, as you noted, hexadecimal sequences are _not_ length-limited.
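
A practical consequence, sketched here in C with made-up names: because a hexadecimal escape absorbs every hex digit that follows it, ESC followed by the digit "1" cannot be written as a single "\x1b1" escape run. One common workaround is to split the literal, since adjacent string literals are concatenated only after escape sequences have been converted. The octal form needs no such trick, because it stops after three digits.

```c
#include <assert.h>
#include <string.h>

/* ESC followed by '1', written two ways. */
static const char esc_one_hex[] = "\x1b" "1"; /* split so \x1b cannot absorb the '1' */
static const char esc_one_oct[] = "\0331";    /* \033 ends after three octal digits */

/* Confirms both spellings produce the same three octets:
 * ESC, '1', and the terminating NUL. */
static void check_esc_one(void)
{
    assert(sizeof esc_one_hex == 3);
    assert(sizeof esc_one_oct == 3);
    assert(memcmp(esc_one_hex, esc_one_oct, sizeof esc_one_hex) == 0);
}
```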



Now, let's turn to Perl. The authoritative reference is from the section "Quotes and Quote-like Operators" in 'perldoc perlop':

      The following escape sequences are available in constructs that
      interpolate and in transliterations.


          \033        octal char      (ESC)
          \x1b        hex char        (ESC)
          \x{263a}    wide hex char   (SMILEY)

Unfortunately, this is a little vague, but an examination of the code shows that up to three digits are permitted in the octal case, and two in the hex case (wide hex chars have no preset length limit), which is pretty much what one would expect.



In any case, my original point was not to get into a discussion of the finer details of character escapes in particular programming languages, nor to suggest that every escape permitted by C or Perl be used in this context. Rather, it was to point out that the existing, commonly-used convention for character escapes of this form uses octal digits, not decimal, and that differing in this particular way would be likely to lead to confusion.
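
To make the potential for confusion concrete, here is a minimal C sketch of a decoder for a backslash followed by exactly three digits, read either as octal or as decimal. The function decode_escape and its base parameter are hypothetical, written only for this illustration; they are not taken from any DNS implementation or from the document under discussion.

```c
#include <assert.h>

/* Hypothetical helper: decode a backslash followed by exactly three
 * digits in the given base (8 or 10). Returns the octet value 0..255,
 * or -1 if the input is malformed or the value does not fit in an octet. */
static int decode_escape(const char *s, int base)
{
    int value = 0;

    assert(base == 8 || base == 10);
    if (s[0] != '\\')
        return -1;
    for (int i = 1; i <= 3; i++) {
        int digit = s[i] - '0';
        /* Rejects the end of the string, non-digits, and digits
         * that are too large for the chosen base. */
        if (s[i] < '0' || digit >= base)
            return -1;
        value = value * base + digit;
    }
    return value <= 255 ? value : -1;
}
```

Under these assumptions, decode_escape("\\033", 8) yields 27 (ASCII ESC), while decode_escape("\\033", 10) yields 33 (ASCII "!"). The same three characters name two different octets depending on which convention the reader assumes.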


Mark Andrews wrote:
The C convention also has \t, \r, \f, \n, none of which are
special in domain labels.

This is a strawman. I did not make the argument that C string literal syntax should be used; I mentioned the syntax of C and Perl as examples of the widespread recognition of octal character escapes.

We are not trying to change conventions here.  It is irrelevant
that C has a different convention.

The C programming language has precisely defined syntax, not a convention. Again, I did not make the argument that C syntax should be used. What I did do was point out that there is a commonly-recognized convention for the meaning of a backslash followed by three digits, and that it differs from that described in this document, which to my knowledge does not enjoy any form of widespread acceptance. This fact is not "irrelevant"; it is likely to lead to significant confusion.

Note that I'm not demanding that the DNS use the same conventions as everyone else. I think they should, but I'm not going to insist on it. However, if you're going to be different, you need to call out this fact, and preferably explain why.

-- Jeffrey T. Hutzelman (N3NHS) <jhutz+(_at_)cmu(_dot_)edu>
  Sr. Research Systems Programmer
  School of Computer Science - Research Computing Facility
  Carnegie Mellon University - Pittsburgh, PA

