perl-unicode

Re: Change 16302: Provide the \N{U+HHHH} syntax before we forget.

2002-05-02 19:39:24

On Fri, 3 May 2002 02:30:11 +0300
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> wrote:

> On Thu, May 02, 2002 at 08:01:34AM +0200, Philip Newton wrote:
> > On Wed, 1 May 2002 07:00:05 -0700, jhi(_at_)iki(_dot_)fi (Jarkko Hietaniemi) wrote:
> >
> > > Change 16302 by jhi(_at_)alpha on 2002/05/01 12:54:24
> > >
> > >   Provide the \N{U+HHHH} syntax before we forget.
> >
> > Do we also want to support U-HHHHHH? I seem to recall from somewhere
>
> Hmmm.  One always learns something new... where did you find that format?
>
> > that U+HHHH went to U+FFFF and that code points beyond that were
> > U-HHHHHHHH (i.e. U+ form took 4 hex chars and U- form took 8 hex chars,
> > or something like that.)


The U-HHHHHHHH format is mentioned in the Preface, Section 0.2
Notational Conventions, of Unicode 3.0.

http://www.unicode.org/uni2book/Preface.pdf
http://www.unicode.org/uni2book/u2.html

But Unicode 3.1 extends the U+HHHH notation beyond 0xFFFF.
cf. http://www.unicode.org/unicode/reports/tr27/

Quoted from that report:
   II Notational Changes for the Standard
      Section 0.2 Notational Conventions, page xxviii:
      change the description of the U+ notation to read:

      In running text, an individual Unicode code point
      can be expressed as U+n, where n is from four to six
      hexadecimal digits, using the digits 0-9 and A-F
      (for 10 through 15, respectively).
      There should be no leading zeros, unless the codepoint
      would have fewer than four hexadecimal digits;
      for example,
        U+0001, U+0012, U+0123, U+1234, U+12345, U+102345.
End of citation
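
To make the quoted rule concrete, here is a minimal sketch in Python
(chosen only for illustration; it is not part of change 16302). The
function name format_u_plus is hypothetical, and the upper bound of
0x10FFFF is an assumption based on "four to six hexadecimal digits".

```python
def format_u_plus(code_point: int) -> str:
    """Format a code point in U+n notation (hypothetical helper)."""
    # Assumption: valid code points run from 0 to 0x10FFFF.
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError(f"not a Unicode code point: {code_point:#x}")
    # Pad to at least four hex digits; larger values get no leading zeros.
    return f"U+{code_point:04X}"


if __name__ == "__main__":
    # The six examples given in the quoted text.
    for cp in (0x1, 0x12, 0x123, 0x1234, 0x12345, 0x102345):
        print(format_u_plus(cp))
```

Run as a script, this should print the six example labels from the
quoted text, one per line.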

Therefore U-0001FFFF is U+1FFFF and U-0010FFFF is U+10FFFF.
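
The same mapping can be sketched mechanically. This is only an
illustration in Python under my own assumptions: the function name
u_minus_to_u_plus is hypothetical, the U- form is taken to be exactly
eight hex digits, and values above 0x10FFFF are rejected.

```python
import re


def u_minus_to_u_plus(label: str) -> str:
    """Convert a U-HHHHHHHH label to U+n notation (hypothetical helper)."""
    # Assumption: the U- form always carries exactly eight hex digits.
    match = re.fullmatch(r"U-([0-9A-Fa-f]{8})", label)
    if match is None:
        raise ValueError(f"not a U-HHHHHHHH label: {label!r}")
    code_point = int(match.group(1), 16)
    # Assumption: anything above 0x10FFFF has no U+ equivalent.
    if code_point > 0x10FFFF:
        raise ValueError(f"beyond U+10FFFF: {label!r}")
    # Keep at least four digits and drop the other leading zeros.
    return f"U+{code_point:04X}"


if __name__ == "__main__":
    for label in ("U-0001FFFF", "U-0010FFFF", "U-00000041"):
        print(label, "->", u_minus_to_u_plus(label))
```

For the two labels above this gives U+1FFFF and U+10FFFF, and a BMP
value such as U-00000041 comes out as U+0041.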

Regards,
SADAHIRO Tomoyuki
