Re: [Encode] Compound Unicode Character Support in UCM

Dan Kogai <dankogai(_at_)dan(_dot_)co(_dot_)jp> writes:


I don't like the <UNNNN+UMMMM> part; it will make the parsing messier.

The \xYY\xYY is of course what I meant ;-)


Not that much.  It's just a regex after all.


For _perl_ it is, but if we are going to get IBM's ICU or others
to back-port it, then it is better to keep things clean.

So let us have yacc-like:

from : codepoint 
     | from codepoint
     ;

codepoint : '<' 'U' hexdigits '>'
          ;

to   : octet
     | to octet
     ;

octet : '\\' 'x' hexdigits 
      ;
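
To make the grammar above concrete, here is a minimal sketch of a reader for it in Python. It is an illustration only, not the Encode or ucm2table implementation. The grammar leaves `hexdigits` undefined, so the sketch assumes 4 to 6 hex digits for a codepoint and exactly 2 for an octet; the function names `parse_from` and `parse_to` are hypothetical.

```python
import re

# Assumption: `hexdigits` is not defined in the grammar, so this sketch
# takes 4-6 hex digits for a codepoint and exactly 2 for an octet.
CODEPOINT = re.compile(r"<U([0-9A-Fa-f]{4,6})>")
OCTET = re.compile(r"\\x([0-9A-Fa-f]{2})")


def _repeat(pattern, text, what):
    """Match one or more adjacent tokens that together cover all of `text`."""
    pos, values = 0, []
    while pos < len(text):
        m = pattern.match(text, pos)
        if m is None:
            raise ValueError(f"bad {what} at offset {pos}: {text!r}")
        values.append(int(m.group(1), 16))
        pos = m.end()
    if not values:
        raise ValueError(f"empty {what}")
    return values


def parse_from(text):
    """from : codepoint | from codepoint  (returns a list of code points)"""
    return _repeat(CODEPOINT, text, "codepoint")


def parse_to(text):
    """to : octet | to octet  (returns the octets as bytes)"""
    return bytes(_repeat(OCTET, text, "octet"))
```

With this reading, `parse_from("<U304B><U309A>")` yields two code points and `parse_to(r"\x82\xF5")` yields two bytes (both values are made up for illustration), while the `<UNNNN+UMMMM>` spelling is rejected because `+` is not part of a codepoint token.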

Let's TIMTOWTDI it.

  <U...><U...> has already been working.  <U...+U...> soon to come.

Dan the Encode Maintainer
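
For comparison, a rough sketch of the "just a regex" approach that accepts both spellings, `<U...><U...>` and `<U...+U...>`, and normalises them to the same list of code points. This is a hedged illustration in Python, not Encode's actual code; the 4 to 6 hex digit limit and the name `parse_compound` are assumptions.

```python
import re

# One bracketed group: <UXXXX> or <UXXXX+UYYYY+...>.
# Assumption: each code point is written with 4-6 hex digits.
GROUP = re.compile(r"<U([0-9A-Fa-f]{4,6}(?:\+U[0-9A-Fa-f]{4,6})*)>")


def parse_compound(text):
    """Return the code points in `text`, accepting either compound spelling."""
    pos, codepoints = 0, []
    while pos < len(text):
        m = GROUP.match(text, pos)
        if m is None:
            raise ValueError(f"bad codepoint group at offset {pos}: {text!r}")
        # "304B+U309A" splits on "+U" into ["304B", "309A"].
        codepoints.extend(int(part, 16) for part in m.group(1).split("+U"))
        pos = m.end()
    if not codepoints:
        raise ValueError("empty codepoint sequence")
    return codepoints
```

Both `parse_compound("<U304B><U309A>")` and `parse_compound("<U304B+U309A>")` produce the same result, which is the point of supporting both: the extra alternation costs one optional group in the pattern, at the price of a token that a yacc-style grammar would have to spell out separately.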

-- 
Nick Ing-Simmons
http://www.ni-s.u-net.com/
