perl-unicode

Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

2002-10-02 07:30:09
I can explain that.  "\x{3af}bc\x{3af}de" is a string literal, so
it gets encoded.  However, my example in escaped form is:

  $kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/

which does not get encoded.  The intention was:

  $kana =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/

That's why

  eval qq{ \$kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/ }

works because \xA4\xA1-\xA4\xF3 and \xA5\xA1-\xA5\xF3 are converted
to \x{3041}-\x{3093} and \x{30a1}-\x{30f3}, respectively.
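
The claimed conversion can be checked without the encoding pragma.  What
follows is only a minimal sketch, assuming the core Encode module and its
'euc-jp' codec; the names @endpoints and $katakana are illustrative and do
not come from the thread.  It decodes the four byte pairs used as range
endpoints, then applies the intended code-point tr/// to a decoded string.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Decode each EUC-JP byte pair used as a range endpoint and
# record the resulting Unicode code point in hex.
my @endpoints = map { sprintf('%04x', ord(decode('euc-jp', $_))) }
    ("\xA4\xA1", "\xA4\xF3", "\xA5\xA1", "\xA5\xF3");

print "@endpoints\n";    # prints: 3041 3093 30a1 30f3

# Five Hiragana characters as EUC-JP bytes, decoded to a character string.
my $kana = decode('euc-jp', "\xA4\xB3\xA4\xF3\xA4\xCB\xA4\xC1\xA4\xCF");

# The intended transliteration, written with code points so the
# ranges really are Hiragana -> Katakana.
(my $katakana = $kana) =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/;
```

Once the operands are characters rather than bytes, both ranges have the
same length and each Hiragana maps to the Katakana 0x60 code points above it.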

I'm confused.  Firstly, the tr/\xA4... converts bytes thusly:

  A1 -> A1
  A2 -> A2
  A3 -> A3
  A4 -> A5
  A5 -> A5
  F3 -> A5

So why isn't it just tr/\xA4\xF3/\xA5/?
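
A quick way to confirm the table is to run both forms on the same bytes.
This is a sketch under the assumption that no encoding pragma is in effect,
so the operands stay plain bytes; $input, $long and $short are names made up
for the illustration.

```perl
use strict;
use warnings;

# The bytes listed in the table above.
my $input = pack('C*', 0xA1 .. 0xA5, 0xF3);

# Undecoded, the search list is \xA4, the range \xA1-\xA4, and \xF3.
# The second \xA4 (inside the range) is ignored: first occurrence wins.
(my $long = $input) =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/;

# The short form: the last replacement character is reused for \xF3.
(my $short = $input) =~ tr/\xA4\xF3/\xA5/;

# Show both results as hex for comparison.
print join("\n", map { uc(unpack('H*', $_)) } ($long, $short)), "\n";
```

Both lines of output are the same, which is the point of the question: as
written, the long form only turns A4 and F3 into A5.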

Secondly, aren't you expecting tr/// to magically recognize that when
the EUC-JP codes \xA4, \xA1 to \xA4, and \xF3 are converted to their
Unicode counterparts they are supposed to spell out the Hiragana range?
The "range" concept of tr/// is very limited.  I think you want s///e.
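
One possible reading of the s///e suggestion, for data that is still raw
EUC-JP bytes, is sketched here.  It is an assumption about what was meant,
not code from the thread: the function name hira2kata_eucjp is hypothetical,
and the pattern assumes well-formed EUC-JP input.  It consumes one whole
multibyte character per match so that a trailing \xA4 byte of some other
character is never mistaken for a Hiragana lead byte.

```perl
use strict;
use warnings;

# Hypothetical helper: convert Hiragana to Katakana in an EUC-JP byte string.
sub hira2kata_eucjp {
    my ($bytes) = @_;
    $bytes =~ s{
        ( \x8F [\xA1-\xFE][\xA1-\xFE]   # three-byte character (JIS X 0212)
        | \x8E [\xA1-\xDF]              # half-width Katakana
        | [\xA1-\xFE][\xA1-\xFE]        # two-byte character (JIS X 0208)
        )
    }{
        my $char = $1;
        # Row \xA4, cells \xA1-\xF3 are the Hiragana covered by the tr/// range;
        # the same cells in row \xA5 are the matching Katakana.
        $char =~ /\A\xA4([\xA1-\xF3])\z/ ? "\xA5$1" : $char;
    }gex;
    return $bytes;
}

# \xA4\xA2 (Hiragana A) becomes \xA5\xA2 (Katakana A).
my $converted = hira2kata_eucjp("\xA4\xA2");
```

ASCII bytes are not matched and pass through unchanged.  If the data can be
decoded first, the code-point tr/// is simpler than this.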

-- 
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'.
It is 'dead'." -- Jack Cohen