Re: [FYI] use encoding 'non-utf8-encoding'; use CGI;

I can explain that.  "\x{3af}bc\x{3af}de" is a string literal, so
it gets encoded.  However, my example in escaped form is:

  $kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/

which does not get encoded.  The intention was:

  $kana =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/

That's why

  eval qq{ $kana =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/ }

works because \xA4\xA1-\xA4\xF3 and \xA5\xA1-\xA5\xF3 are converted
to \x{3041}-\x{3093} and \x{30a1}-\x{30f3}, respectively.
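
For comparison, here is a minimal sketch of the intended translation done
on decoded characters instead of raw bytes.  It assumes the Encode module
that ships with 5.8; the sample string and the variable names are made up
for illustration and are not from the original example.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# EUC-JP bytes for two hiragana, "ka" (A4 AB) and "na" (A4 CA).
# Illustrative sample data only.
my $bytes = "\xA4\xAB\xA4\xCA";

# Decode to characters first, so that tr/// sees U+304B and U+306A
# rather than four unrelated bytes.
my $kana = decode('euc-jp', $bytes);

# Hiragana U+3041..U+3093 map onto katakana U+30A1..U+30F3 one to one.
(my $kata = $kana) =~ tr/\x{3041}-\x{3093}/\x{30a1}-\x{30f3}/;

# Encode back to EUC-JP for output.
my $out = encode('euc-jp', $kata);
print join(' ', map { sprintf '%02X', ord($_) } split //, $out), "\n";
# prints "A5 AB A5 CA"
```

Here the ranges are real character ranges, so no eval is needed.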


I'm confused.  Firstly, the tr/\xA4... converts bytes thusly:

  A1 -> A1
  A2 -> A2
  A3 -> A3
  A4 -> A5
  A5 -> A5
  F3 -> A5

So why isn't it just tr/\xA4\xF3/\xA5/?
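
The byte mapping above can be checked with a short script.  This is only
a sketch of the byte-wise reading of that tr///, with no encoding pragma
in effect; the %map hash is a name introduced here for illustration.

```perl
use strict;
use warnings;
no warnings 'misc';   # the replacement list is longer than the search list

# With no decoding, tr/// works on single bytes:
#   search list:       A4, A1-A4, F3
#   replacement list:  A5, A1-A5, F3
# A4 occurs twice in the search list; only its first occurrence counts.
my %map;
for my $byte (0xA1 .. 0xA5, 0xF3) {
    my $in = chr $byte;
    (my $out = $in) =~ tr/\xA4\xA1-\xA4\xF3/\xA5\xA1-\xA5\xF3/;
    $map{ sprintf '%02X', $byte } = sprintf '%02X', ord $out;
}
printf "%s -> %s\n", $_, $map{$_} for sort keys %map;
```

A5 is not in the search list at all, so it simply passes through unchanged.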

Secondly, aren't you expecting tr/// to magically recognize that when
the EUC-JP codes \xA4, \xA1 to \xA4, and \xF3 are converted to their
Unicode counterparts they are supposed to spell out the Hiragana range?
The "range" concept of tr/// is very limited.  I think you want s///e.
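
One possible reading of the s///e suggestion, as a sketch only: it assumes
the string has already been decoded to characters, and relies on the fixed
offset of 0x60 between the Hiragana and Katakana blocks.  The sample
string is made up for illustration.

```perl
use strict;
use warnings;

# Hiragana "ka" (U+304B) and "na" (U+306A) as decoded characters.
my $kana = "\x{304B}\x{306A}";

# Replace each hiragana with the katakana 0x60 code points above it
# (U+30A1 - U+3041 == 0x60).
(my $kata = $kana) =~ s/([\x{3041}-\x{3093}])/chr(ord($1) + 0x60)/ge;

print join(' ', map { sprintf 'U+%04X', ord($_) } split //, $kata), "\n";
# prints "U+30AB U+30CA"
```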

-- 
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/
"There is this special biologist word we use for 'stable'.  It is 'dead'."
-- Jack Cohen
