perl-unicode

Re: good name for characters matching [^\0-\377]?

2007-10-18 16:19:26
John Delacour wrote 2007-10-18 20:24 (+0100):
> > They are "characters outside the latin-1 range".
> Latin-1 has nothing to do with it.

Blocks of characters have names in Unicode. One of those names is
"Latin-1 Supplement".

It has a lot to do with it.

However, I was mistaken: "latin-1" in Unicode is U+0080..U+00FF,
thus excluding the ASCII part, which is called "Basic Latin" in Unicode.
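
To make the ranges concrete, here is a minimal sketch in Python (not
Perl, purely for illustration). The names BEYOND_LATIN1 and block_of are
made up for this example; the character class is meant as the
equivalent of the Perl class [^\0-\377] from the subject line.

```python
import re

# Equivalent of the Perl class [^\0-\377]: any code point above U+00FF.
# (BEYOND_LATIN1 is a hypothetical name chosen for this sketch.)
BEYOND_LATIN1 = re.compile(r"[^\x00-\xff]")


def block_of(ch):
    """Return the Unicode block name for code points up to U+00FF, else None.

    Hypothetical helper: it only knows the two blocks discussed here.
    """
    cp = ord(ch)
    if cp <= 0x7F:
        return "Basic Latin"          # U+0000..U+007F, the ASCII part
    if cp <= 0xFF:
        return "Latin-1 Supplement"   # U+0080..U+00FF
    return None                       # what [^\0-\377] matches


if __name__ == "__main__":
    for ch in ("A", "\u00e9", "\u20ac"):
        print(f"U+{ord(ch):04X}", block_of(ch), bool(BEYOND_LATIN1.search(ch)))
```

Only the last of the three sample characters (U+20AC) falls outside
both blocks, and it is the only one the character class matches.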

> There are countless legacy character sets that use the code points
> from 32 to 255, and besides, what masquerades as Latin-1 in various
> environments is rarely strict iso-8859-1.

The latin-1 here is not an alias for iso-8859-1, though I do wish to
point out that iso-8859-1 was redefined as a Unicode encoding in 1997.
That is, byte 0x80 is defined as the Unicode character U+0080.
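
A small sketch of that byte-to-code-point identity, again in Python and
only as an illustration. It assumes Python's "iso-8859-1" codec follows
the definition described above, and it uses "cp1252" as one example of
an encoding that commonly masquerades as Latin-1; the variable names are
made up for the example.

```python
# Decode the single byte 0x80 under two different encodings.
raw = bytes([0x80])

as_latin1 = raw.decode("iso-8859-1")  # byte 0x80 becomes U+0080
as_cp1252 = raw.decode("cp1252")      # byte 0x80 becomes U+20AC (euro sign)

# Under iso-8859-1 every byte value maps to the code point with the
# same number, so the check below holds for all 256 bytes.
identity = all(
    ord(bytes([b]).decode("iso-8859-1")) == b for b in range(256)
)

if __name__ == "__main__":
    print(f"iso-8859-1: U+{ord(as_latin1):04X}")
    print(f"cp1252:     U+{ord(as_cp1252):04X}")
    print("byte == code point for all bytes:", identity)
```

The same byte gives two different characters, which is why it matters
whether "latin-1" means the strict encoding or a look-alike.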
-- 
Met vriendelijke groet,  Kind regards,  Korajn salutojn,

  Juerd Waalboer:  Perl hacker  <#####(_at_)juerd(_dot_)nl>  <http://juerd.nl/sig>
  Convolution:     ICT solutions and consultancy <sales(_at_)convolution(_dot_)nl>