perl-unicode

Re: perlunicode comment - when Unicode does not happen

2003-12-22 21:30:05
On Mon, 22 Dec 2003, Jarkko Hietaniemi wrote:

(AFAIK) W2K and later _are able_ to use UTF-16LE encoded Unicode for
filenames, but because of backward compatibility reasons using 8-bit
codepages is much more likely.

  No. _Both_ NTFS (supported only by Win 2k/XP) and VFAT (supported by
Win 2k/XP and Win 9x/ME) use UTF-16LE **exclusively**. In that respect,
Windows file systems are 'saner' than Unix file systems. The APIs for
accessing them, though, come in two flavors, 'A' APIs and 'W' APIs, as I
explained in another message of mine.
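
To make the 'A' versus 'W' distinction concrete, here is a minimal sketch in Python. It does not call any Windows API; it only shows the bytes each flavor would deal with. The helper names (wide_bytes, ansi_bytes) and the choice of code page 1252 are assumptions made for illustration, and the '?' substitution only approximates what a lossy code page conversion does.

```python
# Sketch only: models the byte-level difference between the two API flavors.
# Helper names and the default code page are illustrative assumptions.

def wide_bytes(name: str) -> bytes:
    """Bytes of the name as UTF-16LE, the form a 'W' API works with."""
    return name.encode("utf-16-le")

def ansi_bytes(name: str, codepage: str = "cp1252") -> bytes:
    """Bytes of the name in an 8-bit code page, the form an 'A' API works with.

    Characters missing from the code page are replaced with '?',
    so the conversion can lose information.
    """
    return name.encode(codepage, errors="replace")

latin_name = "r\u00e9sum\u00e9.txt"        # representable in cp1252
cjk_name = "\u65e5\u672c\u8a9e.txt"        # not representable in cp1252

print(wide_bytes(latin_name))   # two bytes per character here
print(ansi_bytes(latin_name))   # one byte per character
print(ansi_bytes(cjk_name))     # the three CJK characters become '?'
```

The point of the sketch is that the name on disk is the same either way; only the 'A' view of it depends on the active code page and can be lossy.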


The Apple HFS handles Unicode using _normalized_ (NFC, IIRC) UTF-8.

  The Mac OS X file system uses NFD (decomposed Unicode), not NFC
(precomposed Unicode).

There we have two different Unicode encodings, both in use.

  FYI, Mac OS X 10.3 (or 10.2) or later has APIs for the conversion
between NFC and NFD.
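
For readers who want to see the NFC/NFD difference, here is a small sketch using Python's standard unicodedata module. It is not the Mac OS X API mentioned above, and it applies the standard Unicode normalization forms, which may differ in detail from what the file system itself does. The helper names (to_nfc, to_nfd, same_name) are made up for this example.

```python
import unicodedata

def to_nfc(s: str) -> str:
    """Return the precomposed (NFC) form of s."""
    return unicodedata.normalize("NFC", s)

def to_nfd(s: str) -> str:
    """Return the decomposed (NFD) form of s."""
    return unicodedata.normalize("NFD", s)

def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both to one normalization form."""
    return to_nfc(a) == to_nfc(b)

precomposed = "caf\u00e9"    # 'e with acute' as a single code point
decomposed = "cafe\u0301"    # 'e' followed by a combining acute accent

print(precomposed == decomposed)           # False: different code points
print(to_nfd(precomposed) == decomposed)   # True
print(to_nfc(decomposed) == precomposed)   # True
print(same_name(precomposed, decomposed))  # True
```

A program that compares a name it generated (often NFC) with a name read back from an NFD file system would need a comparison along the lines of same_name to treat the two as equal.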

  Jungshik