perl-unicode

Re: perlunicode comment - when Unicode does not happen

2003-12-23 12:30:08
The point I'm trying to make (agreeing with most perl 5 porters I suspect)
is that supporting Shift-JIS in Perl5 is hopeless.

Curious.  I could have sworn people like Dan Kogai are pretty happy.
But I guess you refer to the Unicode <-> filename boundary.

made to work at least for core features like "-d". But, oops, it is a
dead-end since Perl doesn't do anything reasonable with UTF-8 when it makes
a system call.

Please stop generalizing like that-- I find it hard to concentrate on
your actual problems. Perl has no problems using UTF-8, and that
_system calls_ usually don't do quite what the _user_ expects is none
of Perl's fault.
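
To illustrate the boundary in question, here is a minimal sketch, assuming the caller already knows which encoding the filesystem expects. The helper name is_dir_encoded is hypothetical and not part of Perl; it only shows the character string being turned into bytes explicitly before "-d" reaches the system call. It is written with Unixish platforms in mind and says nothing about the Win32 'W' variants.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (not a core function): encode a Perl character
# string into the byte string that is handed to the system call,
# using an encoding chosen by the caller, then run the -d file test.
sub is_dir_encoded {
    my ($name, $encoding) = @_;

    # FB_CROAK makes encode() die on characters the target encoding
    # cannot represent, instead of silently substituting them.
    my $bytes = encode($encoding, $name, Encode::FB_CROAK());

    return -d $bytes ? 1 : 0;
}

# Usage: the second argument is an assumption about the filesystem.
print is_dir_encoded(".", "UTF-8") ? "directory\n" : "not a directory\n";
```

Whether "UTF-8" is the right second argument is exactly the open question: the sketch moves the decision to the caller rather than answering it.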

(There's still a ton of other stuff broken, but folks can get
around that later - let's fix the core now.)

Besides the Win32 CRT bug workaround you showed in win32.c I am not
aware of anything in the _core_ that needs to be immediately fixed.
Yes, in Win32 the 'W' variants should eventually be used somehow-- but
how and when is not yet clear to us, I think.

I'd suggest taking some code from ICU or Mozilla that tries to figure out
what the platform encoding is.

nl_langinfo(CODESET) is pretty much all that should be used on Unixish
platforms.  In Win32 the answer seems to be UTF-16LE-- *as long as* we
are on modern filesystems.
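
For the Unixish case, a minimal sketch of asking the C library for the locale's codeset from Perl, assuming the I18N::Langinfo and POSIX modules are available and the platform implements nl_langinfo(). The sub name locale_codeset is hypothetical. The returned name (and its spelling) varies by platform and locale, so treat it as a hint about the user's preference, not as a statement about what any particular filesystem stores.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);

# Adopt the user's locale from the environment (LC_ALL, LC_CTYPE, LANG).
# Without this the process stays in the "C" locale.
setlocale(LC_CTYPE, "");

# Hypothetical helper: return the codeset name reported by
# nl_langinfo(CODESET), or undef if nothing useful came back.
sub locale_codeset {
    my $codeset = langinfo(CODESET);
    return (defined $codeset && length $codeset) ? $codeset : undef;
}

my $codeset = locale_codeset();
print "locale codeset: ", (defined $codeset ? $codeset : "(unknown)"), "\n";
```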

I don't think we understand common practice (or that such practices
are even established yet) well enough to specify that yet.

I may be misunderstanding your point, but I don't see "common practice"
bearing on this. UTF-8 in Perl is new - and currently it is dead in the
water for things like "-d" - so why not just fix it.

"Common practice" as in "how do I detect what charset+encoding the user
now wants to use for their filenames" and "what funky charset+encoding
filesystems are out there", I guess.

--
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'.  It is 'dead'." -- Jack Cohen

