perl-unicode

Re: Info required - "Wide API calls" in Win32 Perl >= 5.8.2

2004-02-20 00:30:05

On Feb 20, 2004, at 1.16, Peter NESWAL wrote:

First, thanks to all for the fast responses to my questions.

Jan Dubois wrote:

On Thu, 19 Feb 2004 22:03:14 +0200, Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> wrote:
After switching from perl 5.8.0 build 806 to perl 5.8.2 build 808
I found that the ability to invoke Win32 wide api calls was silently
removed (-C command line switch or ${^WIDE_SYSTEM_CALLS}=1).

Well, not *silently* removed.  It is mentioned in the change notes
(perldelta) of the Perl 5.8.1 standard distribution; I don't know
exactly where those notes are in the ActivePerls.

Actually, when I said "silently" I wasn't thinking about documentation. Even if it was a little hard to find, at least in ActivePerl's documentation, it was in the "perlrun" doc on command-line switches. But I was not able, even afterwards, to find any advance discussion or information about the plans and reasons for doing that. Since I had tested the wide API calls with 5.8.0, it took a lot of testing again with > 5.8.0 to realize that the Wide API interface had been removed.

It is in perl581delta. It looks like it is still missing from the table of
contents, even though I thought I fixed that in build 809. :(
The removal was done at the suggestion of the ActivePerl maintainers
since they considered the "Wide API" implementation (which they
themselves originally did) broken, and the -C was "recycled"
for other Unicodeish purposes. I am not familiar with the exact details
of what was broken with the -C as it was.
The -C option was implemented before Perl had proper Unicode support. The implementation of -C (the code is still there, it is just disabled) does *not* look at the UTF8 flag at all. It just assumes that the string passed
in is always in UTF8 unless "use bytes" was in effect.  It also stores
strings as UTF8 without setting the correct SV bits. Therefore it is not
compatible with the Unicode support in Perl.
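
A minimal sketch of why ignoring the flag matters, not taken from the Perl sources: `struct sketch_str`, `decode_utf8`, `to_wide_flag_blind` and `to_wide_flag_aware` are hypothetical names, and the sketch assumes that unflagged bytes should be read one code point per byte (Latin-1 style). A flag-blind conversion turns a legitimate byte string into a replacement character:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a Perl scalar: raw bytes plus a flag saying
   whether those bytes are UTF-8. This is NOT the real SV layout. */
struct sketch_str {
    const unsigned char *bytes;
    size_t len;
    int is_utf8;
};

/* Decode 1- and 2-byte UTF-8 sequences (enough for this demo); anything
   else becomes U+FFFD. Returns the number of code points written. */
static size_t decode_utf8(const unsigned char *s, size_t len, unsigned *out)
{
    size_t i = 0, n = 0;
    while (i < len) {
        if (s[i] < 0x80) {
            out[n++] = s[i++];
        } else if ((s[i] & 0xE0) == 0xC0 && i + 1 < len
                   && (s[i + 1] & 0xC0) == 0x80) {
            out[n++] = ((unsigned)(s[i] & 0x1F) << 6) | (s[i + 1] & 0x3F);
            i += 2;
        } else {
            out[n++] = 0xFFFD;
            i++;
        }
    }
    return n;
}

/* Flag-blind conversion: always assumes the bytes are UTF-8. */
static size_t to_wide_flag_blind(const struct sketch_str *s, unsigned *out)
{
    return decode_utf8(s->bytes, s->len, out);
}

/* Flag-aware conversion: decodes UTF-8 only when the flag says so;
   otherwise each byte is taken as one code point (assumption: Latin-1). */
static size_t to_wide_flag_aware(const struct sketch_str *s, unsigned *out)
{
    size_t i;
    assert(s != NULL && out != NULL);
    if (s->is_utf8)
        return decode_utf8(s->bytes, s->len, out);
    for (i = 0; i < s->len; i++)
        out[i] = s->bytes[i];
    return s->len;
}
```

With the unflagged bytes `c a f 0xE9`, the flag-aware version yields U+00E9 as the last code point, while the flag-blind version yields U+FFFD because a lone 0xE9 is not valid UTF-8.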

To me there is no direct relation between the UTF8 flag and the use of wide API calls (see the long file/dir names issue). The only thing that changes is the type of encoding/decoding required before passing a value to the wide API call, but this should also be true if no wide APIs are used.

The main reason I was so astonished about the removal/disabling of the wide API calls was that I assumed Perl would silently move to general use of the wide API interface (fixing existing buffer length issues) in all Perl-internal functions (environment, ...), at least when the UTF8 flag was detected, rather than dropping them altogether (at least for the moment).

This change however removes not only the possibility to use "UNICODE" names but also access to files and folders with names longer than 255 bytes.
Support of "long filenames" through the wide API was coincidental and not
consistent.  There are many places in win32/win32.c where buffers are
allocated as MAX_PATH or MAX_PATH+1 characters. If your filename passes through any of those routines, it would still be truncated even with -C.
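
The truncation effect can be sketched in portable C. This is an illustration only, not code from win32/win32.c: `copy_through_fixed_buffer` is a hypothetical helper, and `SKETCH_MAX_PATH` is defined locally as 260 (the value of MAX_PATH in the Windows headers) so the sketch compiles anywhere.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: 260 matches MAX_PATH in the Windows headers; defined
   locally so the sketch does not need <windows.h>. */
#define SKETCH_MAX_PATH 260

/* Hypothetical helper: passes a path through a fixed-size intermediate
   buffer, as a routine with a MAX_PATH+1 buffer would, and returns the
   number of characters that survive in `out`. */
static size_t copy_through_fixed_buffer(const char *path, char *out,
                                        size_t out_size)
{
    char buf[SKETCH_MAX_PATH + 1];
    size_t n = strlen(path);

    assert(out_size > 0);
    if (n > SKETCH_MAX_PATH)
        n = SKETCH_MAX_PATH;          /* the tail is silently dropped */
    memcpy(buf, path, n);
    buf[n] = '\0';

    /* Hand the (possibly truncated) path on to the caller. */
    snprintf(out, out_size, "%s", buf);
    return strlen(out);
}
```

A 300-character name comes back as 260 characters regardless of how the final system call would have handled it, which is why a single fixed buffer on the way is enough to defeat long-name support.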

Unfortunately this problem is not limited to Perl. Even most Microsoft OS software (up to Windows Server 2003) has similar problems; we ran into this kind of problem when using roaming user profiles on Win2K and WinXP workstations.

However, this problem is far easier to solve for the file/dir interface than, for example, for registry entry and key names, since there is a clear rule (PathElement/PathLength): 255/255 chars on ANSI, 255/32,767 WChars on UNICODE/Wide API calls.

Maybe somebody knows:
        a.) about a way to invoke the wide API interface
It is not possible because the USING_WIDE macro is hardcoded as 0 right now
(in win32.h).
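
A rough sketch of what a hardcoded 0 implies, assuming the macro is used as a guard around the wide code paths: `SKETCH_USING_WIDE` and `wide_calls_enabled` are hypothetical stand-ins, not the actual definitions in win32.h.

```c
#include <assert.h>

/* Hypothetical stand-in for the macro described above: a compile-time
   constant 0 instead of a test of a runtime setting. */
#define SKETCH_USING_WIDE() 0

/* runtime_request models what -C or ${^WIDE_SYSTEM_CALLS}=1 used to ask
   for (1 = wide calls wanted). Returns 1 if the wide path would be taken. */
static int wide_calls_enabled(int runtime_request)
{
    assert(runtime_request == 0 || runtime_request == 1);
    if (SKETCH_USING_WIDE() && runtime_request) {
        return 1;   /* unreachable while the macro is 0 */
    }
    return 0;       /* always the non-wide (ANSI) path */
}
```

Under that assumption no runtime switch can re-enable the wide branch; only changing the macro and rebuilding would.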

Thanks for that info. After examining the 5.8.2 source I had already found this, but I didn't dig far enough to determine whether this macro alone controlled the "Wide API" call interface.

Take a look at the Changes5.8* files, look for "-C" or "wide", and then use the
http://public.activestate.com/cgi-bin/perlbrowse
for example
http://public.activestate.com/cgi-bin/perlbrowse?patch=18491

--
Jarkko Hietaniemi <jhi(_at_)iki(_dot_)fi> http://www.iki.fi/jhi/ "There is this special
biologist word we use for 'stable'.  It is 'dead'." -- Jack Cohen