Nadim <nadim(_at_)khemir(_dot_)net> writes:
On Sunday 13 October 2002 14:45, Nick Ing-Simmons wrote:
I am using 5.6.3 on windows from activestate. I do the
following.
I don't think you are. As far as I am aware there is only perl5.6.1;
there isn't a .3 subversion yet.
Sorry for the typo: 5.6.1.63
my $ole_object = ..... ;
my $unicode_string = $ole_object->GetUnicodeString() ;
OLE objects are a Win32 thing. You would be better off asking on
one of the Win32 aware ActiveState lists. We would at least need
to know how you created $ole_object so we can lookup the code
that gets the string.
I wrote the OLE object. The string it sends back is a unicode string. I
call other functions on the object and they behave right.
Ok so as it is your code you can make it do the right thing - which may
not be what you think at first.
Can you share with us the C code fragment that returns the string to
perl as a "scalar value" (SV). If you are not doing that (but leaving it to
Win32::OLE) then can you give the "signature" of the ->GetUnicodeString
method that it is wrapping?
A string in perl has a "PV" pointer value which is a sequence of
bytes (octets). In perl5.6 and later perl can be told to interpret
them in one of two ways:
1. Like all previous perls as 1-byte/char with same repertoire
as iso-8859-1.
2. As UTF-8 representing Unicode (some mainframes use UTF-EBCDIC
but that is not an issue here).
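To make the two interpretations concrete, here is a minimal C sketch that counts the "characters" in the same byte sequence both ways. The function names are made up for illustration and are not part of perl's API, and the UTF-8 counter assumes well-formed input.

```c
#include <stddef.h>

/* Interpretation 1: one byte is one character (iso-8859-1 style),
   so the character count is simply the byte count. */
size_t count_chars_bytes(const unsigned char *s, size_t len)
{
    (void)s;
    return len;
}

/* Interpretation 2: the bytes are UTF-8. Continuation bytes have the
   bit pattern 10xxxxxx, so counting every byte that is NOT a
   continuation byte gives the number of code points.
   Assumes the input is well-formed UTF-8. */
size_t count_chars_utf8(const unsigned char *s, size_t len)
{
    size_t i, n = 0;
    for (i = 0; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For the two octets 0xC3 0xA9, the first function reports two characters and the second reports one (U+00E9).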
So to return "Unicode" to perl you must use form (2). That is the
Unicode codepoints must be UTF-8 encoded, and you must call SvUTF8_on(sv)
on the sv.
This is different from Win32's normal treatment of Unicode - which is
to use 16-bit "wide characters" from the UCS-2 repertoire of Unicode
(I have been told that Win2k and later use UTF-16 to give the full
Unicode repertoire at the expense of using surrogate pairs).
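As a rough illustration of the conversion step, the sketch below turns 16-bit wide characters (including surrogate pairs) into UTF-8 octets. The function name utf16_to_utf8 is hypothetical, and this is not the code Win32::OLE uses. In an XS wrapper the resulting bytes would presumably be stored in the SV (for example with newSVpvn) before calling SvUTF8_on(sv) as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode n UTF-16 code units from `in` as UTF-8.
   `out` must have room for at least 3 * n bytes (a lone unit needs at
   most 3 bytes; a surrogate pair is 2 units and needs 4 bytes).
   Unpaired surrogates are replaced with U+FFFD.
   Returns the number of bytes written; no NUL terminator is added. */
size_t utf16_to_utf8(const uint16_t *in, size_t n, unsigned char *out)
{
    size_t i = 0, o = 0;

    while (i < n) {
        uint32_t cp = in[i++];

        if (cp >= 0xD800 && cp <= 0xDBFF &&
            i < n && in[i] >= 0xDC00 && in[i] <= 0xDFFF) {
            /* High surrogate followed by low surrogate: combine. */
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i++] - 0xDC00);
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            /* Unpaired surrogate: substitute the replacement character. */
            cp = 0xFFFD;
        }

        if (cp < 0x80) {
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

A string that stays within UCS-2 never takes the surrogate branch, so the same routine covers both the older and the Win2k-and-later behaviour.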
->GetUnicodeString has converted things it does not understand to '?'.
GetUnicodeString doesn't convert anything, did you mean perl converted
things it didn't understand?
print $unicode_string ;
# prints ??????????????? on the console
Hmm - as perl5.6 does not have "smart" Unicode IO (perl5.8 does),
this suggests that string is actually '?' x 17 - i.e. you got "junk"
back from the OLE call.
Don't think so. The OLE object behaves correctly (I tested it from a C++ app);
now Win32::OLE is also involved.
It is Win32::OLE that _may_ be doing (or not doing) the conversion.
Which version are you using?
2/ read a unicode string from a file
For perl5.6 file has to be in UTF-8 and you need to do some hackery
(which was so horrible I can't recall it).
Did you see the hackery in this mailing list?
Possibly - a _long_ time ago when perl5.6 was being developed, more likely
on the perl5-porters(_at_)perl(_dot_)org list - nonetheless there are perl5.6
users on this list that no doubt still use it.
--
Nick Ing-Simmons
http://www.ni-s.u-net.com/