On Mon, Jun 16, 2003 at 02:37:23PM +0200, Brigitte Jellinek wrote:
hi!
i'm trying to use perl + dbi + dbd::mysql + mysql with unicode.
as far as i can tell i can write a utf8 string into the database,
and get back the same sequence of bits, only now it's a 'classical'
perl-string, not flagged as utf-8.
the string i write into the db is 6 characters long:
"ABc\N{greek:alpha}\x{00df}\N{cyrillic:e}"
character          unicode  utf8
                   hex      binary
A                  0041     01000001
B                  0042     01000010
c                  0063     01100011
greek alpha        03B1     11001110 10110001
german scharfes s  00DF     11000011 10011111
cyrillic e         044D     11010001 10001101
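For anyone who wants to reproduce the table above, here is a minimal sketch. It assumes the Encode module is available (Perl 5.8) and spells out the full Unicode character names instead of the short \N{greek:alpha} form; utf8_rows is a made-up helper name, not part of any module.

```perl
use strict;
use warnings;
use charnames ':full';
use Encode qw(encode);

# The six-character test string, written with full Unicode character
# names (U+03B1, U+00DF, U+044D).
my $str = "ABc\N{GREEK SMALL LETTER ALPHA}\x{00df}\N{CYRILLIC SMALL LETTER E}";

# Hypothetical helper: one row per character, giving the code point in
# hex followed by the UTF-8 encoded bytes in binary.
sub utf8_rows {
    my ($s) = @_;
    my @rows;
    for my $ch (split //, $s) {
        my $bytes = encode('UTF-8', $ch);
        my $bits  = join ' ', map { sprintf '%08b', $_ } unpack 'C*', $bytes;
        push @rows, sprintf '%04X %s', ord($ch), $bits;
    }
    return @rows;
}

print "$_\n" for utf8_rows($str);
```

Running the same helper on whatever comes back from the database (after decoding it) gives a row-by-row comparison against what was written.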
what i get back from the db is
I've reformatted this slightly:
                   binary
A                  01000001
B                  01000010
c                  01100011
greek alpha        11001110 10110001
german scharfes s  11000011 00111111
cyrillic e         11010001 00111111
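The high bit has been lost from some of those bytes: the second byte
of each of the last two characters (10011111 and 10001101) has come
back as 00111111, which is 0x3F, an ASCII '?'.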
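To see exactly which bytes changed, the two dumps can be compared position by position. This is only a sketch: the byte values are copied by hand from the dumps above, and changed_bytes is a made-up helper name.

```perl
use strict;
use warnings;

# Byte values copied from the two dumps above.
my @sent     = (0x41, 0x42, 0x63, 0xCE, 0xB1, 0xC3, 0x9F, 0xD1, 0x8D);
my @returned = (0x41, 0x42, 0x63, 0xCE, 0xB1, 0xC3, 0x3F, 0xD1, 0x3F);

# Hypothetical helper: list [index, sent byte, returned byte] for every
# position where the two arrays disagree.
sub changed_bytes {
    my ($before, $after) = @_;
    return map  { [ $_, $before->[$_], $after->[$_] ] }
           grep { $before->[$_] != $after->[$_] } 0 .. $#$before;
}

for my $diff (changed_bytes(\@sent, \@returned)) {
    my ($i, $was, $is) = @$diff;
    printf "byte %d: %08b -> %08b (0x%02X, '%s')\n", $i, $was, $is, $is, chr $is;
}
```

Only the continuation bytes of the last two characters differ, and both now hold the same value.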
You'll probably need to solve that before worrying about flagging the
string as utf8 (for which Encode::_utf8_on(...) is okay).
Right now that would 'work', but the utf8 bytes have already been corrupted.
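One way to catch this kind of corruption, rather than just setting the flag, is a strict decode. Encode::_utf8_on only turns the flag on and does not check that the bytes are valid, whereas Encode::decode with FB_CROAK dies on malformed input. A minimal sketch, with the byte strings copied from the dumps above and decode_or_undef a made-up helper name:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Bytes as written to the database, and as they came back.
my $sent     = "ABc\xCE\xB1\xC3\x9F\xD1\x8D";
my $returned = "ABc\xCE\xB1\xC3\x3F\xD1\x3F";

# Hypothetical helper: strictly decode UTF-8 bytes into a character
# string, returning undef if the bytes are not valid UTF-8.
sub decode_or_undef {
    my ($bytes) = @_;
    my $copy = $bytes;    # decode() may modify its argument when CHECK is set
    return scalar eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
}

for my $pair ([ sent => $sent ], [ returned => $returned ]) {
    my ($label, $bytes) = @$pair;
    my $chars = decode_or_undef($bytes);
    printf "%s: %s\n", $label,
        defined $chars ? length($chars) . " characters" : "not valid UTF-8";
}
```

The design choice here is to validate first and only then treat the data as characters, so a damaged round trip shows up as an error instead of a silently malformed string.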
Perhaps the dbi-users mailing list would be a better place for this.
I'm sure others have been here before.
Tim.
p.s. Extending the DBI spec to cover utf8 is high on my to-do list.