On Jun 16, 2010, at 12:04 AM, Henning Michael Møller Just wrote:
Hello (loved your PostgreSQL presentation at the most recent OSCON, BTW)
Thanks. Come see my tutorial at OSCON this year, if you can: Test-Driven
Database Development. :-) Not sure I can make a tutorial as entertaining, alas.
Perhaps if I bring beer for the audience.
Which editor do you use? When I load the script in Komodo IDE 5.2, the string
looks broken. When I run the script (ActivePerl 5.10.1 on Windows), only the
second line is correct; the first (no surprise) and third are broken.
Yes, that's how it looks to me in GNU Emacs (compiled from source with Cocoa
bindings).
Loading the file in UltraEdit-32 13.20+3, set to not convert the script on
loading, it becomes obvious that what should have been one character is
represented by 4 bytes, \xC3 \x84 \xC2 \x8D, which modern editors would
probably show as 2 characters and as broken.
Right.
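
To make the four bytes concrete, here is a minimal sketch in Python (not the Perl Encode call from the script). It assumes the string was UTF-8 encoded twice, with the first result misread as Latin-1 in between. Under that assumption the bytes unwind to a single character, U+010D. The variable names are mine, purely for illustration.

```python
# The four bytes reported in the thread.
broken = b"\xC3\x84\xC2\x8D"

# Step 1: decode as UTF-8. This yields two characters, U+00C4 and U+008D,
# which is what an editor shows as "2 characters and as broken".
once = broken.decode("utf-8")

# Step 2: map those two code points back to raw bytes. Latin-1 maps
# code points 0-255 one to one onto bytes, so this gives b"\xC4\x8D".
inner = once.encode("latin-1")

# Step 3: decode those bytes as UTF-8 again. Assuming double encoding,
# this is the single character that was intended.
fixed = inner.decode("utf-8")

print([hex(ord(c)) for c in once])   # code points after one decode
print(inner)                         # the original two-byte sequence
print(hex(ord(fixed)), len(fixed))   # the single recovered code point
```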
It looks to me like the string is being displayed as a byte representation of
the characters, if that makes sense. My English isn't perfect :-/ and what I
am trying to say is that this is a problem that I am quite familiar with. It
happens whenever the source and the reader do not agree on whether a string
is encoded in UTF-8 or not.
Apparently Encode fixes the incorrect string, which is nice. The interesting
question is where this should be fixed. If it's at Yahoo! Pipes, you'll probably
have to use Encode as a work-around for some time...
Yes.
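
For anyone wanting the general shape of such a work-around, here is a rough sketch in Python rather than Perl's Encode. The function name `undo_double_utf8` is hypothetical, and it assumes the damage is exactly one extra round of UTF-8 encoding via Latin-1. It leaves the string alone if that assumption does not hold.

```python
def undo_double_utf8(text: str) -> str:
    """Try to reverse one extra round of UTF-8 encoding.

    Assumption: the text was UTF-8 bytes misread as Latin-1 and then
    encoded as UTF-8 again. If the text does not fit that pattern,
    it is returned unchanged.
    """
    try:
        # Turn each code point back into a byte, then decode as UTF-8.
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Code points above 255, or bytes that are not valid UTF-8:
        # the string was probably fine to begin with.
        return text


# The broken form from the thread, after a single UTF-8 decode.
broken = b"\xC3\x84\xC2\x8D".decode("utf-8")
repaired = undo_double_utf8(broken)
print(hex(ord(repaired)))

# An already correct string passes through untouched.
print(undo_double_utf8(repaired) == repaired)
```

A caveat: a correct string that happens to consist only of Latin-1 characters forming valid UTF-8 bytes would be altered too, so this is only safe on input known to come from the broken source.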
Best,
David