Re: Don't use the \C escape in regexes

On 04.05.2010 at 13:06, Michael Ludwig wrote:

Is it this (theoretically fragile) implicitness in handling character strings 
that makes \C a bad idea?

But it is probably not as bad an idea as relying on the default platform 
encoding in Java ("default charset" in Java API doc lingo), which may differ 
from country to country and from installation to installation.

http://java.sun.com/javase/6/docs/api/java/lang/String.html#String%28byte[]%29


Or, as a closer counterpart to encoding via \C in Perl:

http://java.sun.com/javase/6/docs/api/java/lang/String.html#getBytes%28%29

  public byte[] getBytes()
    Encodes this String into a sequence of bytes
    using the platform's default charset, storing
    the result into a new byte array.

This is a much more serious and real problem than implicit encoding via \C in 
Perl, because Java installations do not all use the same platform encoding, 
whereas all current Perl installations use the same internal encoding. (All 
Java installations use the same internal encoding, UTF-16, I think, but this 
fact is well hidden from the interface.)
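
To make the Java side concrete, here is a minimal sketch, not a definitive recipe. The class name CharsetDemo and the helpers encode/decode are made up for illustration, and StandardCharsets assumes Java 7 or later (on Java 6 you would use Charset.forName("UTF-8") instead). It contrasts the implicit getBytes() with the overloads that name the charset:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class CharsetDemo {
    // Encode with an explicitly named charset: same bytes on every installation.
    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    // Decode with an explicitly named charset.
    static String decode(byte[] bytes, Charset cs) {
        return new String(bytes, cs);
    }

    public static void main(String[] args) {
        // "Gruesse" with u-umlaut and sharp s, written as escapes so the
        // encoding of this source file does not matter.
        String s = "Gr\u00fc\u00dfe";

        // Implicit: the result depends on the platform's default charset.
        byte[] implicit = s.getBytes();
        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("getBytes():      " + Arrays.toString(implicit));

        // Explicit: the result is fixed by the charset named in the call.
        byte[] utf8 = encode(s, StandardCharsets.UTF_8);
        byte[] latin1 = encode(s, StandardCharsets.ISO_8859_1);
        System.out.println("UTF-8 length:      " + utf8.length);
        System.out.println("ISO-8859-1 length: " + latin1.length);

        // Decoding with a charset other than the one used for encoding
        // silently yields a different string.
        String wrong = decode(utf8, StandardCharsets.ISO_8859_1);
        System.out.println("round trip intact: " + s.equals(wrong));
    }
}
```

The first two lines of output vary with the installation, which is exactly the problem; the explicit variants do not.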

-- 
Michael.Ludwig (#) XING.com
