On Wednesday, March 20, 2002, at 06:55, Anton Tagunov wrote:
Hello, Dan!
Hello, Jarkko!
Hello, Nick!
I'm a bit confused about perl-unicode(_at_)perl(_dot_)org. Is that a better place
for our conversation? Is it alive? Does it have any traffic?
perl-unicode(_at_)perl(_dot_)org is good for me. Its traffic is moderate. (I
ought to subscribe to p5p, and I once did, but its traffic is too heavy for
me. I may do so in the future, but for the time being perl-unicode is the
place I use.)
[snip]
So
"CES" === "coded character set"
"CCS" ne "coded character set"
"CES" ne "CCS"
When it comes to character handling, we tend to be rather loose about
terminology, but I think that's okay so long as we can tell the
difference. I try to be careful not to say 'char' to mean 'byte', but
even I, living in a multibyte world, fail sometimes....
As for CCS and CES, here is my implicit glossary.
byte = octet. 8 bits
character = the smallest chunk of data that can be, ahem, supposed to be,
handled by text editors
CCS = character set, often abbreviated as 'charset'
CES = character encoding or simply encoding.
But as you see, even MIME headers are confused here, as in
"Content-Type: text/plain; charset=iso-2022-jp". So we have to tweak
perl's motto here: "There is more than one way to say it" :)
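
To make the glossary above a little more concrete, here is a minimal sketch
of the character / character-set / encoding distinction. It is in Python
rather than Perl and is not from the original thread; the sample string and
variable names are only illustrations. Note also that it uses Unicode as the
character set throughout, whereas iso-2022-jp is really defined over the JIS
character sets, so treat it as a rough picture rather than a precise model.

```python
# Illustrative sketch only (Python, not Perl; not part of the original thread).
text = "日本語"  # three characters, as a text editor would count them

# "CCS" view: each character is assigned a number in a character set
# (here Unicode code points, which is what Python strings hold).
code_points = [ord(ch) for ch in text]

# "CES" view: the same characters serialized to octets. Different encodings
# give different octet sequences for the same characters.
utf8_octets = text.encode("utf-8")
iso2022jp_octets = text.encode("iso-2022-jp")

print("characters:", len(text))
print("code points:", [hex(cp) for cp in code_points])
print("UTF-8 octets:", len(utf8_octets))
print("ISO-2022-JP octets:", len(iso2022jp_octets))
```

The character count stays the same while the octet count depends on the
encoding, which is why saying 'char' to mean 'byte' goes wrong in a
multibyte world.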
And have some sleep :-)))
My best regards, Anton
I need just that, too. It is seven fifteen JST. I am a nocturnal
creature, but it has been a long time since the sun came up....
Dan the Man with Too Many Words to Define