Terminology; byte? char? charset? encoding?

On Wednesday, March 20, 2002, at 06:55, Anton Tagunov wrote:

Hello, Dan!
Hello, Jarkko!
Hello, Nick!

I'm a bit confused about perl-unicode(_at_)perl(_dot_)org. Is that a better place
for our conversation?  Is it alive? Does it have any traffic?

perl-unicode(_at_)perl(_dot_)org is good for me. Its traffic is moderate. (I ought to subscribe to p5p, and I once did, but its traffic is too heavy for me. I may do so in the future, but for the time being perl-unicode is the place I use.)

[snip]
   So
     "CES" === "coded character set"
     "CCS"  ne "coded character set"
     "CES"  ne "CCS"

When it comes to character handling, we tend to be loose about terminology, but I think that's okay so long as we can tell the difference. I try to be careful not to say 'char' to mean 'byte', but even I, living in a multibyte world, fail sometimes...

  As for CCS and CES, here is my implicit glossary.

byte                    = octet; 8 bits
character               = the smallest chunk of data that can be, ahem, is
                          supposed to be, handled by text editors
CCS                     = character set, often abbreviated as 'charset'
CES                     = character encoding, or simply encoding
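
To make the byte/character and CCS/CES distinction concrete, here is a small sketch. It is in Python purely for illustration (nothing above prescribes a language), and the sample character and the codec names are my own choices, not something taken from this thread. It encodes one character under several encodings and compares the character count with the byte counts.

```python
# One character: HIRAGANA LETTER A (U+3042). The sample is an assumption
# chosen for illustration; any non-ASCII character would do.
text = "\u3042"

# Several encodings (CES) that can all represent this character.
# These are Python's codec names, used here as an assumption.
encodings = ["utf-8", "euc_jp", "shift_jis", "iso2022_jp"]
encoded = {name: text.encode(name) for name in encodings}

# Number of characters: independent of any encoding.
print("characters:", len(text))

# Number of bytes: depends entirely on the encoding chosen.
for name, data in encoded.items():
    print(f"{name:12} {len(data)} bytes  {data.hex()}")

# Decoding with the matching encoding gives back the same character.
roundtrip = {name: data.decode(name) for name, data in encoded.items()}
```

The point is only that "one character" and "N bytes" are different statements, and that N changes with the encoding while the character does not.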

But as you see, even MIME headers are confused here, as in "Content-Type: text/plain; charset=iso-2022-jp", where the "charset" parameter actually names an encoding. So we have to tweak perl's motto here: "There is more than one way to say it" :)
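
A rough sketch of that MIME confusion, again in Python and only as an illustration: the message below is made up, and the only thing it shows is that the value of the "charset" parameter is what you hand to the decoder as an encoding name.

```python
from email import message_from_bytes

# A made-up message for illustration. The body is HIRAGANA LETTER A
# (U+3042) written in ISO-2022-JP, which is 7-bit safe.
raw = (
    b"Content-Type: text/plain; charset=iso-2022-jp\n"
    b"Content-Transfer-Encoding: 7bit\n"
    b"\n"
    b"\x1b$B$\"\x1b(B\n"
)

msg = message_from_bytes(raw)

# MIME calls this a "charset" ...
charset = msg.get_content_charset()

# ... but it is used as the name of an encoding when turning the body
# bytes back into characters.
body_bytes = msg.get_payload(decode=True)
body_text = body_bytes.decode(charset)

print(charset)
print(len(body_bytes), "bytes ->", len(body_text.strip()), "character")
```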


And have some sleep :-)))

My best regards, Anton

I just need that, too. It is seven fifteen in JST. I am a nocturnal creature, but it has been a long time since the sun came up...


Dan the Man with Too Many Words to Define