perl-unicode

RE: DBD:mysql and UNICODE

2001-08-02 08:02:53
Just so I understand. . .and I think I understood UNICODE BEFORE I started 
reading all the literature that seemed to confuse the matter. :)

UNICODE is a character encoding that can handle any character irrespective of 
language.
When I output to the web I will need to convert UNICODE to some appropriate 
character-set based upon the language selection. 

Is this correct?  Or can this be done automatically. . .or at least, can I just 
avoid it, send the UNICODE data directly to a web browser, and let the 
browser do whatever is necessary?  As I intend to develop a system that can 
handle an arbitrary number of languages, I want to let the code handle any 
language without necessarily having to add more and more code to support it 
-- I would love it if I could just choose one flavor -- UNICODE -- and have 
that be it.  But hey, I know I do not live in an ideal world. . .  ;)

I do appreciate your help.

Thanks,
Ward

-----Original Message-----
From: Andrew McNaughton [mailto:andrew(_at_)aniwa(_dot_)wallace(_dot_)lan]
Sent: Wednesday, August 01, 2001 9:27 PM
To: Vuillemot, Ward W
Subject: Re: DBD:mysql and UNICODE




On Wed, 1 Aug 2001, Vuillemot, Ward W wrote:

Date: Wed, 1 Aug 2001 15:57:16 -0700
From: "Vuillemot, Ward W" 
<Ward(_dot_)Vuillemot(_at_)PSS(_dot_)Boeing(_dot_)com>
To: "'perl-unicode(_at_)perl(_dot_)org'" <perl-unicode(_at_)perl(_dot_)org>
Subject: DBD:mysql and UNICODE

I am looking to develop a set of databases that can handle
international character sets.  For example, I want to have menu items
that can be changed on the fly from, say, English to Japanese to
German to Chinese.

Should I create a table that correlates each language with a UNICODE
set?  And then create a table where each row is for a specific
language and the columns are the individual entries?  After that,
can I use a lookup into the first table based on the key of the second
table to determine what type of UNICODE character-set it is?  (Sorry,
I am typing out loud as it were ;) ).

Your character set in the database *is* Unicode.  There's only one Unicode
character set.  All other common to medium-rare character sets are subsets
of that one big set.  Keep things simple and store nothing in your
database that's not in Unicode.

You could store your strings as you say, but I'd be inclined to have every
string in its own row, and have a column which identifies the language.
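
A minimal sketch of that one-row-per-string layout, written in Python with
an in-memory SQLite database rather than Perl and MySQL so that it runs on
its own.  The table name ui_strings, its columns, the menu keys, and the
lookup_label helper are all made up for illustration; nothing here comes
from the original thread.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with one row per (string, language) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE ui_strings (
            string_key TEXT NOT NULL,  -- hypothetical identifier for a menu item
            lang       TEXT NOT NULL,  -- language code for this translation
            label      TEXT NOT NULL,  -- the translated string, stored as Unicode
            PRIMARY KEY (string_key, lang)
        )
        """
    )
    rows = [
        ("menu.file", "en", "File"),
        ("menu.file", "de", "Datei"),
        ("menu.file", "ja", "ファイル"),
        ("menu.edit", "en", "Edit"),
        ("menu.edit", "de", "Bearbeiten"),
        ("menu.edit", "ja", "編集"),
    ]
    conn.executemany("INSERT INTO ui_strings VALUES (?, ?, ?)", rows)
    return conn


def lookup_label(conn, string_key, lang):
    """Return the label for one string in one language, or None if missing."""
    row = conn.execute(
        "SELECT label FROM ui_strings WHERE string_key = ? AND lang = ?",
        (string_key, lang),
    ).fetchone()
    return row[0] if row else None


conn = build_demo_db()
print(lookup_label(conn, "menu.file", "de"))
```

With this shape, adding a language means adding rows, not columns, and no
per-language table of character sets is needed because every label is
already Unicode text.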

For a given language (e.g. English), there might be multiple possible
character encodings (e.g. iso-8859-1, cp1252, utf-8), and you might choose
to support more than one in your web output.  You might store
language/character encoding combinations in your database, but character
encoding and character set are not to be confused.
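
To make the set-versus-encoding distinction concrete, here is a small
Python sketch (again Python only so it is self-contained; the helper name
encodings_of is invented for this example).  It takes the same Unicode
text and encodes it with each of the three encodings named above, recording
None where an encoding cannot represent the text.

```python
def encodings_of(text, encodings=("iso-8859-1", "cp1252", "utf-8")):
    """Map each encoding name to the encoded bytes, or None if not encodable."""
    result = {}
    for name in encodings:
        try:
            result[name] = text.encode(name)
        except UnicodeEncodeError:
            # This encoding has no byte sequence for some character in text.
            result[name] = None
    return result


# "caf\u00e9" is "cafe" with an acute accent; "\u20ac" is the euro sign.
for sample in ("caf\u00e9", "\u20ac"):
    print(ascii(sample), encodings_of(sample))
```

The characters are the same in every case; only the bytes differ.  The
euro sign shows the limit of the smaller sets: cp1252 has a byte for it,
utf-8 uses three bytes, and iso-8859-1 cannot represent it at all.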
