perl-unicode

RE: UTF-16 -> UTF-8

2001-11-22 17:28:17
I assumed that you could write your data to a delimited file and have Perl
call on Access to import it using a VB macro. As far as I can tell, this
would work, no matter how the text is structured.

Since you prefer to write into the database file, the problem apparently
becomes one of translating UTF-8 to UTF-16 before insertion. That is a very
simple Perl script, and the conversion function is available in several
libraries. Why is this a problem?
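
For illustration, here is a minimal sketch of such a conversion. It assumes
the Encode module (bundled with Perl 5.8 and later), which is only one of the
possible libraries and not necessarily the one meant above. The function name
utf8_to_utf16le is made up for this example, and the little-endian, no-BOM
output is an assumption about what the consumer expects.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Convert a string of UTF-8 bytes to UTF-16 bytes.
# Assumption: little-endian output without a byte order mark; use 'UTF-16'
# instead of 'UTF-16LE' if the consumer expects a BOM.
sub utf8_to_utf16le {
    my ($utf8_bytes) = @_;
    # FB_CROAK makes malformed input die instead of being silently replaced.
    my $chars = decode('UTF-8', $utf8_bytes, Encode::FB_CROAK);
    return encode('UTF-16LE', $chars);
}

my $utf8  = "\xC4\xA9";                  # U+0129 (i with tilde) as UTF-8 bytes
my $utf16 = utf8_to_utf16le($utf8);
print unpack('H*', $utf16), "\n";        # prints 2901
```

Whether the resulting bytes can be handed to Access unchanged depends on the
database driver in use, which this sketch does not cover.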

If that isn't the answer, it means that I don't understand the question. In
that case, an example is in order, showing the input data, the transformed
data in UTF-8, and the data to be inserted into Access in UTF-16, plus any
other transformations you need to make along the way, and a clear
demonstration of whatever is wrong with the result.

-----Original Message-----
From: Rui Ribeiro [mailto:ruirib(_at_)computer(_dot_)org]
Sent: Thursday, November 22, 2001 5:29 AM
To: Edward Cherlin; perl-unicode(_at_)perl(_dot_)org
Subject: RE: UTF-16 -> UTF-8


Hi Edward,


You can tell Access about the encoding when you import a file.

In Access 2000, open the File menu, and on the Get External Data submenu,
select Import. The file browser dialog box will open.

The problem with this is that I cannot simply import the file, because I
have structured text. The texts I'm working with are medieval Portuguese
dictionaries. They follow a dictionary structure, although a somewhat
unusual one, and I need to recognize the different parts of that structure.
That's why I'm using Perl: to parse the text.

In the end, the limitation on writing Unicode to the database from Perl can
be overcome if I don't encode the text files in Unicode and instead keep the
codes the original transcribers used to represent the required characters
(for example, =i to represent an I with a tilde). I can then transform the
codes to Unicode characters in the database. I was looking for a more
elegant solution, but if I can't find one I'll have to live with that.
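
To make the code-to-character mapping concrete, here is a rough sketch of
the substitution written in Perl. It is illustrative only: the message
describes doing this step in the database. The only entry taken from the
message is =i, and mapping it to the lowercase letter U+0129 is an
assumption, as are the names %code_to_char and expand_codes and the shape
of the codes (an equals sign followed by one letter).

```perl
use strict;
use warnings;

# Transcriber code => Unicode character.
# Only "=i" comes from the message; the lowercase form (U+0129) is assumed.
# Further codes would be added here once the transcribers' scheme is known.
my %code_to_char = (
    '=i' => "\x{0129}",    # LATIN SMALL LETTER I WITH TILDE
);

# Replace every known code in $text; unknown "=x" sequences are left as-is.
sub expand_codes {
    my ($text) = @_;
    $text =~ s/(=[A-Za-z])/exists $code_to_char{$1} ? $code_to_char{$1} : $1/ge;
    return $text;
}

binmode(STDOUT, ':encoding(UTF-8)');
print expand_codes("a=ib"), "\n";    # made-up input, not a real dictionary entry
```

Leaving unknown codes untouched keeps any sequence the table does not yet
cover visible in the output, so it can be found and added later.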

Thanks for your suggestion, though.

Regards

Rui Ribeiro

