David Carlisle wrote:
...
having said that the example you gave was
TfxsbGVy
which is the base64 encoding of the Latin-1 encoded characters, which is
rather strange. Do you really want to use Latin-1? How would you encode
characters above position 127 in that case?
.....
I just looked at
http://www.faqs.org/rfcs/rfc2849.html
and I think that your example is wrong: the spec uses a datatype of
BASE64-UTF8-STRING
so presumably the characters should be UTF-8 encoded before base64
encoding (so u-umlaut would take two bytes, not one, in the string that
is being base64 encoded).
You are absolutely right.
My example was not taken from the original addressbook file (which is
encoded in UTF-8). I just wrote it down and did not pay enough
attention.
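To make the difference concrete, here is a minimal Python sketch (the
variable names are only illustrative) that decodes the example string
as Latin-1 and then base64 encodes the same characters from both
Latin-1 and UTF-8 bytes:

```python
import base64

# Decode the example from the thread, assuming its bytes are Latin-1.
name = base64.b64decode("TfxsbGVy").decode("latin-1")

# Re-encode the same characters from Latin-1 bytes and from UTF-8 bytes.
latin1_bytes = name.encode("latin-1")
utf8_bytes = name.encode("utf-8")
latin1_b64 = base64.b64encode(latin1_bytes).decode("ascii")
utf8_b64 = base64.b64encode(utf8_bytes).decode("ascii")

print(len(latin1_bytes), latin1_b64)  # 6 TfxsbGVy
print(len(utf8_bytes), utf8_b64)      # 7 TcO8bGxlcg==
```

The u-umlaut is one byte (0xFC) in Latin-1 and two bytes (0xC3 0xBC) in
UTF-8, so the two base64 strings differ.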
I think I will first base64 encode all text nodes using an external
(Perl) script and then convert this intermediate file to LDIF using my
(adapted) XSL stylesheet. The LDIF file itself is only fed into the
LDAP server and does not have to be human-readable.
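A rough sketch of that preprocessing step, written in Python rather
than Perl purely for illustration. The function name
encode_text_nodes and the element names in the sample are
hypothetical, and the sketch assumes that only the leading text of
each element matters (no mixed content):

```python
import base64
import xml.etree.ElementTree as ET


def encode_text_nodes(xml_source: str) -> str:
    """Return the XML with each element's text replaced by base64(UTF-8 bytes)."""
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        # Skip empty or whitespace-only text; tail text is left untouched.
        if elem.text and elem.text.strip():
            raw = elem.text.encode("utf-8")
            elem.text = base64.b64encode(raw).decode("ascii")
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample record; \u00fc is u-umlaut.
sample = "<person><sn>M\u00fcller</sn></person>"
print(encode_text_nodes(sample))
# <person><sn>TcO8bGxlcg==</sn></person>
```

The stylesheet would then only have to copy the already encoded values
into the LDIF output, using the "::" form that RFC 2849 defines for
base64 values.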
Thanks for your answers.
Michael