perl-unicode

Can from_to($s, SRC, TGT) leave chars missing in TGT unchanged?

2002-10-10 11:18:22
Encode::from_to($string, SOURCE, TARGET) changes every character that is
missing in TARGET into a '?' (to be exact, into the <subchar>). This is
probably the most reasonable *default* behavior. But I could give a
couple of arguments why the other behavior (leaving characters missing
in the target encoding unchanged) is also reasonable, and sometimes much
more so.
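
To make the default concrete, here is a tiny sketch. The sample bytes
are my own choice for illustration: in koi8-r, 0xC1 is the Cyrillic
small letter "a" and 0x80 is a box-drawing character that windows-1251
(cp1251) does not have.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Two koi8-r bytes: 0xC1 (Cyrillic small "a") and 0x80 (box-drawing
# character U+2500, which has no counterpart in cp1251).
my $s = "\xC1\x80";

# from_to() converts $s in place.
from_to($s, 'koi8-r', 'cp1251');

# Print the resulting bytes in hex, joined by dots. The first byte is
# remapped; the second is replaced by the substitution character.
printf "%vX\n", $s;
```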

My native language, Russian, suffers from having FIVE one-byte encodings
(windows-1251, koi8-r, iso-8859-5, cp866, "MacCyrillic"), all of which
are in common use to varying degrees. Conversions from one encoding to
another are very frequent, and sometimes we simply have to undo one with
the reverse conversion.

MY QUESTION IS: how can I convert text from one one-byte encoding to
another so that characters missing in the target encoding are left
unchanged instead of being turned into '?'?

I did try to find it out myself. At some point I thought that
from_to($string, SRCenc, TGTenc, ENCODE_LEAVE_SRC)
was just what I wanted, because it LEAVEs those chars in SRC that
ENCODE_NOREP... but unfortunately no: it leaves the whole source string
untouched, unconditionally.
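
The behavior I am after can be approximated by hand, one byte at a time.
This is only a rough sketch, assuming both encodings are one-byte ones;
from_to_keep is a hypothetical helper name of mine, not part of Encode.
It uses FB_CROAK so that an unmappable byte raises an error, and in that
case copies the original byte through.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Hypothetical helper (not part of Encode): convert $octets from one
# one-byte encoding to another, copying through unchanged any byte
# that cannot be mapped. Assumes both encodings are single-byte.
sub from_to_keep {
    my ($octets, $src, $tgt) = @_;
    my $out = '';
    for my $byte (split //, $octets) {
        local $@;
        # FB_CROAK makes decode/encode die on an unmappable character;
        # LEAVE_SRC stops them from modifying their input in place.
        my $conv = eval {
            my $char = decode($src, $byte, FB_CROAK | LEAVE_SRC);
            encode($tgt, $char, FB_CROAK | LEAVE_SRC);
        };
        # On failure, keep the original source byte as it was.
        $out .= defined $conv ? $conv : $byte;
    }
    return $out;
}

# Usage: 0xC1 is remapped, 0x80 has no cp1251 equivalent and is kept.
my $converted = from_to_keep("\xC1\x80", 'koi8-r', 'cp1251');
printf "%vX\n", $converted;
```

Going byte by byte is slow, and a byte that is kept will mean a different
character in the target encoding, so a built-in way would be much
better.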

Thanks in advance for any clues.

If my English and/or my question is far from clear, please tell me and
I'll do my best to rephrase it.

-- 
Anton
