Re: Converting ISO 8859-2 characters

On Fri, 19 Jan 2001 19:02:53 -0800
   Earl Hood <ehood(_at_)hydra(_dot_)acs(_dot_)uci(_dot_)edu> wrote:

On January 19, 2001 at 22:30, "Peter Seitz" wrote:

<CharsetConverters>
plain;          mhonarc::htmlize;
us-ascii;       mhonarc::htmlize;
iso-8859-1;     mhonarc::htmlize;
iso-8859-2;     iso_8859::str2sgml;     iso8859-ref.pl
iso-8859-3;     iso_8859::str2sgml;     iso8859.pl
iso-8859-4;     iso_8859::str2sgml;     iso8859.pl
iso-8859-5;     iso_8859::str2sgml;     iso8859.pl
iso-8859-6;     iso_8859::str2sgml;     iso8859.pl
iso-8859-7;     iso_8859::str2sgml;     iso8859.pl
iso-8859-8;     iso_8859::str2sgml;     iso8859.pl
iso-8859-9;     iso_8859::str2sgml;     iso8859.pl
iso-8859-10;    iso_8859::str2sgml;     iso8859.pl
default;        -ignore-
</CharsetConverters>

But the result is not what I expected. The characters are still
converted using the entities. I am stuck here.


Use a different package/function name.  Both libraries define the
same function, so whichever one is read last, iso8859.pl or
iso8859-ref.pl, overrides the other's definition.

Change iso8859-ref.pl to use a package name of "iso_8859_ref",
and register it as:

iso-8859-2;     iso_8859_ref::str2sgml;     iso8859-ref.pl
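
The override described above can be reproduced in isolation. The
following is only a minimal sketch: the two "libraries" are stand-in
strings rather than the real iso8859.pl and iso8859-ref.pl, the
%converters table and call_by_name() are hypothetical helpers, and the
only assumption about MHonArc is that converters are registered by
function name, as in the resource above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-ins for the two library files (NOT the real MHonArc sources).
# Both declare the same package and the same function name.
my $lib_entities = q{
    package iso_8859;
    sub str2sgml { return "entities" }
    1;
};
my $lib_refs = q{
    package iso_8859;
    sub str2sgml { return "references" }
    1;
};

# Hypothetical registration table: charset => function name as a string.
my %converters = (
    'iso-8859-2' => 'iso_8859::str2sgml',
    'iso-8859-3' => 'iso_8859::str2sgml',
);

# Hypothetical helper: call a function given its fully qualified name.
sub call_by_name {
    my ($name, @args) = @_;
    no strict 'refs';
    return &{$name}(@args);
}

# Case 1: same package in both files. The file loaded last wins.
{
    no warnings 'redefine';    # the redefinition is the point of the demo
    eval $lib_entities or die $@;
    eval $lib_refs     or die $@;
}
my $clash_2 = call_by_name($converters{'iso-8859-2'});
my $clash_3 = call_by_name($converters{'iso-8859-3'});
print "clash: iso-8859-2 -> $clash_2, iso-8859-3 -> $clash_3\n";

# Case 2: give the second library its own package and register that name.
(my $lib_renamed = $lib_refs) =~ s/package iso_8859;/package iso_8859_ref;/;
{
    no warnings 'redefine';
    eval $lib_entities or die $@;    # restore the entity version
    eval $lib_renamed  or die $@;
}
$converters{'iso-8859-2'} = 'iso_8859_ref::str2sgml';
my $fixed_2 = call_by_name($converters{'iso-8859-2'});
my $fixed_3 = call_by_name($converters{'iso-8859-3'});
print "fixed: iso-8859-2 -> $fixed_2, iso-8859-3 -> $fixed_3\n";
```

In the first case both charsets end up with the "references" version,
because the second definition of iso_8859::str2sgml replaces the
first. After the rename, each charset calls its own function.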



I did as you suggested, but I still have problems. I don't know if
it's a bug or a feature.

Setting up the CharsetConverters as you described above converts only
the headers of the mail correctly, using the references:

To: "=?ISO-8859-2?Q?nov=FD =E8len?=" 
<test(_at_)fbzslinux(_dot_)tu-graz(_dot_)ac(_dot_)at>

gets converted to:

<LI><em>To</em>: "nov&#253; &#269;len" &lt;<A 
HREF="mailto:test(_at_)fbzslinux(_dot_)tu%2Dgraz(_dot_)ac(_dot_)at">test(_at_)fbzslinux(_dot_)tu-graz(_dot_)ac(_dot_)at</A>&gt;</LI>

But the message body still uses the entities defined in the iso8859.pl
file.
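
For reference, the header conversion shown above can be checked by
hand. This is a rough sketch using the core Encode module;
decode_q() and latin2_to_ncr() are hypothetical helpers written for
illustration and are not the str2sgml function from iso8859-ref.pl.
It only handles the encoded text itself and does not escape HTML
special characters.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: undo Q-encoding ("=XX" hex escapes, "_" for space).
sub decode_q {
    my ($q) = @_;
    $q =~ tr/_/ /;
    $q =~ s/=([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $q;
}

# Hypothetical helper: turn ISO-8859-2 bytes into numeric character
# references by decoding to Unicode and replacing every non-ASCII
# character with &#N;, where N is its Unicode code point.
sub latin2_to_ncr {
    my ($bytes) = @_;
    my $text = decode('iso-8859-2', $bytes);
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

my $bytes = decode_q('nov=FD =E8len');
my $html  = latin2_to_ncr($bytes);
print "$html\n";    # prints: nov&#253; &#269;len
```

Byte 0xFD in ISO-8859-2 is U+00FD (253) and byte 0xE8 is U+010D (269),
which matches the references in the converted To: header.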

I fiddled around a little with the Perl files from the
distribution. Changing the converter reference in the
mhinit.pl file works as I expected:

##  Charset filters
##
%readmail::MIMECharSetConverters = (
    # Character set         Converter Function
    #-------------------------------------------------------------------
    "plain",                "mhonarc::htmlize",
    "us-ascii",             "mhonarc::htmlize",
    "iso-8859-1",               "mhonarc::htmlize",
#    "iso-8859-2",               "iso_8859::str2sgml",
    "iso-8859-2",               "iso_8859_ref::str2sgml",
    "iso-8859-3",               "iso_8859::str2sgml",
# [...]
    "default",              "-ignore-",
);
%readmail::MIMECharSetConvertersSrc = (
    # Character set         Converter Function
    #-------------------------------------------------------------------
    "plain",                undef,
    "us-ascii",             undef,
    "iso-8859-1",               undef,
#    "iso-8859-2",               "iso8859.pl",
    "iso-8859-2",               "iso8859-ref.pl",
# [...]
    "default",              undef,
);

With this setting, the mail body is also translated correctly.

This leads me to assume that the definitions from the resource file
are not considered when converting the message body.

I was not able to find out where the settings from the resource file
are used during conversion, because I am only a Perl learner. So I guess
I have to wait for Earl to comment on this issue.

Thanks in advance


With best compliments

           Peter Seitz
--

  Graz University of Technology, Austria - Fac. f. Civil Engineering
  mailto:seitz(_at_)bzs(_dot_)tu-graz(_dot_)ac(_dot_)at - http://wwwbzs.tu-graz.ac.at/~seitz/

            Member of the Pegasus Mail Support Group
          Coordinator of the Pmail Translation Process

For information about translating Pegasus Mail, contact:
Han van den Bogaerde or Peter Seitz at
translation-coordinator(_at_)pmail(_dot_)gen(_dot_)nz