Jacob Palme <jpalme(_at_)dsv(_dot_)su(_dot_)se> writes:
At 08.16 -0500 04-09-02, Vaudreuil, Greg M (Greg) wrote:
I assume you get predominantly 8859-1 mislabeled as US-ASCII based on the fact
that you predominantly communicate with folks in a Western European language. I
bet someone using, say, Cyrillic would have a predominant error condition of a
different 8859 variant mislabeled as US-ASCII.
Why would we assume that broken mailers are any more likely to ensure that
characters are in 8859-1 than to ensure that the label is correct? What
if the standard, or the IETF in a BCP, says "treat this error as X" and a mailer
used in a different region made a different error?
My idea was to either have a setting for which charset
to assume in this case, or derive a setting from other
settings, such as the language or charset the person
has as default when sending mail.
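
A minimal sketch of the setting proposed above, written in Python purely for illustration. The function name `decode_with_fallback` and its `fallback` parameter are hypothetical, not taken from any mailer; the idea is only that a body labeled US-ASCII but containing high-bit bytes is re-decoded with a user-configured charset.

```python
def decode_with_fallback(raw: bytes,
                         labeled: str = "us-ascii",
                         fallback: str = "iso-8859-1") -> str:
    """Decode `raw` as labeled; on failure, retry with the configured fallback.

    `fallback` stands in for the per-user setting (hypothetical name).
    """
    try:
        # Strict decoding: any byte outside the labeled charset raises.
        return raw.decode(labeled)
    except UnicodeDecodeError:
        # The label was wrong; assume the configured charset instead.
        # Bytes that are invalid even there become U+FFFD.
        return raw.decode(fallback, errors="replace")


# A Western European user might configure iso-8859-1 ...
western = decode_with_fallback("café".encode("iso-8859-1"))

# ... while a Russian-speaking user might configure koi8-r.
cyrillic = decode_with_fallback("Привет".encode("koi8-r"), fallback="koi8-r")
```

The guess is only as good as the setting: the same KOI8-R bytes decoded with the iso-8859-1 fallback come out as unrelated Latin letters rather than as an error.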
Based on what I see that wouldn't help. My charset for sending mail
is iso-8859-15 (or these days UTF-8). But mislabeled mail
used to be mostly Chinese, with a recent trend toward Russian.
(Most, but not quite all, mislabeled mail is spam.)
What my mail tool does at present wasn't deliberate but a
feature of the way Perl's Encode module works: given something
claiming to be ASCII but having high-bit chars, I get them as
replacement characters. This flags that there is a problem
clearly enough!
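
The replacement-character behaviour described above can be sketched as follows. This is a Python analogue, not the Perl Encode code the tool actually runs, and the helper name `decode_as_labeled` is made up for the example.

```python
REPLACEMENT = "\ufffd"  # U+FFFD REPLACEMENT CHARACTER


def decode_as_labeled(raw: bytes, charset: str = "us-ascii") -> str:
    """Trust the label, but turn every undecodable byte into U+FFFD."""
    return raw.decode(charset, errors="replace")


# ISO-8859-1 bytes for "café", wrongly labeled as US-ASCII.
mislabeled = "café".encode("iso-8859-1")
text = decode_as_labeled(mislabeled)

# The presence of U+FFFD makes the mislabeling visible to the reader.
looks_mislabeled = REPLACEMENT in text
```

No charset is guessed here, so the output never silently shows the wrong letters; the cost is that the high-bit characters are lost from the displayed text.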