ietf-822

Re: character sets

1991-05-08 22:27:16
[..lots of text deleted, here and there..]

# However, there are some things that we need to include in the header
# if we want the software to behave intelligently. For example, if we
# want our UA to automatically convert Quoted-Readable to, say, Latin-1,
# we will need something like:
# 
#       Content-Encoding: Quoted-Readable
# 
# so that strings such as "J&o/rn" can be converted. This can probably
# be made to work even if the string is in EBCDIC. However, if we use
# hex in the quoted encoding, ASCII strings that get converted into
# EBCDIC may lose their meaning. E.g. "J\F8rn": if this string is in
# EBCDIC and you convert the \F8 to one byte, you probably won't get the
# o-slash that was intended.
# 
# Of course, if you know that the original code was Latin-1, you could
# still do something intelligent with "\F8" even if it's in EBCDIC.
# Therefore, a hex encoding must be accompanied by an indication of the
# original codeset. For example:
# 
#       Content-Encoding: Quoted-Printable, Latin-1
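
A small Python sketch of the quoted point, for illustration only. It assumes the EBCDIC variant is code page 037 (Python's `cp037` codec), which the message above does not specify, and `expand_hex_escapes` is a hypothetical helper, not part of any proposal in this thread.

```python
import re

def expand_hex_escapes(text: str, origin_charset: str) -> str:
    """Replace each \\XX escape with the character that byte XX
    denotes in origin_charset (hypothetical helper)."""
    def repl(match):
        return bytes([int(match.group(1), 16)]).decode(origin_charset)
    return re.sub(r"\\([0-9A-Fa-f]{2})", repl, text)

escaped = r"J\F8rn"

# Treating F8 as a byte of the local EBCDIC code page (assumed: cp037)
# yields a digit, not the intended o-slash.
naive = expand_hex_escapes(escaped, "cp037")

# Knowing the original codeset was Latin-1 recovers the o-slash.
correct = expand_hex_escapes(escaped, "latin-1")
```

The only difference between the two calls is the declared original codeset, which is why the hex form needs that label to travel with the message.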

I think Mark Crispin said in the prior message that the character set
should be specified by a resource-ref, something like:

        Content-Type: TEXT;Latin-1
        Content-Encoding: Quoted-Printable

The UA can then convert the quoted-printable text to 8-bit text,
and convert the 8-bit Latin-1 text to whatever the user wants.
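
The two-step conversion could look roughly like this in Python. This is a sketch under assumptions: the `=XX` form handled by the standard `quopri` module stands in for the Quoted-Printable encoding under discussion, `decode_body` is a hypothetical name, and the charset is taken to be the one named in Content-Type.

```python
import quopri

def decode_body(raw: bytes, charset: str = "latin-1") -> str:
    """Step 1: undo the Content-Encoding. Step 2: interpret the
    resulting 8-bit bytes in the charset from Content-Type."""
    eight_bit = quopri.decodestring(raw)   # b"J=F8rn" -> b"J\xf8rn"
    return eight_bit.decode(charset)

text = decode_body(b"J=F8rn")

# The UA can now re-encode the text in whatever the user's
# environment wants; UTF-8 is used here purely as an example.
local_bytes = text.encode("utf-8")
```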

# So what's my conclusion? If we allow people to send whatever they
# want, and we don't mark the text with a character encoding header, how
# can we interoperate?

I would like the character set information to be taken from
setlocale(), or something similar.  Then we would have things like:

        Content-Type: TEXT;en_US.88591
        Content-Type: TEXT;ja_JP.sjis
        Content-Type: TEXT;ja_JP.ujis
        Content-Type: TEXT;ja_JP.10646

Then a Shift-JIS user can send Japanese mail to a UJIS user, and no
matter what Content-Encoding is used (Quoted-Printable or ISO2022),
the UA can do the code conversions correctly and the user can read
the mail.
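
A rough Python sketch of that conversion, assuming the Content-Encoding has already been undone. The mapping from locale codeset suffixes to Python codec names is my own assumption for illustration; `codec_from_content_type` and `transcode` are hypothetical names, and the 10646 case is left out because no codec for it is implied by anything above.

```python
# Assumed mapping from locale codeset suffix to a Python codec name.
CODESET_TO_CODEC = {
    "88591": "latin-1",
    "sjis": "shift_jis",
    "ujis": "euc_jp",
}

def codec_from_content_type(value: str) -> str:
    """Map e.g. "TEXT;ja_JP.sjis" to "shift_jis"."""
    _, _, locale_name = value.partition(";")
    _, _, codeset = locale_name.strip().partition(".")
    return CODESET_TO_CODEC[codeset.lower()]

def transcode(body: bytes, sender_type: str, reader_type: str) -> bytes:
    """Convert a body from the sender's codeset to the reader's."""
    text = body.decode(codec_from_content_type(sender_type))
    return text.encode(codec_from_content_type(reader_type))

# A Shift-JIS user sends Japanese text; a UJIS user reads it.
sjis_body = "日本語".encode("shift_jis")
ujis_body = transcode(sjis_body, "TEXT;ja_JP.sjis", "TEXT;ja_JP.ujis")
```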

Hitoshi Doi, International Systems Engineering    
doi(_at_)jrdmax(_dot_)jrd(_dot_)dec(_dot_)com
Japan Research and Development Center
Digital Equipment Corporation Japan
