Re: [openpgp] Character encodings

2015-03-17 14:01:42
This would be a huge step backward. The proportion of text on the internet
that is UTF-8 is monotonically increasing toward 100%. Thank goodness.

On Mar 18, 2015 4:38 AM, "Wyllys Ingersoll" <wyllys(_at_)gmail(_dot_)com> wrote:

One area that I think needs some attention is the character encoding and
charsets for encrypted text messages.

RFC 4880 says that everything should be UTF-8.  However, the reality is that
UTF-8 is not used everywhere, and there are lots of clients that compose
messages in their native preferred character set (Latin5, Greek, Kanji,
etc.), and it's very difficult as an implementor to figure it out after the
fact without some indication from the sender.
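
To illustrate the guessing problem described above, here is a minimal
sketch of what a decrypting client might do with the plaintext bytes when
it has, at best, an unreliable hint. The function name decode_text, the
fallback order, and the use of Latin-1 as a last resort are assumptions made
for illustration; none of this comes from RFC 4880 or from any particular
implementation.

```python
from typing import Optional, Tuple


def decode_text(data: bytes, hint: Optional[str] = None) -> Tuple[str, str]:
    """Decode plaintext bytes, returning (text, charset actually used).

    Hypothetical fallback order (an assumption, not a standard):
      1. strict UTF-8
      2. the sender-supplied hint, if any
      3. Latin-1, which maps every byte and so never fails
    """
    # Strict UTF-8 rejects most text written in legacy 8-bit charsets,
    # so a successful decode is a fairly strong signal.
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        pass

    # Fall back to the hint; it may name an unknown codec (LookupError)
    # or simply not match the bytes (UnicodeDecodeError).
    if hint:
        try:
            return data.decode(hint), hint
        except (UnicodeDecodeError, LookupError):
            pass

    # Last resort: the result may be mojibake, but nothing is lost.
    return data.decode("latin-1"), "latin-1"
```

Without a hint, step 3 is the best this sketch can do for non-UTF-8 input,
which is exactly why some indication from the sender matters.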

The literal packet format only specifies three possible values: binary,
UTF-8 text, or plain text.  The ASCII Armor header may specify a different
charset (though unfortunately very few agents add the "Charset" PGP header).
Additionally, if the message had MIME headers, there may be yet another
charset indicated in MIME that differs from the ASCII Armor charset and the
literal packet data format byte.
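
As a rough sketch of how these competing indications might be collected, the
code below reads the "Charset" key from the ASCII Armor headers and then
checks whether the armor and MIME charsets agree. The function names
armor_charset and charset_hints_conflict are hypothetical, the header key is
matched case-insensitively as a leniency choice, and no precedence between
the sources is implied; the point of the thread is that none is defined.

```python
import codecs
from typing import Optional


def armor_charset(armored: str) -> Optional[str]:
    """Return the ASCII Armor "Charset" header value, or None if absent."""
    lines = armored.splitlines()

    # Locate the armor header line, e.g. "-----BEGIN PGP MESSAGE-----".
    for start, line in enumerate(lines):
        if line.startswith("-----BEGIN PGP "):
            break
    else:
        return None  # no armor found at all

    # Armor headers are "Key: Value" lines ending at the first blank line.
    for line in lines[start + 1:]:
        if not line.strip():
            return None
        key, sep, value = line.partition(": ")
        if sep and key.lower() == "charset":
            return value.strip()
    return None


def charset_hints_conflict(armor: Optional[str], mime: Optional[str]) -> bool:
    """True if both hints are present and name different known codecs."""
    if armor is None or mime is None:
        return False
    try:
        # codecs.lookup() resolves aliases such as "utf8" and "UTF-8".
        return codecs.lookup(armor).name != codecs.lookup(mime).name
    except LookupError:
        # Unknown charset name: treat as a conflict so the caller notices.
        return True
```

A caller could pass the MIME charset parameter, if present, as the second
argument and warn the user or fall back to guessing when a conflict is
reported.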

If the encrypting PGP software knows what character encoding was used to
compose the original message, there should be some way to communicate this
in the message that would be definitive so that the decrypting software can
present it the way it was originally intended.  As an implementor, this is
one of the trickiest areas to get right so that the end user sees the
message as it was originally intended.

openpgp mailing list
