Re: UTF-8 and literal packets
2004-03-24 16:03:48
I would rather have the description in the RFC say what it means,
rather than what to do with it. There are a number of reasons for that,
chief among them that we don't have all the answers.
I'm working now on an OS with native support for umpteen character sets
and encodings, one of which is UTF-8. There's a "Text Encoding" menu so
I can select one should things not be displayed correctly. In my case,
my MUA has the option of setting a property on the window as to what
character set it is, and then it's all magically handled.
Let's also not forget that while OpenPGP gets used a whole lot for
mail, it is an object security standard, not a mail security standard.
I can conceive of instances where whatever agent handles the text
knows it's not UTF-8, but has no idea what it is or how to
canonicalize it. In that case, it should do the best job it can. I can
also conceive of applications that have no good way to canonicalize
at all. In any event, we are not wise enough to tell the implementer
what to do.
What it means is simply that when you get such a literal packet, you
know the text is in UTF-8. Do the right thing with said text. If you
have a blob of text and you know it's not in UTF-8, then canonicalize,
or use the 't' flag, or even use the 'b' flag.
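
As a rough illustration of the choice described above, here is a minimal
sketch in Python. The function name, its parameters, and the reading of
"canonicalize" as "transcode to UTF-8" are assumptions made for the
example, not anything the draft prescribes. The 'u' flag is the one that
marks a literal packet as UTF-8 text; 't' and 'b' are the flags named
above.

```python
from typing import Optional, Tuple


def choose_literal_format(
    data: bytes, charset: Optional[str], is_text: bool = True
) -> Tuple[str, bytes]:
    """Pick a literal-packet format flag and payload (hypothetical helper).

    charset is whatever the caller believes the encoding to be,
    or None if it has no idea.
    """
    if not is_text:
        # Not text at all: opaque binary.
        return "b", data
    if charset is None:
        # Text, but we don't know the encoding: best effort is 't'.
        return "t", data
    try:
        text = data.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Unknown codec name, or the bytes don't match the claimed
        # charset: we can't canonicalize, so fall back to 't'.
        return "t", data
    # We know what the text is, so transcode it and mark it as UTF-8.
    return "u", text.encode("utf-8")
```

For example, Latin-1 input with a known charset comes back as ('u',
UTF-8 bytes), while text of unknown encoding is passed through unchanged
with 't'. An application that cannot do even this much is still free to
pick 't' or 'b' outright.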
Jon