
Re: UTF-8 and literal packets

2004-03-24 16:03:48

I would rather have the description in the RFC say what it means than say what to do with it. There are a number of reasons for that, chief among them that we don't have all the answers.

I'm working now on an OS with native support for umpteen character sets and encodings, one of which is UTF-8. There's a "Text Encoding" menu so I can select one should things not be displayed correctly. In my case, my MUA can set a property on the window saying which character set the text is in, and then it's all magically handled.

Let's also not forget that while OpenPGP gets used a whole lot for mail, it is an object security standard, not a mail security standard.

I can conceive of instances where the agent handling the text knows it's not UTF-8, but has no idea what it is, nor how to canonicalize it. In that case, it should do the best job it can. I can also conceive of applications that have no good way to canonicalize at all. In any event, we are not wise enough to tell the implementer what to do.

What it means is simply that when you get such a literal packet, you know the text is in UTF-8. Do the right thing with said text. If you have a blob of text and you know it's not in UTF-8, then canonicalize it, or use the 't' flag, or even use the 'b' flag.
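To make the sending side concrete, here is a minimal sketch in Python of one way an implementation might pick a flag. It is only an illustration, not something the RFC should mandate. The helper name prepare_literal and its charset parameter are hypothetical, and I am assuming the UTF-8 marker is spelled 'u' alongside the existing 't' and 'b' flags.

```python
from typing import Optional, Tuple


def prepare_literal(data: bytes, charset: Optional[str]) -> Tuple[str, bytes]:
    """Return (format_flag, body) for a literal packet.

    Hypothetical helper: `charset` is whatever the caller believes the
    text is encoded in, or None if it has no idea.
    """
    if charset is None:
        # Text in an unknown encoding: nothing to canonicalize from,
        # so fall back to plain 't' and leave the bytes alone.
        return "t", data
    try:
        text = data.decode(charset)
    except (LookupError, UnicodeDecodeError):
        # Either this platform has no codec for the charset, or the
        # bytes do not match it. Do the best job we can: use 't'.
        return "t", data
    # Canonicalize to UTF-8 so that the 'u' flag (assumed name) is truthful.
    return "u", text.encode("utf-8")
```

For example, prepare_literal(b"caf\xe9", "latin-1") yields the 'u' flag with the body transcoded to UTF-8, while a blob with no known charset keeps its bytes and gets 't'. An application that would rather not make any claim about the text could just as well return 'b' in the fallback branches.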

