Re: Clearsigning, MIME, etc.

2002-04-17 02:35:11

On 2002-04-16 18:05:51 -0700, Jon Callas wrote:

Clearsigning is not armoring. They are different. If I hear armoring, and you talk about clearsigning, we're going to talk at cross purposes. I understand your comments on clearsigning, but let's take them one step at a time.

I'm sorry for the confusion. With armoring proper (everything except clearsigning) the only problem is that an implementation needs to know, somehow, what character set the cleartext is in.

That can easily be solved by making the Charset header mandatory, and specifying that text without a charset header is to be assumed utf-8 under all circumstances. I'll cover that in another message.
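A minimal sketch of that defaulting rule, in Python. The function name `armor_charset` and the sample armor block are hypothetical; the only thing taken from the text above is the rule itself (use the Charset header if present, otherwise assume utf-8):

```python
def armor_charset(armored: str, default: str = "utf-8") -> str:
    """Return the value of the Charset armor header, or `default` if absent."""
    lines = armored.splitlines()
    # Armor headers sit between the BEGIN line and the first blank line.
    for line in lines[1:]:
        if not line.strip():
            break
        key, sep, value = line.partition(": ")
        if sep and key == "Charset":
            return value.strip()
    return default


# Hypothetical armor blocks; the payload line is a placeholder, not real data.
tagged = (
    "-----BEGIN PGP MESSAGE-----\n"
    "Version: Example 1.0\n"
    "Charset: iso-8859-15\n"
    "\n"
    "AAAA\n"
    "-----END PGP MESSAGE-----\n"
)
untagged = "-----BEGIN PGP MESSAGE-----\n\nAAAA\n-----END PGP MESSAGE-----\n"

print(armor_charset(tagged))
print(armor_charset(untagged))
```

With the header present the function reports the tagged charset; without it, the utf-8 default applies.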

The only question in this context is if the charset information shouldn't be part of the text packet, i.e., inside the cryptographical envelope. I suppose that would be cleaner; also, it would solve tagging when ASCII armor is not used. Alternatively, get rid of the Charset header and make utf-8 mandatory for text.

But back to clearsigning.

[Note: There are unicode characters below.]

... which show up as question marks, because I'm not (yet ;-) operating in a full utf-8 environment - in fact, most things are just running in an iso-8859-15 locale, where the characters you used just can't be displayed. But that's actually a fine example: Just cutting & pasting what's displayed (which may be needed with some systems) would be impossible in this case, because a lossy encoding had to take place.
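A minimal sketch of that lossy step, in Python. The sample strings are hypothetical (the characters in the quoted message are not known), and `errors="replace"` only stands in for whatever substitution a terminal or MUA actually performs:

```python
# Hypothetical sample text: "café" fits into iso-8859-15, the Greek letters do not.
original = "caf\u00e9 \u03b1\u03b2\u03b3"
other = "caf\u00e9 \u03b4\u03b5\u03b6"


def as_displayed(text: str, charset: str = "iso-8859-15") -> str:
    """Return the text as a display limited to `charset` might show it."""
    # Characters missing from the charset are replaced by "?".
    return text.encode(charset, errors="replace").decode(charset)


shown = as_displayed(original)
print(shown)

# Two different originals collapse to the same displayed text, so the
# displayed text cannot be recoded back to the bytes that were signed.
print(as_displayed(other) == shown)
```

Once the question marks are in place, no recoding can recover the signed bytes, which is why cut & paste from such a display cannot yield a verifiable message.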

??? Your message displays properly under Entourage X on my Mac. Yay! It also displays correctly with, which ships with OSX. It does not display correctly under Eudora for OSX. Boo (?). If I take said message, paste it into BBEdit, it displays correctly. Yay! If I save out said message in UTF8 with DOS line ends (another ? for BBEdit), it verifies correctly with GPG 1.06. Another yay!

That's not so unexpected, is it? After all, no lossy re-encodings were involved, and you ultimately saved the message in its original character set.

I have a mail client and text editor that both display UTF-8 and verify your clearsigned message. It even works in spite of the fact that your message has trailing whitespace on its lines. So this *is* possible to do! Just get a Mac. Failing that, convince some MUA developers to do the right thing on Windows and Linux.

Congratulations. That was the easy part. I suppose we agree that I did things the right way with my message?

In this case, I'd suggest you tell this to vedaal, who used PGP 6.5.8 [ckt] with Outlook, like many, many others do.

In order to verify the signature under the message he sent to the list yesterday, you'll have to second-guess that his computer uses cp-1252 as the local character set. You then have to explicitly recode the text to that character set in order to get the signature verified.

In order to verify a signature I make, I suppose he'd have to re-encode the data as presented to him from cp-1252 to utf-8. (He consistently reported that he could not verify the signatures I made.)
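To illustrate why the recoding matters: a signature covers bytes, not characters, so the same text in two character sets looks like two different messages. The Python sketch below uses a SHA-256 digest as a stand-in for the signature check; the sample word and the helper names `recode` and `digest` are hypothetical:

```python
import hashlib


def recode(data: bytes, source: str, target: str) -> bytes:
    """Reinterpret `data` from charset `source` and re-encode it as `target`."""
    return data.decode(source).encode(target)


def digest(data: bytes) -> str:
    # Stand-in for a signature check: both operate on the raw bytes.
    return hashlib.sha256(data).hexdigest()


text = "Gr\u00fc\u00dfe"                 # hypothetical sample with non-ASCII letters
signed_bytes = text.encode("cp1252")     # what the signer's system hashed
received_bytes = text.encode("utf-8")    # what the verifier's system holds

# Same characters, different bytes: the digests do not match.
print(digest(signed_bytes) == digest(received_bytes))

# After recoding to the signer's charset the bytes, and the digests, agree.
recoded = recode(received_bytes, "utf-8", "cp1252")
print(digest(signed_bytes) == digest(recoded))
```

The recoding only helps when the verifier knows, or correctly guesses, which character set the signer used.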

Another member of this list who uses Lotus Notes 4.5 on NT4 also reported that the signature verification failed, but the message displayed (mostly) correctly. Of course, he's quite close to MUA hell, so in that case I suppose verifying _some_ signatures already counts as a success. (Thanks for your help!)

This exercise I went through proves my point, to my mind. With a clearsigned message, I can see the intended characters as well as verify the message's signature. In short, the system works.

The system works as long as everyone and everything involved uses the same charset, which is the case in our example. I never disputed that. The problem is that the system breaks as soon as _different_ character set worlds are involved.

It breaks "softly" in the Windows case, where things become unusable for the average user, and cumbersome for someone who kind of knows what he is doing.

It breaks the hard way as soon as lossy recodings are involved, since these will ruin signatures.

More to the point, if the message were not clearsigned, if it were MIMEd, I would be unable to easily go to the trouble of verifying the message using a text editor and GPG. I would have had to pry open MIME parts, construct OpenPGP headers,

Eh?  You don't need to construct any OpenPGP headers with PGP/MIME.

Nonetheless, I'd love to hear what you have to say about 8859-1 and 8859-15. I've gone and looked up 8859-15 (which I'd never heard of before), and would like to hear your insights.

iso-8859-15 is the replacement for iso-8859-1 with the Euro sign added (instead of the currency symbol), and with some characters such as "LATIN {CAPITAL,SMALL} LETTER S WITH CARON", "LATIN {CAPITAL,SMALL} LETTER Z WITH CARON" added instead of "ACUTE ACCENT", "CEDILLA", "BROKEN BAR", "DIAERESIS". (I think there are some more changes, but these are the ones I immediately recall.)

Obviously, the transformation between the two can't be guaranteed to be loss-free. That is, as soon as you (have to) recode between these, a signature may be broken.
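The differences can be listed mechanically. The following Python sketch relies only on the standard `iso-8859-1` and `iso-8859-15` codecs; the helper name `survives` is made up for the illustration:

```python
# Decode every possible byte value under both character sets.
all_bytes = bytes(range(256))
latin1 = all_bytes.decode("iso-8859-1")
latin9 = all_bytes.decode("iso-8859-15")

# Byte positions where the two character sets assign different characters.
differing = [b for b in range(256) if latin1[b] != latin9[b]]
for b in differing:
    print(f"0x{b:02X}: U+{ord(latin1[b]):04X} -> U+{ord(latin9[b]):04X}")


def survives(text: str, charset: str) -> bool:
    """True if `text` can be encoded in `charset` without loss."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


# The Euro sign has no place in iso-8859-1, and the currency sign
# it displaced has no place in iso-8859-15.
print(survives("\u20ac", "iso-8859-1"))
print(survives("\u00a4", "iso-8859-15"))
```

Any text that uses one of the differing positions cannot be carried over to the other character set unchanged, so a signature over it will not survive the recoding.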

To wrap things up:

- ASCII armor proper can be fixed by giving a clear specification of the character set issues involved: Either mandate UTF-8, or mandate tagging and use UTF-8 as the default. The current language is considerably too fuzzy, and - I believe - mostly ignored.

- With clearsigning, the "soft" failures (I suppose these are the more common ones) can be avoided by either mandating that the signed material is to be recoded from the local representation (which must be known out-of-band) to utf-8 before signing and verifying, or by mandating that the character set used when generating the signature is indicated in an appropriate tag.

The problem with tagging is that implementations are encouraged to use proprietary character sets. Probably, a note about being conservative about the character sets used should be added.

The problem with getting anything implemented is that NAI does not support PGP any more.

Finally, hard failures of clearsigning: You can only avoid these by making sure that no lossy recoding happens as the data travels from signer to verifier. Encouraging people to use utf-8 on the wire (so there is at least no lossy recoding on the sending side) may help, but you won't get rid of all the problems that way.

Note that both kinds of clearsigning failures don't occur with PGP/MIME: The signed material is invariant under the transformations which can reasonably be expected to happen.
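The two options for avoiding the soft failures listed above can be sketched in Python as follows. This is an illustration under stated assumptions, not an implementation of either proposal: a SHA-256 digest stands in for the signature, the function names and sample word are hypothetical, and the handling of line endings and trailing whitespace that clearsigning also requires is not modeled.

```python
import hashlib


def canonical_digest(local_bytes: bytes, local_charset: str) -> str:
    """Option 1: recode from the local charset to UTF-8, then hash."""
    canonical = local_bytes.decode(local_charset).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def tagged_digest(local_bytes: bytes, local_charset: str, tagged_charset: str) -> str:
    """Option 2: recode from the local charset to the charset named in the tag."""
    recoded = local_bytes.decode(local_charset).encode(tagged_charset)
    return hashlib.sha256(recoded).hexdigest()


text = "Gr\u00fc\u00dfe"  # hypothetical sample with non-ASCII letters

# Option 1: signer works in utf-8, verifier in cp1252; both hash UTF-8 bytes.
signer_side = canonical_digest(text.encode("utf-8"), "utf-8")
verifier_side = canonical_digest(text.encode("cp1252"), "cp1252")
print(signer_side == verifier_side)

# Option 2: signer hashed its cp1252 bytes and tagged the message "cp1252";
# a verifier working in utf-8 recodes to the tagged charset before hashing.
signed = hashlib.sha256(text.encode("cp1252")).hexdigest()
checked = tagged_digest(text.encode("utf-8"), "utf-8", "cp1252")
print(signed == checked)
```

Both variants still depend on the local character set being known out-of-band, and neither helps once a lossy recoding has happened in transit.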

Thomas Roessler                        <roessler(_at_)does-not-exist(_dot_)org>
