Re: Clearsigning, MIME, etc.
2002-04-17 02:35:11
On 2002-04-16 18:05:51 -0700, Jon Callas wrote:
Clearsigning is not armoring. They are different. If I hear
armoring, and you talk about clearsigning, we're going to talk at
cross purposes. I understand your comments on clearsigning, but
let's take them one step at a time.
I'm sorry for the confusion. With armoring proper (everything
except clearsigning) the only problem is that an implementation
needs to know, somehow, what character set the cleartext is in.
That can easily be solved by making the Charset header mandatory,
and specifying that text without a charset header is to be assumed
utf-8 under all circumstances. I'll cover that in another message.
The only question in this context is if the charset information
shouldn't be part of the text packet, i.e., inside the
cryptographical envelope. I suppose that would be cleaner; also, it
would solve tagging when ASCII armor is not used. Alternatively,
get rid of the Charset header and make utf-8 mandatory for text.
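As a rough illustration of the tagging rule proposed above (honor a Charset header, assume utf-8 when there is none), here is a minimal Python sketch. The function names and the list-of-strings representation of the armor headers are assumptions made for this example; they are not taken from any OpenPGP implementation.

```python
def armor_charset(header_lines):
    """Return the charset named by a 'Charset:' armor header, else utf-8.

    header_lines is a hypothetical list of 'Key: Value' strings.
    """
    for line in header_lines:
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "charset":
            return value.strip().lower()
    # Proposed default: untagged text is assumed to be utf-8.
    return "utf-8"


def decode_cleartext(header_lines, payload):
    """Decode the payload bytes using the tagged (or default) charset."""
    return payload.decode(armor_charset(header_lines))


tagged = ["Version: example", "Charset: iso-8859-15"]
untagged = ["Version: example"]

print(armor_charset(tagged))      # charset taken from the header
print(armor_charset(untagged))    # falls back to utf-8
print(decode_cleartext(tagged, b"\xa4"))  # byte 0xA4 read as iso-8859-15
```

Moving the charset inside the text packet would change where the tag is read from, but not this lookup-with-default logic.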
But back to clearsigning.
[Note: There are unicode characters below.]
... which show up as question marks, because I'm not (yet ;-)
operating in a full utf-8 environment - in fact, most things are
just running in an iso-8859-15 locale, where the characters you used
just can't be displayed. But that's actually a fine example: Just
cutting & pasting what's displayed (which may be needed with some
systems) would be impossible in this case, because a lossy encoding
had to take place.
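The lossy step can be reproduced in a few lines of Python. This is only a sketch: the sample string is an assumption chosen for illustration (the actual characters in the message are not known here), and a real display layer may substitute differently.

```python
# Illustrative sample text with two symbols that iso-8859-15 cannot represent.
original = "umbrella \u2602 and snowman \u2603"

# Encode for an iso-8859-15 terminal, replacing unrepresentable characters,
# then decode again to get what a cut & paste from the screen would yield.
displayed = original.encode("iso-8859-15", errors="replace").decode("iso-8859-15")

print(displayed)              # both symbols have collapsed to "?"
print(displayed == original)  # False: the original cannot be recovered
```

Two different characters end up as the same question mark, so nothing pasted from the display can reproduce the signed bytes.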
??? Your message displays properly under Entourage X on my Mac.
Yay! It also displays correctly with Mail.app, which ships with
OSX. It does not display correctly under Eudora for OSX. Boo (?).
If I take said message, paste it into BBEdit, it displays
correctly. Yay! If I save out said message in UTF8 with DOS line
ends (another ? for BBEdit), it verifies correctly with GPG 1.06.
Another yay!
That's not so unexpected, is it? After all, no lossy
re-encodings were involved, and you ultimately saved the message in
its original character set.
I have a mail client and text editor that both display UTF-8 and
verify your clearsigned message. It even works in spite of the
fact that your message has trailing whitespace on its lines. So
this *is* possible to do! Just get a Mac. Failing that, convince
some MUA developers to do the right thing on Windows and Linux.
Congratulations. That was the easy part. I suppose we agree that I
did things the right way with my message?
In this case, I'd suggest you tell this to vedaal who used PGP 6.5.8
[ckt] with outlook like many, many others do.
In order to verify the signature under the message he sent to the
list yesterday, you'll have to second-guess that his computer uses
cp-1252 as the local character set. You then have to explicitly
recode the text to that character set in order to get the signature
verified.
In order to verify a signature I make, I suppose he'd have to
re-encode the data as presented to him from cp-1252 to utf-8. (He
consistently reported that he could not verify the signatures I
made.)
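The reason recoding matters is that the signature covers bytes, not characters. The following Python sketch uses a plain SHA-256 digest as a stand-in for the hash an OpenPGP implementation would sign; it skips OpenPGP's own canonicalization, and the sample text and helper name are assumptions for illustration.

```python
import hashlib


def digest(text, charset):
    """Hash the bytes that result from encoding text in the given charset."""
    return hashlib.sha256(text.encode(charset)).hexdigest()


message = "Gr\u00fc\u00dfe, 5 \u20ac"  # hypothetical signed text

signed_as = digest(message, "cp1252")  # what the signer's software hashed
seen_as = digest(message, "utf-8")     # what a utf-8 verifier would hash
print(signed_as == seen_as)            # False: same characters, different bytes

# Recoding the received bytes to the signer's charset restores the match.
received = message.encode("utf-8")
recoded = received.decode("utf-8").encode("cp1252")
print(hashlib.sha256(recoded).hexdigest() == signed_as)  # True
```

The verifier can only do the last step after guessing the signer's charset, which is exactly the second-guessing described above.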
Another member of this list who uses Lotus Notes 4.5 on NT4 also
reported that the signature verification failed, but the message
displayed (mostly) correctly. Of course, he's quite close to MUA
hell, so in that case I suppose verifying _some_ signatures already
counts as a success. (Thanks for your help!)
This exercise I went through proves my point, to my mind. With a
clearsigned message, I can see the intended characters as well as
verify the message's signature. In short, the system works.
The system works as long as everyone and everything involved uses
the same charset, which is the case in our example. I never
disputed that.
The problem is that the system breaks as soon as _different_
character set worlds are involved.
It breaks "softly" in the Windows case, where things become unusable
for the average user, and cumbersome for someone who kind of knows
what he is doing.
It breaks the hard way as soon as lossy recodings are involved,
since these will ruin signatures.
More to the point, if the message were not clearsigned, if it were
MIMEd, I would be unable to easily go to the trouble of verifying
the message using a text editor and GPG. I would have had to pry
open MIME parts,
Yes.
construct OpenPGP headers,
Eh? You don't need to construct any OpenPGP headers with PGP/MIME.
Nonetheless, I'd love to hear what you have to say about 8859-1
and 8859-15. I've gone and looked up 8859-15 (which I'd never
heard of before), and would like to hear your insights.
iso-8859-15 is the replacement for iso-8859-1 with the Euro sign
added (instead of the currency symbol), and with some characters
such as "LATIN {CAPITAL,SMALL} LETTER S WITH CARON", "LATIN
{CAPITAL,SMALL} LETTER Z WITH CARON" added instead of "ACUTE
ACCENT", "CEDILLA", "BROKEN BAR", "DIAERESIS". (I think there are
some more changes, but these are the ones I immediately recall.)
Obviously, the transformation between the two can't be guaranteed to
be loss-free. That is, as soon as you (have to) recode between
these, a signature may be broken.
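The exact set of differences can be listed with Python's standard codecs. This sketch relies on Python's own mapping tables for the two character sets. Besides the four replacements named above, it also reports the three vulgar fractions, which gave way to the OE ligatures and Y WITH DIAERESIS.

```python
# Find every byte value that decodes differently in the two character sets.
changed = [
    b for b in range(256)
    if bytes([b]).decode("iso-8859-1") != bytes([b]).decode("iso-8859-15")
]

for b in changed:
    old = bytes([b]).decode("iso-8859-1")
    new = bytes([b]).decode("iso-8859-15")
    print(f"0x{b:02X}: {old!r} -> {new!r}")

# Recoding a character that exists only in iso-8859-15 to iso-8859-1 is lossy:
# the Euro sign has no slot there and is replaced.
lossy = "\u20ac".encode("iso-8859-1", errors="replace")
print(lossy)
```

Any signed text that contains one of these eight characters cannot survive a round trip through the other character set.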
To wrap things up:
- ASCII armor proper can be fixed by giving a clear specification of
the character set issues involved: Either mandate UTF-8, or
mandate tagging and use UTF-8 as the default. The current
language is considerably too fuzzy, and - I believe - mostly
ignored.
- With clearsigning, the "soft" failures (I suppose these are the
more common ones) can be avoided by either mandating that the
signed material is to be recoded from the local representation
(which must be known out-of-band) to utf-8 before signing and
verifying, or by mandating that the character set used when
generating the signature is indicated in an appropriate tag.
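The first clearsigning option (recode to utf-8 before signing and verifying) can be sketched as follows. Again SHA-256 stands in for the signature hash, and the function name and sample text are hypothetical; how each side learns its local charset is left out, since that has to come from out-of-band knowledge or a tag.

```python
import hashlib


def canonical_digest(local_bytes, local_charset):
    """Recode local bytes to utf-8, then hash the utf-8 form."""
    text = local_bytes.decode(local_charset)
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


message = "Gr\u00fc\u00dfe, 5 \u20ac"  # hypothetical signed text

# The signer works in utf-8, the verifier sees the text in cp1252.
signer_side = canonical_digest(message.encode("utf-8"), "utf-8")
verifier_side = canonical_digest(message.encode("cp1252"), "cp1252")

print(signer_side == verifier_side)  # True: both hash the same utf-8 bytes
```

This only removes the "soft" failures. If a character was already lost on the way, decoding the damaged bytes yields different text and the digests differ again.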
The problem with tagging is that implementations are encouraged to
use proprietary character sets. Probably, a note about being
conservative about the character sets used should be added.
The problem with getting anything implemented is that NAI does not
support PGP any more.
Finally, hard failures of clearsigning: You can only avoid these by
making sure that no lossy recoding happens as the data travels from
signer to verifier. Encouraging people to use utf-8 on the wire (so
there is at least no lossy recoding on the sending side) may help,
but you won't get rid of all the problems that way.
Note that both kinds of clearsigning failures don't occur with
PGP/MIME: The signed material is invariant under the transformations
which can reasonably be expected to happen.
--
Thomas Roessler <roessler(_at_)does-not-exist(_dot_)org>