
Clearsigning considered harmful.

2002-04-16 06:44:49


 From draft-04, section 6.2:

"Charset", a description of the character set that the plaintext is 
in. Please note that OpenPGP defines text to be in UTF-8 by default. 
An implementation will get best results by translating into and out 
of UTF-8. However, there are many instances where this is easier 
said than done. Also, there are communities of users who have no 
need for UTF-8 because they are all happy with a character set like 
ISO Latin-5 or a Japanese character set. In such instances, an 
implementation MAY override the UTF-8 default by using this header 
key. An implementation MAY implement this key and any translations 
it cares to; an implementation MAY ignore it and assume all text is
UTF-8.

I believe that this part of the specification is a recipe for
interoperability disasters of the most interesting kind.

First of all, the specification makes no guarantee about the kind of
character set used for generating a cleartext signature.  This
character set MAY be indicated by the Charset header, but it doesn't
have to be.  Implementations MAY also ignore the header.

Pretending that this header is not there is not a problem as long as 
the cleartext's character set is not changed in any way between  
signature creation and verification.

However, this assumption won't hold in reality.  Different systems
use different character sets; in order to properly display an
e-mail message, you will have to recode it.  Just think about the
different ways the Euro symbol has been added to character sets, or
think about using UTF-8 for messages.
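
As a small illustration of the Euro symbol point, the following Python
sketch prints the bytes that represent the same character in three
character sets.  The choice of cp1252 as a third example is mine and is
not taken from the draft.

```python
# The Euro sign, U+20AC, as a Unicode string.
euro = "\u20ac"

# Three character sets that all contain the Euro sign, each at a
# different byte position (cp1252 is an illustrative choice).
charsets = ["iso-8859-15", "cp1252", "utf-8"]

# Map each character set name to the bytes it uses for the Euro sign.
euro_bytes = {name: euro.encode(name) for name in charsets}

for name, raw in euro_bytes.items():
    print(f"{name:12} {raw.hex()}")
```

Any recoding between these character sets therefore changes the bytes
of a message that contains the symbol.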

All this is getting particularly bad in the context of cleartext  
signatures: In this case, people will frequently use cut & paste in  
order to pass the signature and signed material to the verification  
service.  In order to correctly verify a signature, the verification 
service would have to re-encode it, and hope that the original  
signed material is restored.  In order to do this, implementations  
need to know what encoding was used when the original signature was  
generated.  I.e., they will have to generate _and_ respect a  
Charset header.
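
The effect can be sketched in a few lines of Python.  This is a
deliberately simplified model: it hashes only the text bytes with
SHA-1 and leaves out canonicalization and the rest of the OpenPGP
signature computation.  The sample text and all variable names are
hypothetical.

```python
import hashlib

# Hypothetical signed text containing non-ASCII characters.
text = "Gr\u00fc\u00dfe aus K\u00f6ln"

# Bytes the signer hashed, assuming the signer used ISO-8859-1.
signed_bytes = text.encode("iso-8859-1")

# Bytes the verifier receives after the text was recoded to UTF-8
# for display and then passed on by cut & paste.
pasted_bytes = text.encode("utf-8")

digest_signed = hashlib.sha1(signed_bytes).hexdigest()
digest_pasted = hashlib.sha1(pasted_bytes).hexdigest()

# The digests differ, so a naive verification fails.
mismatch = digest_signed != digest_pasted

# If the verifier knows the original charset (e.g. from a Charset
# header), it can recode back before hashing.
restored = pasted_bytes.decode("utf-8").encode("iso-8859-1")
restored_ok = hashlib.sha1(restored).hexdigest() == digest_signed

print(mismatch, restored_ok)
```

The recovery step only works because the verifier is told which
character set the signer used; without that information it can only
guess.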

(Note, however, that re-encoding in a way that assures correct
signature verification may not be possible if the encoding for
display purposes was lossy.  For instance, you cannot recode
losslessly between iso-8859-1 and iso-8859-15 in all cases.  Both
character sets are being used in the European Union, with, it seems,
iso-8859-15 gaining momentum.)
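
A minimal Python sketch of the iso-8859-1 / iso-8859-15 problem
follows.  The helper can_encode is a hypothetical name introduced for
this illustration only.

```python
def can_encode(ch, charset):
    """Return True if the character ch is representable in charset."""
    try:
        ch.encode(charset)
        return True
    except UnicodeEncodeError:
        return False

euro = "\u20ac"      # Euro sign, present in iso-8859-15 only
currency = "\u00a4"  # generic currency sign, present in iso-8859-1 only

# Byte 0xA4 means the Euro sign in iso-8859-15 ...
euro_latin9 = euro.encode("iso-8859-15")

# ... but the same byte read as iso-8859-1 is the currency sign.
misread = euro_latin9.decode("iso-8859-1")

# Neither character can be carried over into the other character set,
# so recoding in either direction has to drop or substitute it.
euro_in_latin1 = can_encode(euro, "iso-8859-1")
currency_in_latin9 = can_encode(currency, "iso-8859-15")

print(euro_latin9, misread == currency, euro_in_latin1, currency_in_latin9)
```

Once a character has been dropped or substituted for display, the
verifier cannot reconstruct the bytes that were originally signed.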

My suggestion is that we either introduce mandatory rules for the
character set issues, or add clear language which indicates that
cleartext signatures pose considerable interoperability risks
when used outside closed environments with homogeneous software  

-- 
Thomas Roessler                        <roessler(_at_)does-not-exist(_dot_)org>