RE: Literal packets and canonicalization


Thank you for the answer, David. If, as the RFC states, we canonicalize
the data before storing it in the literal packet, then the
implementation is tampering with the file before performing the
operation, say encryption. When I use GPG to encrypt and decrypt a text
file, the checksums of the source text file and the decrypted file are
the same. So is the file not being canonicalized prior to encryption?

-----Original Message-----
From: owner-ietf-openpgp(_at_)mail(_dot_)imc(_dot_)org
[mailto:owner-ietf-openpgp(_at_)mail(_dot_)imc(_dot_)org] On Behalf Of David Shaw
Sent: Thursday, May 06, 2004 3:56 PM
To: ietf-openpgp(_at_)imc(_dot_)org
Subject: Re: Literal packets and canonicalization


On Thu, May 06, 2004 at 03:10:17PM -0400, Hasnain Mujtaba wrote:


Hello,

I was reading section "5.9. Literal Data Packet" of RFC2440 and I had a
question: What are the consequences of not canonicalizing text data
before storing it in a literal packet and using the literal packet to
form either an encrypted packet or signature packet?


The file should decrypt properly, and (at least in PGP and GnuPG)
signatures should verify properly regardless of the canonicalization.

What if the sender marks all literal data as binary 'b', even if the
literal data is text?


The bad thing that will happen is that recipients on platforms that
have a different text line ending convention than the sender will see
somewhat mangled text in the output.

For example: Macs generally end lines with CR.  Unix machines
generally end lines with LF.  Sending data from one to the other
without the benefit of canonicalization results in one very long
"line" with occasional CRs or LFs in there.  Some text editor/viewer
programs use heuristics to detect and fix this problem, but it's
generally better to canonicalize, which lets the OpenPGP program handle
it automatically.
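
As a rough illustration of the line-ending handling described above,
here is a minimal Python sketch. It is not how PGP or GnuPG implement
it; the function names canonicalize and to_native are hypothetical, and
it assumes canonical text form simply means CRLF line endings.

```python
import re

def canonicalize(text: bytes) -> bytes:
    """Rewrite every CRLF, lone CR, or lone LF line ending as CRLF."""
    # CRLF is listed first so an existing CRLF is matched as one unit
    # and is not expanded into two line endings.
    return re.sub(rb"\r\n|\r|\n", b"\r\n", text)

def to_native(canonical: bytes, eol: bytes) -> bytes:
    """Convert CRLF-canonical text to the recipient's line ending."""
    return canonical.replace(b"\r\n", eol)

# Sender uses CR line endings, recipient uses LF.
sender_text = b"line one\rline two\r"
on_the_wire = canonicalize(sender_text)
recipient_text = to_native(on_the_wire, b"\n")
```

With the conversion on both ends, the recipient gets text in the local
convention. Without it, the recipient sees the sender's raw CRs.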

David