Re: Let's resolve the end-of-line and whitespace question

1) 2440 says of the canonical text signature (sigclass 0x01):

        The signature is calculated over the text data with its line
        endings converted to <CR><LF> and trailing blanks removed.

   This is different than what every version of PGP through 8 does.
   These implementations do the <CR><LF> line endings, but do not
   remove trailing blanks (essentially PGP 2.x behavior).

I'm happy to consider this a bug, myself, if consensus says that's what
the behavior should be. This is probably a case where there's code in
there that was written by either Derek or Peter Gutmann and no one's
changed it since then. :-)
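
To make the difference in point 1 concrete, here is a minimal Python
sketch of the two canonicalizations being compared. The function names
are hypothetical, and treating "trailing blanks" as spaces and tabs is
an assumption; the quoted 2440 sentence does not spell that out.

```python
import re

# Matches CRLF, bare CR, or bare LF, so mixed input is handled.
_LINE_END = re.compile(rb"\r\n|\r|\n")

def canon_text_rfc2440(data: bytes, blanks: bytes = b" \t") -> bytes:
    """CRLF line endings, trailing blanks removed (as 2440 is quoted).

    The set of bytes counted as "blanks" is an assumption.
    """
    lines = _LINE_END.split(data)
    return b"\r\n".join(line.rstrip(blanks) for line in lines)

def canon_text_historical(data: bytes) -> bytes:
    """CRLF line endings only; trailing blanks are kept as-is."""
    return b"\r\n".join(_LINE_END.split(data))
```

The two functions return the same bytes for any input without trailing
blanks, so a signature only fails to verify across implementations when
some line actually ends in a space or tab.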

2) 2440 says of the cleartext signature:

        Also, any trailing whitespace (spaces, and tabs, 0x09) at the
        end of any line is ignored when the cleartext signature is
        calculated.

   Again, PGP through 8 implements this differently than 2440 says,
   where trailing spaces are removed, but trailing tabs are not
   (again, PGP 2.x behavior).
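
A small Python sketch of the cleartext difference in point 2, applied
per line. The function names are hypothetical, and SHA-256 only stands
in for whatever hash the signature actually uses.

```python
import hashlib

def strip_rfc2440(line: bytes) -> bytes:
    """Ignore trailing spaces and tabs (0x09), as 2440 is quoted."""
    return line.rstrip(b" \t")

def strip_historical(line: bytes) -> bytes:
    """Remove trailing spaces only; a trailing tab stops the trimming."""
    return line.rstrip(b" ")

def digests_agree(line: bytes) -> bool:
    """True if both rules feed the same bytes to the hash for this line."""
    a = hashlib.sha256(strip_rfc2440(line)).digest()
    b = hashlib.sha256(strip_historical(line)).digest()
    return a == b
```

Under these two rules a line ending only in spaces hashes identically,
while any line whose trailing whitespace contains a tab does not.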

I've seen comments that these details were inadvertent errors in 2440
that would have or should have been fixed, and requests to the WG to
change these two details in 2440bis to match the historical PGP
behavior (after all, PGP 5 predates 2440 and there is a huge installed
base of PGP 5-8).  I've also seen comments that the WG mustn't change
the published standard this many years after the fact to match
behavior already declared noncompliant.


No, this is not an inadvertent error. It was an intentional change.

PGP 5 did predate 2440 and there are a number of small changes between
its behavior and what's in 2440. Moreover, this version is now seven
years old and has not been supported in so long that I can't even
remember how long it has been unsupported. It is not 2440-compliant,
can't be because it predated 2440, is not supported by its producer,
and really isn't germane to any discussion of 2440. Please stop
bringing it up. It's dead. Thank you, I feel much better now.

The 2440 change in text signatures (adding in whitespace trimming) was
one of a number of small things there that were debated as to what the
right thing should be, rather than what went before. There are many
good reasons for removing trailing whitespace at the end of anything
that's text mode. It's the sort of thing that gets mangled easily and
undetectably, as well as being a covert channel. (I come from an era in
which it was common practice for text editors to trim trailing
whitespace when saving a file, and consider it a feature rather than a
bug.)

However, I'd be perfectly happy to settle it once and for all by saying
only normalize line ends, even. We can just not worry about the
whitespace. In short, if the consensus here is that however
well-meaning that change was, it was a bad idea, it's easy to fix. I
can see that Unicode issues might turn this into a swamp in ways that
just trimming spaces and tabs isn't.

Jon Callas pointed out in...

I'll repeat a broader opinion I have on this. I think that text mode
should be an assurance that the signed data is in a particular format,
so that it can be handled accurately, not a commentary on how to
canonicalize.

The reasons I think it's a good idea that it be an assurance include:

* If it's just an assurance, then you verify a signature the same way
  for text mode and binary mode. This is better for the code in a lot of
  ways.

* If it's just an assurance, and the assurance is wrong (because of
  bugs or incompatibilities), then you end up with a good signature over
  data that isn't exactly in the format you wanted. This is a much better
  failure mode than getting good data, but not knowing how to verify the
  data. It's a fail-soft situation.
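
A rough Python sketch of the "assurance" model described in the two
points above. Everything here is illustrative: the function names are
made up, an HMAC stands in for a real public-key signature, and "the
format" is assumed to mean nothing more than CRLF line endings.

```python
import hashlib
import hmac
import re

# A bare LF (not preceded by CR) or a bare CR (not followed by LF).
_BARE_LINE_END = re.compile(rb"(?<!\r)\n|\r(?!\n)")

def verify(data: bytes, tag: bytes, key: bytes) -> bool:
    """Check the tag over the bytes exactly as received.

    The same path serves text mode and binary mode; nothing is
    canonicalized before hashing. HMAC is a stand-in for a signature.
    """
    expected = hmac.new(key, data, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)

def looks_canonical(data: bytes) -> bool:
    """True if every line ending is CRLF (the assumed text format)."""
    return _BARE_LINE_END.search(data) is None

def check(data: bytes, tag: bytes, key: bytes, text_mode: bool):
    """Return (signature_ok, format_ok) as two separate answers."""
    signature_ok = verify(data, tag, key)
    format_ok = (not text_mode) or looks_canonical(data)
    return signature_ok, format_ok
```

If the signer's assurance was wrong, check() reports a good signature
together with a format problem, which is the fail-soft outcome: the
data is still verifiable, and the caller decides what to do about the
format.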

Jon