ietf-openpgp

Re: [ISSUE] UTF-8 CRLF

2004-10-26 14:03:29

* David Shaw:

> The last sentence of section 5.9 reads:
>
>   Text data is stored with <CR><LF> text endings (i.e. network-normal
>   line endings).  These should be converted to native line endings by
>   the receiving software.
>
> Suggest to add:
>
>   For the 'u' UTF8 literal packet, the minimal UTF8 encoding for the
>   <CR><LF> line endings SHOULD be used.  That is, 0x0D 0x0A and not
>   0xC0 0x8D 0xC0 0x8A or other multibyte encodings.

The overlong sequence 0xC0 0x8D 0xC0 0x8A isn't valid UTF-8.  A UTF-8
implementation MUST NOT decode these octets, but MUST flag an error.
The most recent UTF-8 RFC (RFC 3629) is quite explicit in this regard.
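
As a rough illustration of that point, here is a minimal sketch in
Python.  It assumes Python 3's built-in strict UTF-8 codec as a
stand-in for a conforming decoder.  The helper name is_strict_utf8 and
the two constants are made up for this example and appear in no draft.

```python
def is_strict_utf8(octets: bytes) -> bool:
    """Return True only if the octets decode under the strict UTF-8 codec."""
    try:
        octets.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        # Overlong forms such as 0xC0 0x8D are rejected here.
        return False
    return True


# Hypothetical names, used only for this illustration.
MINIMAL_CRLF = bytes([0x0D, 0x0A])               # shortest encoding of <CR><LF>
OVERLONG_CRLF = bytes([0xC0, 0x8D, 0xC0, 0x8A])  # overlong two-octet forms

print(is_strict_utf8(MINIMAL_CRLF))   # True
print(is_strict_utf8(OVERLONG_CRLF))  # False
```

A decoder that behaves this way never produces <CR><LF> from the
overlong octets, so the proposed SHOULD would only be restating what
UTF-8 already requires.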

The UTF-8 issue I mentioned previously arises because Unicode has
additional characters with line-ending semantics.  There used to be a
Unicode Technical Report on this topic, but it has been superseded by
section 5.8 in Unicode 4.0:

  <http://www.unicode.org/versions/Unicode4.0.0/ch05.pdf>
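
To make the concern concrete, here is a small sketch in Python.  It
assumes that NEL (U+0085), LS (U+2028) and PS (U+2029) are among the
characters in question.  The names EXTRA_LINE_BREAKS and utf8_hex are
invented for this example.  Python's str.splitlines() recognizes a
somewhat larger set of separators than those three, so treat the
comparison as illustrative only.

```python
# Assumed examples of Unicode characters with line-ending semantics.
EXTRA_LINE_BREAKS = {
    "NEL": "\u0085",  # NEXT LINE
    "LS": "\u2028",   # LINE SEPARATOR
    "PS": "\u2029",   # PARAGRAPH SEPARATOR
}


def utf8_hex(ch: str) -> str:
    """Return the UTF-8 octets of ch as space-separated hex."""
    return " ".join(f"{b:02X}" for b in ch.encode("utf-8"))


for name, ch in EXTRA_LINE_BREAKS.items():
    print(name, utf8_hex(ch))

# A text with one <CR><LF> and two of the additional line breaks.
sample = "one\u2028two\u0085three\r\nfour"

# Splitting the UTF-8 octets on <CR><LF> only sees a single line ending.
byte_level = sample.encode("utf-8").split(b"\r\n")

# Splitting at the character level also honours LS and NEL.
unicode_level = sample.splitlines()

print(len(byte_level))     # 2
print(len(unicode_level))  # 4
```

Software that converts only 0x0D 0x0A to native line endings would
leave the other separators untouched, and that is the gap I mean.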

