ietf-dkim

[ietf-dkim] ISSUE: non-ascii header text

2011-04-13 01:35:10
Oops, this is a separate issue.  But I hope it's also not
contentious.

3.5, d= and i= tags: references to RFC3490 should be RFC5890. The
reference to ToASCII() should go; instead, both places should say that
IDNs are represented as A-labels.

Suggested new language under d= on page 22:

Change: Internationalized domain names MUST be encoded as described in
      [RFC3490].

To:  Internationalized domain names MUST be represented as A-labels
     as described in [RFC5890].

Suggested new language under i= on page 23:

Change: Internationalized domain names MUST be converted using the steps
      listed in Section 4 of [RFC3490] using the "ToASCII" function.

To:  Internationalized domain names MUST be represented as A-labels
     as described in [RFC5890].
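
For illustration, here is roughly what "represented as A-labels" means for
a d= value. This is only a sketch: the function name and the example
domain are made up, and Python's built-in "idna" codec implements the
older RFC3490 rules rather than RFC5890. The two agree for this
particular name, but they can differ for other inputs.

```python
# Sketch only: turn a Unicode domain into its A-label (xn--) form.
# Assumption: Python's built-in "idna" codec, which follows IDNA2003
# (RFC 3490), not RFC 5890; the results match for this example name.

def to_a_labels(domain: str) -> str:
    """Return the domain with each non-ASCII label in xn-- form."""
    return domain.encode("idna").decode("ascii")


# Hypothetical signing domain, not taken from any real message.
example = to_a_labels("bücher.example")
print(example)  # xn--bcher-kva.example
```

A label that is already ASCII passes through unchanged, so the same call
can be applied to every d= value.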

3.5, z= tag, remove paragraph "Header fields with characters requiring
 conversion (perhaps from legacy MTAs that are not [RFC5322] compliant)
 SHOULD be converted as described in MIME Part Three [RFC2047]."

DKIM only applies to RFC5322-compliant messages; RFC2047 does not
provide conversions for all of the fields that can be copied in a z=
tag; and as soon as the EAI RFCs come out, which is likely to be
soon, this advice will be wrong anyway.


Free extra bonus confusion: the EAI WG is working on 5322 extensions
that basically allow UTF-8 anywhere in messages handled by EAI-aware
mail software, specifically including anywhere in an e-mail address.
(That is, the domain part does not have to be an A-label.)  I think it
is reasonable to assume that a DKIM-Signature in an EAI message can
include UTF-8 characters in any data field where it makes sense, e.g.,
d=, i=, s=, z=.  DKIM-quoted-printable doesn't need to change since
there's no need to quote any non-ASCII UTF-8 characters.

The only thing that's not obvious to me is whether the hash functions
should hash the bytes of the UTF-8, or convert them to wide characters
(UTF-16 or UTF-32 code units, say) and hash those.  Depending on the way
the MTA is written, either might seem more "natural", but I'm inclined
to say you hash the UTF-8 bytes because the SHA-1 and SHA-256 hash
functions are defined on bytes, not wider things.
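
To make the difference concrete, here is a small sketch that hashes the
same text both ways. The d= value is hypothetical, and UTF-16LE stands in
for "wide characters" purely as an assumption; a real MTA might use some
other internal form.

```python
import hashlib

# Hypothetical d= value as it might appear in an EAI message.
signing_domain = "bücher.example"

# Option 1: hash the UTF-8 bytes exactly as they appear on the wire.
utf8_bytes = signing_domain.encode("utf-8")

# Option 2: hash a wide-character form (UTF-16LE assumed here).
wide_bytes = signing_domain.encode("utf-16-le")

digest_utf8 = hashlib.sha256(utf8_bytes).hexdigest()
digest_wide = hashlib.sha256(wide_bytes).hexdigest()

print(len(utf8_bytes), len(wide_bytes))  # 15 28
print(digest_utf8 == digest_wide)        # False
```

The two digests differ, so signer and verifier have to agree on one
form. Hashing the UTF-8 bytes needs no extra conversion step and no
byte-order convention.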

R's,
John

_______________________________________________
NOTE WELL: This list operates according to 
http://mipassoc.org/dkim/ietf-list-rules.html