I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART,
see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
Please resolve these comments along with any other Last Call comments you may
receive.
Reviewer: David L. Black
Review Date: January 10, 2012
IETF LC End Date: January 18, 2012
IESG Telechat Date: January 19, 2012
Summary: This draft is on the right track but has open issues, described in the
review.
This draft specifies a method for redacting information from email abuse reports
(e.g., hiding the local part [user] of an email address), while still allowing
correlation of the redacted information across related abuse reports from the
same source. The draft is short, clear, and well written.
There are two open issues:
 The first open issue is the absence of security guidance to ensure that this
redaction technique effectively hides the redacted information. The redaction
technique is to concatenate a secret string (called the "redaction key") to the
information to be redacted, apply "any hashing/digest algorithm", convert the
output to base64 and use that base64 string to replace the redacted information.
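
As a minimal sketch of the technique as I read it: the function name, the
choice of SHA-256, and the key-then-value concatenation order are my
assumptions for illustration, since the draft permits any hashing/digest
algorithm.

```python
import base64
import hashlib


def redact(value: str, redaction_key: str) -> str:
    """Replace `value` with base64(hash(redaction_key + value)).

    Assumptions: SHA-256 as the digest and key-first concatenation.
    """
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


# The same input and key always give the same token, which is what
# allows correlation across related reports.
token_a = redact("user", "potatoes")
token_b = redact("user", "potatoes")
```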
There are two important ways in which this technique could fail to effectively
hide the redacted information:
- The secret string may inject insufficient entropy.
- The hashing/digest algorithm may be weak.
To take an extreme example, if the secret string ("redaction key") consists of a
single ASCII character, and a short email local part is being redacted, then the
output is highly vulnerable to dictionary and brute force attacks because only
a few bits of entropy are added (the result may look secure, but it's not).
Beyond this example, this is a potentially real concern - e.g., applying the
rule of thumb that ASCII text contains 4-5 bits of entropy per character, the
example in Appendix A
uses a "redaction key" of "potatoes" that injects at most 40 bits of entropy -
is that sufficient for email redaction purposes?
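
To make the extreme example concrete, here is a rough sketch of the attack.
The `redact` helper, the SHA-256 digest, and the tiny candidate list are all
my own assumptions for illustration, not anything taken from the draft.

```python
import base64
import hashlib
import string


def redact(value: str, redaction_key: str) -> str:
    # Assumed form of the technique: base64(SHA-256(key + value)).
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def brute_force(token, candidates, keys):
    """Try every (key, candidate) pair until one reproduces the token."""
    for key in keys:
        for candidate in candidates:
            if redact(candidate, key) == token:
                return key, candidate
    return None


# A single-character key leaves only about 100 keys to try per candidate.
token = redact("alice", "k")
found = brute_force(token, ["bob", "alice", "carol"], string.printable)

# Upper bound for an 8-character text key at 5 bits per character.
upper_bound_bits = len("potatoes") * 5
```

With a one-character key the search finishes almost instantly; the point is
that the base64 output looks just as opaque as it would with a strong key.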
To take a silly example, if a CRC is used as the hash with that sort of short
input, the result is not particularly difficult to invert.
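
One way to see why a CRC is unsuitable, as a small sketch using Python's
`zlib.crc32` (my choice of CRC for illustration): a CRC is affine over XOR, so
for equal-length inputs the checksum of a combination can be computed from the
individual checksums, a structure that a secure hash must not have.

```python
import zlib


def xor_bytes(*parts: bytes) -> bytes:
    """XOR equal-length byte strings together."""
    out = bytearray(len(parts[0]))
    for part in parts:
        for i, byte in enumerate(part):
            out[i] ^= byte
    return bytes(out)


a = b"potatoesalice"
b = b"potatoesbobby"
c = b"potatoescarol"

# For three equal-length inputs the length-dependent constants cancel,
# so the CRC of the XOR equals the XOR of the CRCs.
combined = zlib.crc32(xor_bytes(a, b, c))
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
```

In addition, a CRC-32 output is only 32 bits wide, so exhaustive search over
short inputs is cheap regardless of this structure.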
I suggest a couple of changes:
1) Change "any hashing/digest algorithm" to require use of a secure hash, and
explain what is meant by "secure hash" in the security considerations section.
2) Require a minimum length of the "redaction key" string, and strongly suggest
(SHOULD) that it be randomly generated (e.g., by running sufficient output
of an entropy-rich random number generator through a base64 converter).
For the latter change, figure out the amount of entropy that should be used
for redaction - the recommended string length will be larger because printable
ASCII is not entropy-dense (at best it's good for 6 bits of entropy in each
8-bit character, and human-written text such as this message has significantly
less).
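
A possible sketch of such a generator follows. The function name is
hypothetical and the 128-bit default is only a placeholder of mine; the draft
would need to pick the actual entropy target.

```python
import base64
import math
import secrets


def generate_redaction_key(entropy_bits: int = 128) -> str:
    """Return a printable key carrying `entropy_bits` of randomness.

    The 128-bit default is an illustrative assumption, not a value
    taken from the draft.
    """
    n_bytes = math.ceil(entropy_bits / 8)
    raw = secrets.token_bytes(n_bytes)  # OS-backed CSPRNG
    return base64.b64encode(raw).decode("ascii")


key = generate_redaction_key()
# base64 carries 6 bits per character, so the string is longer than
# the raw key: 16 random bytes become 24 characters including padding.
```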
From a pure security perspective, use of HMAC with specified secure hashes
(SHA2-family) and an approach of hashing the "redaction key" down to a binary
key for HMAC would be a stronger approach. I suggest that the authors consider
this approach, but there may be practical usage concerns that suggest not
adopting it.
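
A rough sketch of what I have in mind, using HMAC-SHA-256. The function name
and the use of a plain SHA-256 pass to derive the binary key are assumptions
for illustration; a dedicated key derivation function may be preferable.

```python
import base64
import hashlib
import hmac


def redact_hmac(value: str, redaction_key: str) -> str:
    """Return base64(HMAC-SHA-256(binary_key, value)).

    Assumption: the textual redaction key is hashed down to a
    32-byte binary key with SHA-256 before use.
    """
    binary_key = hashlib.sha256(redaction_key.encode("utf-8")).digest()
    mac = hmac.new(binary_key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")


# Still deterministic per key, so correlation across reports is preserved.
token = redact_hmac("user", "potatoes")
```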
 The second open issue is the absence of security considerations for the
redaction key. The security considerations section needs to caution that the
redaction key is a secret key that must be managed and protected as a secret
key. Disclosure
of a redaction key removes the redaction from all reports that used that key.
As part of this, guidance should be provided on when and how to change the
redaction key in order to limit the effects of loss of secrecy for a single
key.
Editorial Nit: I believe that "anonymization" is a better description of what
this draft is doing (as opposed to "redaction"), particularly as the result is
intended to be correlatable via string match across reports from the same
source.
idnits 2.12.13 didn't find any nits.
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA 01748
+1 (508) 293-7953 FAX: +1 (508) 293-7786
david(_dot_)black(_at_)emc(_dot_)com Mobile: +1 (978) 394-7754