Gen-ART review of draft-ietf-marf-redaction-08

The -08 version is a significant improvement that aligns the draft's
recommendations on mechanisms for redaction and anonymization with the
situation-dependent levels of security that are appropriate for those
purposes.

idnits 2.12.13 didn't find anything.

The -08 version is ready for publication as a Standards Track RFC.

Thanks,
--David

-----Original Message-----
From: Black, David
Sent: Thursday, January 19, 2012 7:10 PM
To: ietf(_at_)cybernothing(_dot_)org; Murray S. Kucherawy; 
gen-art(_at_)ietf(_dot_)org; ietf(_at_)ietf(_dot_)org
Cc: marf(_at_)ietf(_dot_)org; presnick(_at_)qualcomm(_dot_)com; Black, David
Subject: Gen-ART review of draft-ietf-marf-redaction-05

Based on discussion with the authors, the -05 version of this draft resolves
the issues raised in the Gen-ART review of the -04 version.  An important
element of the approach taken to issue [1] has been to explain why the
security requirements for redaction are significantly weaker than the
strength of the secure hashes that are suggested by the draft.

Thanks,
--David

-----Original Message-----
From: Black, David
Sent: Tuesday, January 10, 2012 9:44 PM
To: ietf(_at_)cybernothing(_dot_)org; Murray S. Kucherawy; 
gen-art(_at_)ietf(_dot_)org; ietf(_at_)ietf(_dot_)org
Cc: Black, David; marf(_at_)ietf(_dot_)org; presnick(_at_)qualcomm(_dot_)com
Subject: Gen-ART review of draft-ietf-marf-redaction-04

I am the assigned Gen-ART reviewer for this draft. For background on
Gen-ART, please see the FAQ at
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you 
may receive.

Document: draft-ietf-marf-redaction-04
Reviewer: David L. Black
Review Date: January 10, 2012
IETF LC End Date: January 18, 2012
IESG Telechat Date: January 19, 2012

Summary: This draft is on the right track but has open issues, described in 
the review.

This draft specifies a method for redacting information from email abuse
reports (e.g., hiding the local part [user] of an email address), while
still allowing correlation of the redacted information across related abuse
reports from the same source. The draft is short, clear, and well written.

There are two open issues:

[1] The first open issue is the absence of security guidance to ensure that
this redaction technique effectively hides the redacted information.  The
redaction technique is to concatenate a secret string (called the
"redaction key") to the information to be redacted, apply "any
hashing/digest algorithm", convert the output to base64 and use that base64
string to replace the redacted information.
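
As a rough illustration of the technique just described, here is a minimal
Python sketch. It is not taken from the draft: the function names are
hypothetical, SHA-256 is only one possible choice for "any hashing/digest
algorithm", and placing the redaction key before the redacted value is an
assumption, since the description above does not fix the order.

```python
import base64
import hashlib


def redact(value: str, redaction_key: str) -> str:
    """Replace `value` with base64(hash(redaction_key + value))."""
    # Assumption: the key is prepended; the order is not specified above.
    data = (redaction_key + value).encode("utf-8")
    # Assumption: SHA-256 stands in for "any hashing/digest algorithm".
    digest = hashlib.sha256(data).digest()
    return base64.b64encode(digest).decode("ascii")


def redact_address(address: str, redaction_key: str) -> str:
    """Redact only the local part of an email address, keeping the domain."""
    local, _, domain = address.rpartition("@")
    return f"{redact(local, redaction_key)}@{domain}"
```

Because the same key and the same input always give the same output, two
reports from one source that redact the same local part can still be
correlated by string match.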

There are two important ways in which this technique could fail to
effectively hide the redacted information:
    - The secret string may inject insufficient entropy.
    - The hashing/digest algorithm may be weak.

To take an extreme example, if the secret string ("redaction key") consists
of a single ASCII character, and a short email local part is being redacted,
then the output is highly vulnerable to dictionary and brute force attacks
because only 6 bits of entropy are added (the result may look secure, but
it's not).  Beyond this extreme example, this is a potentially real concern -
e.g., applying the rule of thumb that ASCII text contains 4-5 bits of entropy
per character, the example in Appendix A uses a "redaction key" of
"potatoes" that injects at most 40 bits of entropy - is that sufficient for
email redaction purposes?
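
The arithmetic and the attack in the extreme example can be sketched as
follows. This is illustrative only: the helper names, the use of SHA-256,
the key-before-value order, and the candidate lists are all assumptions
made for the sketch, not details from the draft.

```python
import base64
import hashlib


def redact(value: str, redaction_key: str) -> str:
    # Same assumed construction as the technique under review:
    # base64(hash(key + value)), with SHA-256 as the assumed hash.
    digest = hashlib.sha256((redaction_key + value).encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


def entropy_bits_range(key: str, low: int = 4, high: int = 5) -> tuple:
    """Rule-of-thumb entropy of ASCII text: 4-5 bits per character."""
    return (low * len(key), high * len(key))


def recover(redacted: str, candidate_keys, candidate_locals):
    """Exhaustive search over (key, local part) guesses.

    Feasible whenever both candidate sets are small, e.g. a
    one-character key and a dictionary of common local parts.
    """
    for key in candidate_keys:
        for local in candidate_locals:
            if redact(local, key) == redacted:
                return key, local
    return None
```

With a one-character key the outer loop has well under a hundred
iterations, so the cost of the search is dominated by the size of the
dictionary of likely local parts.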

To take a silly example, if a CRC is used as the hash with that sort of
short input, the result is not particularly difficult to invert.
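
One reason a CRC is a poor fit here is that it is an affine function of its
input bits, so its outputs have algebraic structure that a secure hash is
designed not to have. The sketch below is not a full inversion; it only
checks that structure for CRC-32 as implemented by Python's zlib.crc32, for
three equal-length messages. The helper names are hypothetical.

```python
import zlib


def xor_bytes(*chunks: bytes) -> bytes:
    """XOR together byte strings of equal length."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, b in enumerate(chunk):
            out[i] ^= b
    return bytes(out)


def crc32_is_affine(a: bytes, b: bytes, c: bytes) -> bool:
    """Check crc(a ^ b ^ c) == crc(a) ^ crc(b) ^ crc(c) for equal lengths."""
    if not (len(a) == len(b) == len(c)):
        raise ValueError("messages must have equal length")
    combined = zlib.crc32(xor_bytes(a, b, c))
    return combined == (zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c))
```

In addition, a 32-bit output can be searched exhaustively, independent of
how long the redaction key is.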

I suggest a couple of changes:
1) Change "any hashing/digest algorithm" to require use of a secure hash,
    and explain what is meant by "secure hash" in the security
    considerations section.
2) Require a minimum length of the "redaction key" string, and strongly
    suggest (SHOULD) that it be randomly generated (e.g., by running
    sufficient output of an entropy-rich random number generator through a
    base64 converter).

For the latter change, figure out the amount of entropy that should be used
for redaction - the recommended string length will be larger because
printable ASCII is not entropy-dense (at best it's good for 6 bits of entropy
in each 8-bit character, and human-written text such as this message has
significantly less).
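
A sketch of suggestion 2), assuming Python's secrets module as the
entropy-rich generator. The 128-bit default is a placeholder chosen for
illustration, not a value proposed in this review, and the function names
are hypothetical.

```python
import base64
import math
import secrets


def generate_redaction_key(entropy_bits: int = 128) -> str:
    """Return a printable redaction key carrying `entropy_bits` of entropy.

    Assumption: 128 bits is only an illustrative default.
    """
    n_bytes = math.ceil(entropy_bits / 8)
    raw = secrets.token_bytes(n_bytes)  # OS-backed CSPRNG output
    return base64.b64encode(raw).decode("ascii")


def min_base64_chars(entropy_bits: int) -> int:
    """Each base64 character carries at most 6 bits of entropy."""
    return math.ceil(entropy_bits / 6)
```

For example, 128 bits of entropy needs at least 22 base64 characters, which
is why the required string length is larger than the raw key length in
bytes.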

From a pure security perspective, use of HMAC with specified secure hashes
(SHA2-family) and an approach of hashing the "redaction key" down to a
binary key for HMAC would be a stronger approach. I suggest that the authors
consider this approach, but there may be practical usage concerns that
suggest not adopting it.
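
A minimal sketch of that HMAC approach, assuming SHA-256 both for hashing
the "redaction key" string down to a binary key and as the HMAC hash. The
function name is hypothetical, and the choice of SHA-256 from the SHA2
family is an assumption made for the example.

```python
import base64
import hashlib
import hmac


def hmac_redact(value: str, redaction_key: str) -> str:
    """Redact `value` with HMAC-SHA-256 keyed by a hash of the key string."""
    # Hash the printable redaction key down to a fixed-size binary key.
    binary_key = hashlib.sha256(redaction_key.encode("utf-8")).digest()
    # Key the MAC with the binary key; the redacted value is the message.
    mac = hmac.new(binary_key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(mac).decode("ascii")
```

The output is still deterministic for a given key and value, so the
correlation property across reports is preserved.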

[2] The second open issue is the absence of security considerations for the
redaction key.  The security considerations section needs to caution that
the redaction key is a secret key that must be managed and protected as a
secret key.  Disclosure of a redaction key removes the redaction from all
reports that used that key.  As part of this, guidance should be provided on
when and how to change the redaction key in order to limit the effects of
loss of secrecy for a single redaction key.

Editorial Nit: I believe that "anonymization" is a better description of
what this draft is doing (as opposed to "redaction"), particularly as the
result is intended to be correlatable via string match across reports from
the same source.

idnits 2.12.13 didn't find any nits.

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david(_dot_)black(_at_)emc(_dot_)com        Mobile: +1 (978) 394-7754
----------------------------------------------------

