Re: [IAB] Call for Comment: 'Privacy Considerations for Internet Protoco

Hi Dave,

Thanks for your review. Some comments are inline. A pre-publication -08 version 
is available at 
<http://www.alissacooper.com/files/draft-iab-privacy-considerations-08.txt>. 
The diff from the -07 is available at <https://www.cdt.org/Z4Q>. 

On Mar 14, 2013, at 10:04 AM, Dave Crocker <dhc(_at_)dcrocker(_dot_)net> wrote:

Apologies for my sending this after the deadline.  I hope the comments are 
still usable...



Review of:    Privacy Considerations for Internet Protocols

I-D:          draft-iab-privacy-considerations-07.txt

Reviewed by:  D. Crocker

Review date:  14 March 2013


Summary:

  The document provides a broad introduction to the needs, nature and details 
of adding privacy considerations to IETF specifications. Broadly, it is 
divided into introduction, terminology, generic exposure/analysis model, 
threats, mitigations, and analysis guidelines.  The document is generally 
well-organized and written clearly.  An example analysis is provided that 
concretely demonstrates the approach to doing a considerations analysis; it 
was intentionally chosen as a difficult case, with inherent tradeoffs between 
privacy and required functionality.

  As an introduction to the topic, the document is accessible and practical.

  A glaring deficiency of the document is its conscious choice to refrain 
from defining the term 'privacy'.  The choice is understandable, given the 
long, messy and varying real-world history of the term. However, the reader is 
left having to formulate their own -- possibly unvoiced and therefore 
entirely ambiguous -- working definition.  For doing the technical work 
needed in a specification, this simply does not give the reader the linchpin 
to the topic, needed to anchor their understanding in a way that will be 
consistent across authors and readers of specifications.  The draft needs to 
choose a definition, in spite of the fact that other groups, people and 
contexts will use other definitions.  We do specifications and this starts 
with definitions.  It simply makes no sense to be missing a definition for 
the key word.

By way of priming that pump, I'll proffer the simplest definition that seems 
plausible here:

    Privacy is the concern for protecting information
    of or about an individual person.

Tweak this or replace it entirely, but /please/ provide a concrete, pragmatic 
definition that explicitly defines what is in scope and what is out, for 
authors to focus their considerations on.


This suggestion has been debated at length within the IAB privacy program over 
the life of this document. Our thinking is that trying to define "privacy" in 
one sentence would be as counterproductive as trying to define "security" or 
"extensibility" in one sentence. All of those concepts are rich and nuanced 
enough to have entire documents dedicated to explaining them as concepts and 
exploring how those concepts should be tackled in the IETF. That is the purpose 
of this document. Given the extent to which we outline all the different facets 
of privacy threats, the feeling is that it would undercut the value of the 
document to boil privacy down to one sentence. What we want readers to do is 
take in the nuance and not think of privacy as one box they can simply tick off 
during the design process.


Also, given the challenges of this topic and the desire to get useful privacy 
considerations into IETF work, I suggest creating a privacy directorate, 
which can be asked to assist authors and review their work.  Think of it as a 
topic-specific mentoring group…


This has been tried before and did not work out so well 
<http://www.ietf.org/mail-archive/web/privacydir/current/maillist.html>, but 
there is some talk of trying again.


Except for the requirement to define its motivating term, the draft is usable 
in its current form, although a number of specific improvements cited in the 
detailed comments are recommended.


Great!




Detailed Comments:

The following comments are left raw, written as I read the draft...

Abstract

  This document offers guidance for developing privacy considerations
  for inclusion in protocol specifications.  It aims to make protocol
  designers aware of privacy-related design choices.  It suggests that
  whether any individual RFC warrants a specific privacy considerations
  section will depend on the document's content.


Given the degree of ambiguity in the word 'privacy' -- since there is such a 
wide range of definitions people assign it, as noted in the second paragraph 
of the Introduction -- the Abstract needs to provide a summary of its 
definition here, so that the reader can understand the focus and scope of the 
term's use in this document.  The definitional text needs to refrain from 
using the word 'privacy' as part of the definition…


Per my comment above, it is probably better to avoid stating this as concisely 
as would be necessary in the abstract.

1. Introduction


  [RFC3552] provides detailed guidance to protocol designers about both
  how to consider security as part of protocol design and how to inform
  readers of protocol specifications about security issues.  This
  document intends to provide a similar set of guidance for considering
  privacy in protocol design.

  Privacy is a complicated concept with a rich history that spans many
  disciplines.  With regard to data, often it is a concept applied to


"With regard to data" implies that it could be with regard to something else. 
 What?


Peeping Toms, for example. In many circles a distinction is made between "data 
protection" and "privacy," which can comprise aspects of personal intrusion 
that are not associated with stored or transmitted data.

  "personal data," information relating to an identified or
  identifiable individual.  Many sets of privacy principles and privacy
  design frameworks have been developed in different forums over the
  years.  These include the Fair Information Practices [FIPs], a
  baseline set of privacy protections pertaining to the collection and
  use of personal data (often based on the principles established in
  [OECD], for example), and the Privacy by Design concept, which
  provides high-level privacy guidance for systems design (see [PbD]
  for one example).  The guidance provided in this document is inspired
  by this prior work, but it aims to be more concrete, pointing
  protocol designers to specific engineering choices that can impact
  the privacy of the individuals that make use of Internet protocols.

  Different people have radically different conceptions of what privacy
  means, both in general, and as it relates to them personally
  [Westin].  Furthermore, privacy as a legal concept is understood
  differently in different jurisdictions.  The guidance provided in
  this document is generic and can be used to inform the design of any
  protocol to be used anywhere in the world, without reference to
  specific legal frameworks.

  Whether any individual document warrants a specific privacy
  considerations section will depend on the document's content.
  Documents whose entire focus is privacy may not merit a separate


OK.  Enough is enough.  It's fine to have a quick survey of earlier work, but 
that's not sufficient.

You keep using the word privacy, and I don't know what you mean.

The typical writer and reader of RFCs is not experienced in the topic of 
privacy.  They won't know what you mean either:  they need very concrete 
guidance about the word's meaning.

Telling me that different people mean different things with the term merely 
assures me that I have no idea what /you/ mean unless you tell me.  Having 
each reader make guesses about the meaning is a way to ensure 
non-interoperability of the construct.


Per my note above, the expectation is that the document as a whole will provide 
a rich explanation of what is meant by privacy. Of course, some people won't 
read the whole document, or even parts of it, but others will, and hopefully 
more so over time.


Guidance can't be very helpful if the reader has no idea when to apply it.


If the reader is unsure about whether to go through the thought process 
outlined in section 7, there is no harm (other than the use of the reader's 
time) in doing it and then finding out that a particular specification is 
already solidly designed when it comes to privacy.

  section (for example, "Private Extensions to the Session Initiation
  Protocol (SIP) for Asserted Identity within Trusted Networks"
  [RFC3325]).  For certain specifications, privacy considerations are a
  subset of security considerations and can be discussed explicitly in


I strongly suggest that any explicit privacy discussion be required to be 
entirely separate from the 'security considerations' section.

My reasoning is simple:  This community sees 'security' in terms of 
encryption and signing, traffic analysis, and other such mechanical, 
relatively low-level components.  Privacy is an entirely different and 
broader and more human beast, even when its details devolve to these familiar 
mechanics.

At the least, making it a separate section will help writers and readers to 
distinguish privacy from the security stuff we are used to seeing discussed.


I think sections 4 and 7 demonstrate that privacy and security are interrelated 
at least in some respects. 

There are two motivations for suggesting that privacy can be incorporated into 
security considerations in some cases. First, in a way we are trying to key off 
of the familiarity that people already have with security, and asking them to 
expand their security thinking a bit might be an easier sell than making a 
whole new/separate requirement. Second, it is a means to avoid duplication. It 
already happens that when authors insert a separate privacy considerations 
section it ends up making a bunch of references to the security considerations 
section. We don't want to recommend a document structure that will just end up 
seeming extraneous.

  the security considerations section.  Some documents will not require
  discussion of privacy considerations (for example, "Definition of the
  Opus Audio Codec" [RFC6716]).  The guidance provided here can and
  should be used to assess the privacy considerations of protocol,
  architectural, and operational specifications and to decide whether
  those considerations are to be documented in a stand-alone section,
  within the security considerations section, or throughout the
  document.


Not sure whether this is a question or a suggestion; if it's the latter, I'm 
not sure what to suggest:  privacy issues often develop as a combinatorial 
problem -- 'correlation' as you note farther down -- that is, developing out 
of unpredicted integration of information from discrete services.  While any 
specific IETF specification might have its own, direct privacy issues needing 
consideration, where should these combinatorial dangers be discussed?


That is a good question and I'm not sure I know the answer. Of course there is 
nothing to prevent people from writing drafts about the privacy considerations 
associated with the combination of discrete services/protocols.

2. Terminology


  This section defines basic terms used in this document, with
  references to pre-existing definitions as appropriate.  As in
  [RFC4949], each entry is preceded by a dollar sign ($) and a space
  for automated searching.  Note that this document does not try to
  attempt to define the term 'privacy' itself.  Instead privacy is the
  sum of what is contained in this document.  We therefore follow the
  approach taken by [RFC3552].


Sorry.  Not workable, if you want meaningful consideration by authors and 
meaningful understanding by readers.


See above.


2.1. Entities


  Several of these terms are further elaborated in Section 3.

  $ Attacker:   An entity that intentionally works against some privacy
     protection goal.  Unlike observers, attackers' behavior is
     unauthorized.


This precludes accidental privacy violations?


Fixed.


  $ Eavesdropper:   A type of attacker that passively observes an
     initiator's communications without the initiator's knowledge or
     authorization.  See [RFC4949].

  $ Enabler:   A protocol entity that facilitates communication between
     an initiator and a recipient without being directly in the
     communications path.


For example…?


This is elaborated in section 3.

2.3. Identifiability

...

  $ Personal Name:   A natural name for an individual.  Personal names
     are often not unique, and often comprise given names in
     combination with a family name.  An individual may have multiple
     personal names at any time and over a lifetime, including official
     names.  From a technological perspective, it cannot always be
     determined whether a given reference to an individual is, or is
     based upon, the individual's personal name(s) (see Pseudonym).


Official Names also are typically not unique.


Added a note to this effect.

  $ Pseudonym:   A name assumed by an individual in some context,
     unrelated to the individual's personal names known by others in
     that context, with an intent of not revealing the individual's
     identities associated with her other names.


(Might be worth mentioning that this is sometimes called "persona".)

Pseudonyms also often are not unique.

My point is that it's good that you mentioned this issue and should repeat it 
for each term to which it applies.


Fixed.

3. Communications Model


  To understand attacks in the privacy-harm sense, it is helpful to
  consider the overall communication architecture and different actors'
  roles within it.  Consider a protocol entity, the "initiator," that
  initiates communication with some recipient.  Privacy analysis is
  most relevant for protocols with use cases in which the initiator
  acts on behalf of an individual (or different individuals at
  different times).  It is this individual whose privacy is potentially
  threatened.


If I receive a credit dunning notice or a legal notification, I'm the 
recipient, but unauthorized disclosure of such messages would be privacy-harm 
for me.  It isn't just initiator-side individuals.


Fixed.


  Communications may be direct between the initiator and the recipient,
  or they may involve an application-layer intermediary (such as a
  proxy or cache) that is necessary for the two parties to communicate.


proxy or cache -> proxy, cache or (mail) relay


Fixed.

  In some cases this intermediary stays in the communication path for
  the entire duration of the communication and sometimes it is only
  used for communication establishment, for either inbound or outbound
  communication.  In rare cases there may be a series of intermediaries


For email, it isn't rare at all.  In fact, it's universal, probably for 
literally every email sent.


Fixed.

  that are traversed.  At lower layers, additional entities are
  involved in packet forwarding that may interfere with privacy
  protection goals as well.

...

  Protocol design is often predicated on the notion that recipients,
  intermediaries, and enablers are assumed to be authorized to receive
  and handle data from initiators.  As [RFC3552] explains, "we assume
  that the end-systems engaging in a protocol exchange have not



Cooper, et al.           Expires August 27, 2013               [Page 10]


Internet-Draft           Privacy Considerations            February 2013


  themselves been compromised."  However, by its nature privacy


which nature?

seriously, how is the reader to know (or even guess) what exactly is being 
implied?


Fixed.

  analysis requires questioning this assumption since systems are often
  compromised for the purpose of obtaining personal data.

  Although recipients, intermediaries, and enablers may not generally
  be considered as attackers, they may all pose privacy threats
  (depending on the context) because they are able to observe, collect,


exactly!

4. Privacy Threats

...

  This section lists common privacy threats (drawing liberally from
  [Solove], as well as [CoE]), showing how each of them may cause
  individuals to incur privacy harms and providing examples of how
  these threats can exist on the Internet.

  Some privacy threats are already considered in IETF protocols as a


cite some examples.


These are explained throughout section 4. This is just the introductory text to 
the section.

  matter of routine security analysis.  Others are more pure privacy


What does it mean to be a "more pure privacy threat"?  Really, I can't guess.


Same as above -- this is explained throughout section 4.

  threats that existing security considerations do not usually address.
  The threats described here are divided into those that may also be
  considered security threats and those that are primarily privacy
  threats.

  Note that an individual's awareness of and consent to the practices
  described below may change an individual's perception of and concern
  for the extent to which they threaten privacy.  If an individual
  authorizes surveillance of his own activities, for example, the
  individual may be able to take actions to mitigate the harms
  associated with it, or may consider the risk of harm to be tolerable.

4.1. Combined Security-Privacy Threats


The fact that you have a string like "Combined Security-Privacy" supports the 
view that Privacy Considerations is distinct from Security and should not be 
in the Security Considerations section…


Actually I think it's the opposite. Section 4 makes concrete distinctions 
between privacy threats that are already commonly covered by security 
considerations and those that are not.

4.1.4. Misattribution


  Misattribution occurs when data or communications related to one
  individual are attributed to another.  Misattribution can result in
  adverse reputational, financial, or other consequences for
  individuals that are misidentified.


It's probably worth mentioning that for spam, this is often called "spoofing".


Fixed.

5.1. Data Minimization

...

  However, the most direct application of data minimization to protocol
  design is limiting identifiability.  Reducing the identifiability of
  data by using pseudonyms or no identifiers at all helps to weaken the
  link between an individual and his or her communications.  Allowing
  for the periodic creation of new identifiers reduces the possibility


also randomization of chosen identifiers


Fixed.
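
As an illustration of the two ideas in this exchange (periodic creation of new identifiers, and random choice of the identifier value), here is a minimal Python sketch. It is not taken from the draft; the class name, the lifetime parameter, and the injectable clock are all assumptions made for the example.

```python
import secrets
import time


class RotatingIdentifier:
    """Hypothetical helper: a random identifier replaced after a fixed lifetime."""

    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self._lifetime = lifetime_seconds
        self._clock = clock  # injectable so the rotation can be tested
        self._rotate()

    def _rotate(self):
        # 128 bits from the OS CSPRNG: the value encodes nothing about the user.
        self._value = secrets.token_hex(16)
        self._issued_at = self._clock()

    def current(self):
        # Replace the identifier once its lifetime has elapsed.
        if self._clock() - self._issued_at >= self._lifetime:
            self._rotate()
        return self._value
```

A shorter lifetime narrows the window in which two protocol interactions carry the same identifier, at the cost of whatever protocol state was keyed on the old value. The right trade-off depends on the protocol.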

5.2. User Participation


  As explained in Section 4.2.5, data collection and use that happens
  "in secret," without the individual's knowledge, is apt to violate
  the individual's expectation of privacy and may create incentives for
  misuse of data.  As a result, privacy regimes tend to include
  provisions to require informing individuals about data collection and
  use and involving them in decisions about the treatment of their
  data.  In an engineering context, supporting the goal of user
  participation usually means providing ways for users to control the
  data that is shared about them.  It may also mean providing ways for
  users to signal how they expect their data to be used and shared.


There is a serious downside to this.  It presumes that this burden on users 
is reasonable.  For many scenarios, it isn't.  Rather, the focus on user 
participation is often used as an alternative to the difficult work (or 
research) on mechanisms that require less user participation.


I agree that sole reliance on user participation is undesirable. It's listed 
here as one of several protections, so sole reliance is not implied. For 
protocol design I would actually argue that we tend not to think about user 
participation enough (whereas I agree that privacy policy tends to focus on it 
too much).

6. Scope of Privacy Implications of Internet Protocols


  Internet protocols are often built flexibly, making them useful in a
  variety of architectures, contexts, and deployment scenarios without
  requiring significant interdependency between disparately designed
  components.  Although protocol designers often have a particular
  target architecture or set of architectures in mind at design time,
  it is not uncommon for architectural frameworks to develop later,
  after implementations exist and have been deployed in combination
  with other protocols or components to form complete systems.


Independent of the purpose of this draft, the above paragraph is quite a nice 
bit of text about an aspect of IETF technical work.


Thanks.

  As a consequence, the extent to which protocol designers can foresee
  all of the privacy implications of a particular protocol at design
  time is limited.  An individual protocol may be relatively benign on
  its own, and it may make use of privacy and security features at
  lower layers of the protocol stack (Internet Protocol Security,
  Transport Layer Security, and so forth) to mitigate the risk of
  attack.  But when deployed within a larger system or used in a way
  not envisioned at design time, its use may create new privacy risks.
  Protocols are often implemented and deployed long after design time
  by different people than those who did the protocol design.  The
  guidelines in Section 7 ask protocol designers to consider how their
  protocols are expected to interact with systems and information that
  exist outside the protocol bounds, but not to imagine every possible
  deployment scenario.

  Furthermore, in many cases the privacy properties of a system are
  dependent upon the complete system design where various protocols are
  combined together to form a product solution; the implementation,
  which includes the user interface design; and operational deployment
  practices, including default privacy settings and security processes
  within the company doing the deployment.  These details are specific
  to particular instantiations and generally outside the scope of the
  work conducted in the IETF.  The guidance provided here may be useful
  in making choices about these details, but its primary aim is to
  assist with the design, implementation, and operation of protocols.


Perhaps the largest challenge I repeatedly see in the IETF is what I call 
"systems thinking", which is considering an integrated set of components and 
their interactions.  The above three paragraphs very nicely target exactly 
that scope of concern, in the context of privacy.

So I /strongly/ suggest you move the three paragraphs up to the Introduction. 
 Note that this would largely resolve the concern I raised there, that the 
Introduction really doesn't introduce cross-component (multi-specification) 
scoping issues for privacy.  Add a citation in it to this section.


This section has been moved to directly after the introduction.

  Transparency of data collection and use -- often effectuated through
  user interface design -- is normally a key factor in determining the


I realize that's a common view, but has it been validated or is it merely the 
default perspective that user permission solves everything?


Good point. I've added some text to indicate that this is what often happens, 
whether rightly or wrongly.

  privacy impact of a system.  Although most IETF activities do not
  involve standardizing user interfaces or user-facing communications,
  in some cases understanding expected user interactions can be
  important for protocol design.  Unexpected user behavior may have an
  adverse impact on security and/or privacy.


While a generically reasonable view, the challenge with its application in 
the IETF is our general tendency to think that we understand UI and UX 
issues, although few in the IETF actually have the background for it.  For 
example we tend to think that simply giving users more information is a 
universal palliative.  Most discussions here about "expected user 
interactions" are simply wrong.  Worse, I've no idea what to suggest to 
counter this for the draft.


Yeah, I'm not sure the draft can fix this problem. But agree that it's a 
problem.

7. Guidelines


  This section provides guidance for document authors in the form of a
  questionnaire about a protocol being designed.  The questionnaire may
  be useful at any point in the design process, particularly after
  document authors have developed a high-level protocol model as
  described in [RFC4101].

  Note that the guidance does not recommend specific practices.  The
  range of protocols developed in the IETF is too broad to make
  recommendations about particular uses of data or how privacy might be
  balanced against other design goals.  However, by carefully
  considering the answers to each question, document authors should be
  able to produce a comprehensive analysis that can serve as the basis
  for discussion of whether the protocol adequately protects against
  privacy threats.


For some years after Security Considerations were made mandatory, authors 
mostly floundered with the topic, given their/our lack of background for 
assessing security considerations.  Eventually there was IETF focus on making 
the section useful.

While this draft goes a long way to making the nature and requirements of a 
Privacy Considerations section substantive, it's going to be some time before 
the community develops helpful skills at writing these sections.


Agree.

I suggest setting up a Privacy Directorate, essentially as a 
consulting/review service for authors to use in developing their text for the 
section in their documents.  The Directorate might also take initiative at 
reviewing new documents.


Perhaps this can be resurrected.

  The framework is divided into four sections that address each of the
  mitigation classes from Section 5, plus a general section.  Security
  is not fully elaborated since substantial guidance already exists in
  [RFC3552].

7.1. Data Minimization


     a.  Identifiers.  What identifiers does the protocol use for
     distinguishing initiators of communications?  Does the protocol
     use identifiers that allow different protocol interactions to be
     correlated?  What identifiers could be omitted or be made less
     identifying while still fulfilling the protocol's goals?


I'd think that retention of recipient identifiers might also be an issue?


This is covered in 7.1g.

     b.  Data.  What information does the protocol expose about
     individuals, their devices, and/or their device usage (other than
     the identifiers discussed in (a))?  To what extent is this
     information linked to the identities of the individuals?  How does
     the protocol combine personal data with the identifiers discussed
     in (a)?

     c.  Observers.  Which information discussed in (a) and (b) is
     exposed to each other protocol entity (i.e., recipients,
     intermediaries, and enablers)?  Are there ways for protocol
     implementers to choose to limit the information shared with each
     entity?  Are there operational controls available to limit the
     information shared with each entity?

     d.  Fingerprinting.  In many cases the specific ordering and/or
     occurrences of information elements in a protocol allow users,
     devices, or software using the protocol to be fingerprinted.  Is
     this protocol vulnerable to fingerprinting?  If so, how?  Can it
     be designed to reduce or eliminate the vulnerability?  If not, why
     not?

     e.  Persistence of identifiers.  What assumptions are made in the
     protocol design about the lifetime of the identifiers discussed in
     (a)?  Does the protocol allow implementers or users to delete or
     replace identifiers?  How often does the specification recommend
     to delete or replace identifiers by default?  Can the identifiers,
     along with other state information, be set to automatically
     expire?

     f.  Correlation.  Does the protocol allow for correlation of
     identifiers?  Are there expected ways that information exposed by


Is it productive to also look for 'unexpected' ways?  This could be a silly 
and wasteful exercise, or thinking creatively about strange combinations 
might trigger better insight.  I've no direct experience, so can't judge.


I think eventually that might be something we want to load onto protocol 
designers, but at this point even just scoping this to expected ways would be 
helpful IMO.
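
To make the "expected ways" of correlation in question (f) concrete, the following Python sketch contrasts one identifier reused across two services with identifiers derived separately per context. The HMAC-based derivation is a technique swapped in for illustration, not something the draft specifies; the function names, the secret, and the service labels are hypothetical.

```python
import hashlib
import hmac


def per_context_id(user_secret: bytes, context: str) -> str:
    """Derive a context-specific pseudonym (hypothetical scheme, not from the draft)."""
    digest = hmac.new(user_secret, context.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def correlate(log_a, log_b):
    """Identifiers present in both logs, i.e. what an observer of both can link."""
    return set(log_a) & set(log_b)


secret = b"example-user-secret"  # illustrative value only

# The same identifier reused in two services is trivially linkable.
shared = correlate(["alice@example.com"], ["alice@example.com"])

# Distinct per-context identifiers leave the observer nothing to join on.
unlinked = correlate(
    [per_context_id(secret, "service-a")],
    [per_context_id(secret, "service-b")],
)
```

This only addresses correlation through the identifier itself. Other exposed information (timing, addresses, fingerprintable behavior) may still allow the two logs to be linked, which is the kind of "unexpected" correlation raised in the comment above.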

8. Example

...

  The fundamental architecture defined in RFC 2778 and RFC 3859 is a
  mediated one.  Clients (presentities in RFC 2778 terms) publish their
  presence information to presence servers, which in turn distribute
  information to authorized watchers.  Presence servers thus retain
  presence information for an interval of time, until it either changes
  or expires, so that it can be revealed to authorized watchers upon
  request.  This architecture mirrors existing pre-standard deployment
  models.  The integration of an explicit authorization mechanism into
  the presence architecture has been widely successful in involving the
  end users in the decision making process before sharing information.
  Nearly all presence systems deployed today provide such a mechanism,
  typically through a reciprocal authorization system by which a pair
  of users, when they agree to be "buddies," consent to divulge their
  presence information to one another.  Buddylists are managed by
  servers but controlled by end users.  Users can also explicitly block
  one another through a similar interface, and in some deployments it
  is desirable to provide "polite blocking" of various kinds.


As the discussion moves into the details of analyzing each type of privacy 
concern, I suggest making the format be bulleted and/or tabular.  This will 
make each segment of analysis more accessible to the reader and easier to 
correlate with the lists of privacy concerns/attributes provided earlier in 
the document.  It will also aid scanning for review and later consultation.


The goal of this section was to review how privacy decisions were made within 
the confines of one example architecture. In the IAB privacy program we plan to 
try to apply the guidance in more formal write-ups in the manner you suggest to 
some other existing protocols/architectures (likely selected from the reviews 
we've already done: 
<http://www.iab.org/activities/programs/privacy-program/privacy-reviews/>).
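
For readers less familiar with the mediated model quoted above, here is a toy Python sketch of a presence server with reciprocal ("buddy") authorization and one possible reading of "polite blocking", in which a blocked watcher is simply told the presentity is offline. The class and method names are hypothetical and do not correspond to RFC 2778, RFC 3859, or any real implementation; expiry of presence information is omitted.

```python
class PresenceServer:
    """Toy mediated presence server; all names are hypothetical."""

    def __init__(self):
        self._presence = {}     # presentity -> last published status
        self._approved = set()  # (owner, watcher): owner lets watcher see presence
        self._blocked = set()   # (owner, watcher): owner politely blocks watcher

    def publish(self, presentity, status):
        # The server retains presence until it changes (expiry omitted here).
        self._presence[presentity] = status

    def approve(self, owner, watcher):
        self._approved.add((owner, watcher))

    def block(self, owner, watcher):
        self._blocked.add((owner, watcher))

    def are_buddies(self, a, b):
        # Reciprocal authorization: both sides must have consented.
        return (a, b) in self._approved and (b, a) in self._approved

    def fetch(self, watcher, presentity):
        if (presentity, watcher) in self._blocked:
            # "Polite" blocking: answer as if the presentity were offline.
            return "offline"
        if not self.are_buddies(watcher, presentity):
            raise PermissionError("watcher is not authorized")
        return self._presence.get(presentity, "offline")
```

Even in this small sketch the server holds every user's presence and every authorization decision, and nothing prevents an authorized watcher from passing on what `fetch` returns.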

  From a perspective of privacy design, however, the classical presence
  architecture represents nearly a worst-case scenario.  In terms of



Cooper, et al.           Expires August 27, 2013               [Page 28]


Internet-Draft           Privacy Considerations            February 2013


  data minimization, presentities share their sensitive information
  with presence services, and while services only share this presence
  information with watchers authorized by the user, no technical
  mechanism constrains those watchers from relaying presence to further


Offhand, I don't know what mechanisms are practical to impose such a 
constraint, in a protocol specification.  It would help to see an example.


I don't think the implication is that they exist.

  third parties.  Any of these entities could conceivably log or retain
  presence information indefinitely.  The sensitivity cannot be
  mitigated by rendering the user anonymous, as it is indeed the
  purpose of the system to facilitate communications between users who
  know one another.  The identifiers employed by users are long-lived
  and often contain personal information, including personal names and
  the domains of service providers.  While users do participate in the
  construction of buddylists and blacklists, they do so with little
  prospect for accountability: the user effectively throws their
  presence information over the wall to a presence server that in turn
  distributes the information to watchers.  Users typically have no way
  to verify that presence is being distributed only to authorized
  watchers, especially as it is the server that authenticates watchers,
  not the end user.  Connections between the server and all publishers
  and consumers of presence data are moreover an attractive target for
  eavesdroppers, and require strong confidentiality mechanisms, though
  again the end user has no way to verify what mechanisms are in place
  between the presence server and a watcher.


Again, what would be realistic choices for fixing this?  (It's possible that 
there aren't any and that privacy considerations would merely need to 
document an inherent and unfixable exposure.  In terms of guidance to writers 
of privacy considerations, that's ok, but it's worth making this point clear.)


Per the above, this was meant to provide a review of how the architecture was 
conceived at the time it was designed.

...

  Privacy concerns about presence information largely arise due to the
  built-in mediation of the presence architecture.  The need for a
  presence server is motivated by two primary design requirements of
  presence: in the first place, the server can respond with an
  "offline" indication when the user is not online; in the second
  place, the server can compose presence information published by
  different devices under the user's control.  Additionally, to
  preserve the use of URIs as identifiers for entities, some service


"preserve"?


Fixed.

  must operate a host with the domain name appearing in a presence URI,
  and in practical terms no commercial presence architecture would
  force end users to own and operate their own domain names.  Many end
  users of applications like presence are behind NATs or firewalls, and
  effectively cannot receive direct connections from the Internet - the
  persistent bidirectional channel these clients open and maintain with
  a presence server is essential to the operation of the protocol.


So?  I'm not understanding what makes this a privacy issue.


This is explaining why the mediated model was chosen.

  One must first ask if the trade-off of mediation for presence is
  worth it.  Does a server need to be in the middle of all publications


  worth it -> worthwhile.


Fixed.

  of presence information?  It might seem that end-to-end encryption of
  the presence information could solve many of these problems.  A


Not as described:  You'd still have mediation.  That is, the solution you 
offer does not answer the question you ask.

I think you mean to ask whether the intermediary needs to see all presence 
information in the clear.  If you really intend to suggest that an 
intermediary isn't needed, then you need to describe a scenario without one.


I think mediation is understood broadly here -- not as the question of whether 
an intermediary exists, but whether it actually mediates all aspects of the 
interaction.

  presentity could encrypt the presence information with the public key
  of a watcher, and only then send the presence information through the
  server.  The IETF defined an object format for presence information
  called the Presence Information Data Format (PIDF), which for the
  purposes of conveying location information was extended to the PIDF
  Location Object (PIDF-LO) - these XML objects were designed to
  accommodate an encrypted wrapper.  Encrypting this data would have
  the added benefit of preventing stored cleartext presence information
  from being seized by an attacker who manages to compromise a presence
  server.  This proposal, however, quickly runs into usability
  problems.  Discovering the public keys of watchers is the first
  difficulty, one that few Internet protocols have addressed
  successfully.  This solution would then require the presentity to
  publish one encrypted copy of its presence information per authorized
  watcher to the presence service, regardless of whether or not a
  watcher is actively seeking presence information - for a presentity
  with many watchers, this may place an unacceptable burden on the
  presence server, especially given the dynamism of presence
  information.  Finally, it prevents the server from composing presence
  information reported by multiple devices under the same user's
  control.  On the whole, these difficulties render object encryption
  of presence information a doubtful prospect.
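
A rough illustration of the scaling concern in the paragraph above, as a 
minimal Python sketch. It is not from the draft: the function names and the 
figures (200 watchers, 50 updates) are hypothetical, and it only counts 
published objects, ignoring key discovery and message size.

```python
def encrypted_copies(watchers: int, updates: int) -> int:
    """Objects a presentity publishes when every update is encrypted
    separately to each authorized watcher's public key."""
    if watchers < 0 or updates < 0:
        raise ValueError("counts must be non-negative")
    return watchers * updates


def cleartext_copies(updates: int) -> int:
    """Objects published when the server sees cleartext and fans it out."""
    if updates < 0:
        raise ValueError("count must be non-negative")
    return updates


if __name__ == "__main__":
    # Hypothetical load: 200 authorized watchers, 50 presence changes.
    print(encrypted_copies(200, 50))  # one copy per watcher per update
    print(cleartext_copies(50))       # one copy per update
```

The per-watcher count grows whether or not any watcher is actually 
subscribed at the time, which is the burden the draft text describes.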

  Some protocols that provide presence information, such as SIP, can


hmmm.  I didn't think that SIP, itself, provided presence information...?  
SIMPLE uses SIP, but it isn't SIP doing the presence work.


Fixed.

  operate intermediaries in a redirecting mode, rather than a
  publishing or proxying mode.  Instead of sending presence information
  through the server, in other words, these protocols can merely
  redirect watchers to the presentity, and then presence information
  could pass directly and securely from the presentity to the watcher.
  It is worth noting that this would disclose the IP address of the
  presentity to the watcher, which has its own set of risks.  In that
  case, the presentity can decide exactly what information it would
  like to share with the watcher in question, it can authenticate the
  watcher itself with whatever strength of credential it chooses, and
  with end-to-end encryption it can reduce the likelihood of any
  eavesdropping.  In a redirection architecture, a presence server
  could still provide the necessary "offline" indication, without
  requiring the presence server to observe and forward all information
  itself.  This mechanism is more promising than encryption, but also
  suffers from significant difficulties.  It too does not provide for
  composition of presence information from multiple devices - it in
  fact forces the watcher to perform this composition itself.  The
  largest single impediment to this approach is however the difficulty
  of creating end-to-end connections between the presentity's device(s)
  and a watcher, as some or all of these endpoints may be behind NATs
  or firewalls that prevent peer-to-peer connections.  While there are
  potential solutions for this problem, like STUN and TURN, they add
  complexity to the overall system.


Given the pragmatics, I'm surprised you'd call this 'promising'.


It's phrased relatively.


  Consequently, mediation is a difficult feature of the presence
  architecture to remove, and due especially to the requirement for
  composition it is hard to minimize the data shared with
  intermediaries.  Control over sharing with intermediaries must
  therefore come from some other explicit component of the
  architecture.  As such, the presence work in the IETF focused on
  improving the user participation over the activities of the presence
  server.  This work began in the GEOPRIV working group, with controls
  on location privacy, as location of users is perceived as having
  especially sensitive properties.  With the aim to meet the privacy
  requirements defined in [RFC2779] a set of usage indications, such as
  whether retransmission is allowed or when the retention period
  expires, have been added to PIDF-LO that always travel with location
  information itself.  These privacy preferences apply not only to the
  intermediaries that store and forward presence information, but also
  to the watchers who consume it.
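
The usage indications described above can be sketched as data that travels 
with the location object. This is a minimal Python illustration, not PIDF-LO 
syntax: the class and function names are hypothetical, and only the two 
indications named in the text (retransmission and retention expiry) are 
modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class UsageRules:
    """Hypothetical stand-in for the indications carried with location."""
    retransmission_allowed: bool
    retention_expiry: datetime  # must be timezone-aware


def may_retransmit(rules: UsageRules) -> bool:
    """True if the recipient is permitted to pass the object onward."""
    return rules.retransmission_allowed


def may_retain(rules: UsageRules, now: datetime) -> bool:
    """True while the stated retention period has not yet expired."""
    return now < rules.retention_expiry


if __name__ == "__main__":
    rules = UsageRules(
        retransmission_allowed=False,
        retention_expiry=datetime(2013, 3, 15, tzinfo=timezone.utc),
    )
    now = datetime(2013, 3, 14, 12, 0, tzinfo=timezone.utc)
    print(may_retransmit(rules), may_retain(rules, now))
```

These checks are advisory in this sketch: nothing stops a recipient from 
skipping them, which matches the earlier observation that no technical 
mechanism constrains watchers.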

  This approach very much follows the spirit of Creative Commons [CC],
  namely the usage of a limited number of conditions (such as 'Share
  Alike' [CC-SA]).  Unlike Creative Commons, the GEOPRIV working group
  did not, however, initiate work to produce legal language nor to
  design graphical icons since this would fall outside the scope of the


hmmm.  This raises a possible issue with finding and liaising with other 
groups relevant to privacy and with complementary skills.  So, for example, 
here's a case of needing work to aid privacy that was identified but needed 
to be handed off to another group.

Lining up such contacts ahead of time could be a useful bit of work for a 
privacy directorate?


Perhaps. It would depend on the nature of the use cases envisioned for the 
particular privacy preference expressions being built into protocols.

Thanks,
Alissa




d/


-- 
Dave Crocker
Brandenburg InternetWorking
bbiw.net
