ietf
[Top] [All Lists]

Re: [P2PSIP] Last Call: <draft-ietf-p2psip-base-24.txt> (REsource LOcation And Discovery (RELOAD) Base Protocol) to Proposed Standard

2013-02-20 10:37:11
Hi again,

[this time with all comments, including chapters 7 and 10, no changes
 in the first part so you can jump to [**CONT**] if you are looking for
 the new stuff]

Sorry for the late comments, but I re-read the whole draft which took
some time. There are still some clarifications
necessary as well as some inconsistencies left.
Unfortunately, my comments from June, 19th were not addressed yet.
https://www.ietf.org/mail-archive/web/p2psip/current/msg06229.html

Polina Goltsman also found many issues listed below while working on an
implementation.  The list is long, but the good news is that it's
mainly clarifications that are missing.

Major issues:

1) sec. 6.3.2:
length field in Forwarding Header covers what exactly?  it is not
really clear whether the length field counts the whole message or in
case of fragmentation only the fragment size. When reading Sec 6.7 is
seems that the former is meant, but the definition could and should be
clearer. Similarly, sec. 6.7 should be clear about this, e.g.,
describing that all Forwarding Headers are identical for fragments of
the same message with exception of the fragment field.

2) sec. 7.4.1.1: replica number handling is unclear
   it is unclear how replica numbers are incremented (e.g., per
   peer) and what receiving peers should actually do with this number.
   Is it important to store the value or would a boolean be sufficient
   so that the peer knows that it's a replica?

3) sec. 6.5.2: transport protocol set for AppAttach
   Applications may require use of other transport protocols than
   those defined in OverlayLinkType (TLS/DTLS, what about plain UDP,
SCTP, DCCP, etc.),
   but currently, this seems to be not possible (do the considerations
of sec. 6.5.1.6
   apply here?).

4) sec. 6.3.4:
   How can a different signature algorithm be used if not all
implementations
   support it? There is no possibility to provide the feedback, that the
   signature algorithm is not acceptable at a particular node.

5) sec. 10.5: handling of parallel JOIN requests and use of peer_ready
   it is not clear what should happen if two JNs try to join at the same
   time at the same AP (can they be processed in parallel or should they
   be processed in sequence).
   Furthermore, when MUST/SHOULD a JN send peer_ready Update - in step 9?


Minor issues:
citations are put first, followed by comments starting with #

sec. 1.1
========
old:
   storage rather than for bulk storage of large objects.  Records are
   stored under numeric addresses which occupy the same space as node
new:
   storage rather than for bulk storage of large objects.  Records are
   stored under numeric addresses, called Resource-IDs, which occupy
   the same space as node

# Resource-IDs are used in 1.2.2 but not introduced

sec. 1.2
========
   Message Transport:  Handles end-to-end reliability, manages request
   ...

# What are the interactions with Forwarding and Link Management and
  the Topology Plugin

old:
   Forwarding and Link Management Layer:  Stores and implements the
      routing table by providing packet forwarding services between
      nodes.  It also handles establishing new links between nodes,
      including setting up connections across NATs using ICE.

# It may be confusing to say "routing table" here, since this
# is usually associated with the topology plugin. So IMHO
# it is the Connection Table, not the routing table.
# Furthermore, I propose the following change:
old:
      including setting up connections across NATs using ICE.
new:
      including setting up connections for overlay links across
      NATs using ICE.

old:
      directly between nodes.  TLS [RFC5246] and DTLS [RFC6347] are the
      currently defined "link layer" protocols used by RELOAD for hop-
new:
      currently defined "overlay link layer" protocols used by RELOAD
for hop-

# avoids confusion with the classic ISO/OSI link layer (layer 2)

old:
   In addition to the above components, nodes communicate with a central
new:
   In addition to the above components, nodes may communicate with a central

# while it may be the default case, it is not strictly required

sec. 1.3
========
old:
   RELOAD also provides an optional shared secret based admission
   control feature using shared secrets and TLS-PSK.  In order to form a
new:
   RELOAD also provides an optional shared secret based admission
   control feature using shared secrets and TLS-PSK/TLS-SRP.  In order to

# TLS-SRP should be mentioned here, too

sec. 2
======
old:
   Terms used in this document are defined inline when used and are also
new:
   Terms in this document are defined inline when used and are also

# avoid double "used"

old:
   Bootstrap Node:  A network node used by Joining Nodes to help locate
      the Admitting Peer.
new:
   Bootstrap Node:  A network node used by Joining Nodes to help accessing
      the overlay by forwarding messages to peers.

# The bootstrap node does not locate the Admitting Peer, but the JN locates
# the AP by routing a message to its own Resource-ID

old:
   Connection Table:  The set of nodes to which a node is directly
      connected, which include nodes that are not yet available for
      routing.
new:
   Connection Table:  Contains connection information for the set of
      nodes to which a node is directly connected, which include nodes
      that are not yet available for routing.

# it is a data structure ...

   Node-ID:  A value of fixed but configurable length that uniquely
      identifies a node.  Node-IDs of all 0s and all 1s are reserved and
      are invalid Node-IDs.  A value of zero is not used in the wire
      protocol but can be used to indicate an invalid node in
      implementations and APIs.  The Node-ID of all 1s is used on the
      wire protocol as a wildcard.

# what means "invalid" exactly? A wildcard can be used at least on
# the wire so it is not invalid being used as destination Node-IDs.
# So the above definition is slightly contradictory.

old:
   Responsible Peer:  The peer that is responsible for a specific
      resource, as defined by the plugin algorithm.
new:
   Responsible Peer:  The peer that is responsible for a specific
      resource, as defined by the topology plugin algorithm.

old:
   Usage:  An usage is the definition of a set of data structures (data
      Kinds) that an application wants to store in the overlay.  An
      usage may also define a set of network protocols (application IDs)
new:
   Usage:  A usage is the definition of a set of data structures (data
      Kinds) that an application wants to store in the overlay.  A
      usage may also define a set of network protocols (application IDs)

# Typo: 2x An usage -> A usage

sec. 3.1
========
old:
   o  To determine its position in the overlay topology (if the overlay
      is structured; topology plugins do not need to be structured).
new:
   o  To determine its position in the overlay topology (if the overlay
      is structured; overlays do not need to be structured).

# a structured topology plugin is not the same as a structured overlay

   o  To determine the set of resources for which the node is
      responsible.

# this isn't necessarily true for unstructured overlays?

old:
   The general principle here is that the security mechanisms (TLS at
new:
   The general principle here is that the security mechanisms ((D)TLS at

sec. 3.2
========
   entity.  From the perspective of a peer, a client is a node that has
   connected to the overlay, but has not yet taken steps to insert

# an additional reference to the Connection Table would be nice

sec. 3.2.1
==========
      clients that choose this option need to process Update messages

# Update messages are not introduced yet, so a forward reference to
# section 6.4.2.3 would be helpful

      performing an Attach.  A client wishing to connect using this
      mechanism with a certificate with multiple Node-IDs can use a Ping
      (Section 6.5.3) to probe the Node-ID of the node to which it is
      connected before doing the Attach (Section 6.5.1).

# At this point it is not really clear why this step is necessary.
# Certificates with multiple Node-IDs were not explained yet.
# Furthermore, the reference to Section 6.5.1 should be moved
# to the previous sentence.

sec. 3.3
========
old:
   This section will discuss the capabilities of RELOAD's routing layer,
new:
   This section discusses the capabilities of RELOAD's routing layer,

old:
   Resource-based routing:    RELOAD supports routing messages based
      solely on the name of the resource.  Such messages are delivered
new:
   Resource-based routing:    RELOAD supports routing messages based
      solely on the name of the resource or Resource-ID.  Such messages

old:
   Clients:    RELOAD supports requests from and to clients that do not
      participate in overlay routing, located via either of the
      mechanisms described above.
new:
   Clients:    RELOAD supports requests from and to clients that do not
      participate in overlay routing.

# the addition is not really necessary and may be confusing

old:
   Destination Lists:    While in principle it is possible to just
      inject a message into the overlay with a single Node-ID as the
new:
   Destination Lists:    While in principle it is possible to just
      inject a message into the overlay with a single Node-ID or
      a single Resource-ID as the

# Resource-ID is also a possible destination

old:
   The basic routing mechanism used by RELOAD is Symmetric Recursive.
new:
   The basic routing mode used by RELOAD is Symmetric Recursive Routing
(SRR,
   cf. Section 6.2)

# I would prefer to use the term "mode" (as on p. 28) and it should be
# consistent with the text in section 6.2

old:
   opaque ID X1 which maps internally to [A, B] (perhaps by being an
   encryption of [A, B] and forwards to Z with only X1 as the via list.
new:
   opaque ID X1 which maps internally to [A, B] (perhaps by being an
   encryption of [A, B]) and forwards to Z with only X1 as the via list.

# simple typo

old:
   RELOAD also supports a basic Iterative "routing" mode (where the
new:
   RELOAD also supports a basic Iterative routing mode (where the

old:
   Iterative "routing" is implemented using the RouteQuery method, which
new:
   Iterative routing is implemented using the RouteQuery method, which

old:
   requests this behavior.  Note that iterative "routing" is selected
new:
   requests this behavior.  Note that Iterative routing is selected

sec. 3.4
========

   pairs.  The result is a connection between A and B. At this point, A
   and B MAY send messages directly between themselves without going
   through other overlay peers.  In other words, A and B are on each
   other's connection tables.  They MAY then execute an Update process,

#  MAY is RFC 2119 terminology, which is defined in section 5
#  Besides Update process a Join process is also possible.

   order to support this case, some small number of "bootstrap nodes"
   typically need to be publicly accessible so that new peers can

# what "publicly accessible" means exactly is not defined
# typically need to be publicly accessible (i.e., not behind a NAT or
# firewall) ...

old:
   The second case is when a client connects to a peer at an arbitrary
   IP address, rather than to its responsible peer, as described in the
new:
   The second case is when a client connects to a peer at an arbitrary
   node-ID, rather than to its responsible peer, as described in the

# since responsible peer may depend on the overlay topology, node-ID
# seems to be a better fit here

sec. 3.5.2
==========
old:
   When a new peer wishes to join the Overlay Instance, it will need a
   Node-ID that it is allowed to use and a set of credentials which
new:
   When a new peer wishes to join the Overlay Instance, it needs a
   Node-ID that it is allowed to use and a set of credentials which

# not really sure about this change as non-native speaker

   match that Node-ID.  When an enrollment server is used, the Node-ID
   used is the Node-ID found in the certificate received from the

# the mode with self-signed certificates is missing and should be
# mentioned also here

   "bootstrap node".  Because this is the first connection the peer
   makes, these nodes will need public IP addresses so that they can be

# may also work if the bootstrap node is directly reachable, e.g.,
# in the same domain
# in this paragraph and the following paragraph "Once a peer"
# is used three times

   past adjacencies which have public IP address and attempt to use them

# inconsistent use of term "adjacencies"/adjacent within the document
# different meanings as follows:
#  1.) all directly connected nodes (i.e., all nodes in the Connection
Table), e.g., sec. 3.5.2  and 6.4.2.3
#  2.) all peers in the routing table
#  3.) adjacent according to the overlay topology, e.g. sec. 6.4.2.1

sec. 4
======
   limits on size, on the values which may be stored.  For many Kinds,
   the set may be restricted to a single value; some sets may be allowed

# what set?

sec. 4.1.2.
===========
old:
   responsibility if the responsible peer fail [Chord].
new:
   responsibility if the responsible peer fails [Chord].

sec. 6
======
old:
   messages.  We then describe the symmetric recursive routing model,
   which is RELOAD's default routing algorithm.  We then define the
new:
   messages.  We then describe the symmetric recursive routing mode,
   which is RELOAD's default routing mode.  We then define the

# IMHO the term mode fits best, the routing algorithm is defined
# within the topology plugin

sec. 6.1.
=========
old:
   peer SHOULD generate an appropriate error but local policy can
new:
   peer SHOULD generate an appropriate error message but local policy can

   Once the peer has determined that the message is correctly formatted

# what does "correctly formatted" mean exactly?

sec. 6.1.1.
===========
      this node so it MUST verify the signature as described in
      Section 7.1 and MUST pass it up to the upper layers.  "Upper
      layers" is used here to mean the components above the "Overlay
      Link Service Boundary" line in the figure in Section 1.2.

# this is somewhat confusing. The text describes what the
# Forwarding and link management component does, but what
# other components are meant here?

      state, e.g., by unpacking any opaque IDS.

# I think any is incorrect, since other IDs are
# not inserted by this node and so it cannot "unpack" those,
# but only its own opaque IDs

sec. 6.1.2.
===========
   the first entry on the destination list is in the peer's connection
   table, then it MUST forward the message to that peer directly.

# This is probably motivated by clients. A hint to this fact
# may help.

old:
   destination list, it would detect that I is a opaque ID, recover the
new:
   destination list, it would detect that I is an opaque ID, recover the

# Typo fix

old:
   called List Compression.  Possibilities for a opaque ID include a
new:
   called List Compression.  Possibilities for an opaque ID include a
# Typo fix

   An intermediate node receiving a request from another node MUST
   return a response to this request with a destination list equal to
   the concatenation of the Node-ID of the node that sent the request
   with the via list in the request.  The intermediate node normally

# 1.) unclear why an _intermediate_ peer should return a response
#     (if it is not the destination node),
# 2.) the via list must be reversed before concatenating the Node-ID
# so:
old:
   with the via list in the request.  The intermediate node normally
new:
   with the reversed via list in the request.  The intermediate node
normally

sec. 6.1.3.
===========
old:
   compressed via list), the peer MUST replace that entry with the
   original via list that it replaced and then re-examine the
new:
   compressed via list), the peer MUST replace that entry with the
   reversed original via list that it replaced and then re-examine the

# the via list must be reversed for responses...

sec. 6.2.
=========

old:
   This Section defines RELOAD's Symmetric Recursive Routing (SRR)
   algorithm, which is the default algorithm used by nodes to route
   messages through the overlay.  All implementations MUST implement
   this routing algorithm.  An overlay MAY be configured to use
   alternative routing algorithms, and alternative routing algorithms
   MAY be selected on a per-message basis.  I.e., a node in an overlay
   which supports SRR and some other routing algorithm called XXX might
   use SRR some of the time and XXX some of the time.
new:
   This Section defines RELOAD's Symmetric Recursive Routing (SRR)
   mode, which is the default mode used by nodes to route
   messages through the overlay.  All implementations MUST implement
   this routing mode.  An overlay MAY be configured to use
   alternative routing modes, and alternative routing modes
   MAY be selected on a per-message basis.  I.e., a node in an overlay
   which supports SRR and some other routing mode called XXX might
   use SRR some of the time and XXX some of the time.

# better use mode for consistency (algorithm is contained in the
# topology plugin

sec. 6.2.1.
===========
old:
   node MAY also construct a more complicated destination list for
   source routing.
new:
   node MAY also construct a more complicated destination list for
   (loose) source routing.

   Once the message is constructed, the node sends the message to some
   adjacent peer.  If the first entry on the destination list is
   directly connected, then the message MUST be routed down that
   connection.  Otherwise, the topology plugin MUST be consulted to
   determine the appropriate next hop.

#  adjacent peer should be adjacent node, since it may be a client
#  directly connected means: a valid entry in the Connection Table
#  exists? this should be mentioned here...

sec. 6.2.2.
===========
# a hint that the same Transaction-ID as in the request MUST be used
# could be added.

sec. 6.3.1.1.
=============
   Unless a given structure that uses a select explicitly allows for
   unknown types in the select, any unknown type SHOULD be treated as an

# How is that allowance specified in the specification? Is it by the comment
# /* This structure can be extended */

sec. 6.3.2.
===========
      receive message with a TTL greater than the current value of
      initial-ttl (or the 100 default) MUST discard the message and send
      an "Error_TTL_Exceeded" error.

# what if the initial-ttl is larger than 100 and the TTL is >100 but
# < initial-ttl? The condition "or the 100 default" holds

old:
      used to indicate the fragment offset; see Section 6.7.
new:
      used to indicate the fragment offset in bytes; see Section 6.7.

old:
   length:  The count in bytes of the size of the message, including the
      header.
new:
   length:  The count in bytes of the size of the whole unfragmented
message,
   including the header.

# as already mentioned in the beginning this should be more precise


      destinations which the message should pass through.  The
      destination list is constructed by the message originator.  The

# is it allowed that intermediate peers add destinations? if not, please
# state so

old:
      next.  The list shrinks as the message traverses each listed peer.
new:
      next.  The list may shrink as the message traverses each listed peer.

# it need not be always the case that the list shrinks with each
traversed peer

sec. 6.3.2.2.
=============
old:
   structure with a DestinationType of opaque_id_type and a opaque_id
new:
   structure with a DestinationType of opaque_id_type and an opaque_id

# typo fix

old:
   opaque
      A compressed list of Node-IDs and an eventual Resource-ID.
      Because this value was compressed by one of the peers, it is only
      meaningful to that peer and cannot be decoded by other peers.
      Thus, it is represented as an opaque string.

   resource
      The Resource-ID of the resource which is desired.  This type MUST
      only appear in the final location of a destination list and MUST
      NOT appear in a via list.  It is meaningless to try to route
      through a resource.
new:
   resource
      The Resource-ID of the resource which is desired.  This type MUST
      only appear in the final location of a destination list and MUST
      NOT appear in a via list.  It is meaningless to try to route
      through a resource.

   opaque_id_type
      A compressed list of Node-IDs and an eventual Resource-ID.
      Because this value was compressed by one of the peers, it is only
      meaningful to that peer and cannot be decoded by other peers.
      Thus, it is represented as an opaque string.

# 1.) match the order in the select
# 2.) it must be opaque_id_type, not opaque

sec. 6.3.2.3
============
   flags
      Three flags are defined FORWARD_CRITICAL(0x01),
      DESTINATION_CRITICAL(0x02), and RESPONSE_COPY(0x04).  These flags
      MUST NOT be set in a response.  If the FORWARD_CRITICAL flag is

# What is the correct reaction if these flags are set in a response?
# (returning an Error_Invalid_Message or ignore?)

sec. 6.3.3.1
============
old:
   A node processing a request MUST return its status in the
   message_code field.  If the request was a success, then the message
new:
   A node processing a request MUST return its status in the
   message_code field of a response.  If the request was a success, then
the message

# clarification?

   Error_Request_Timeout:  A response to the request has not been
      received in a suitable amount of time.  The requesting node MAY
      resend the request at a later time.

# not clear when this will ever be used. Which node should send
# this error message?

   All RELOAD messages MUST be signed.  Intermediate nodes do not verify
   signatures.  Upon receipt (and fragment reassembly if needed) the
   destination node MUST verify the signature and the authorizing
   certificate.  If the signature fails, the implementation SHOULD
   simply drop the message and MUST NOT process it.  This check provides

# What happens if none {0,0} is given? Then no Node-ID is present...

sec. 6.4.2.
===========
What happens if these messages (like Join, Leave) are accidentally sent
to a Client?
Do the send an Invalid Message back?

   A new peer (but one that already has credentials) uses the JoinReq
   message to join the overlay.  The JoinReq is sent to the responsible
   peer depending on the routing mechanism described in the topology

Is the destination address now the Resource-ID or the Node-ID
of the "responsible peer"?

   Because joins may only be executed between nodes which are directly
   adjacent, receiving peers MUST verify that any JoinReq they receive

# 1.) should be peers rather than nodes
# 2.) directly adjacent means here: directly adjacent in the overlay
# topology (could otherwise be misunderstood as being directly
# connected)
# 3.) what must happen if the verification fails?

   adjacent, receiving peers MUST verify that any LeaveReq they receive
   arrives from a transport channel that is bound to the Node-ID to be

# what happens if that verification fails?

old:
   assumed by the leaving peer.)  This also prevents replay attacks
new:
   assumed by the leaving peer.  This also prevents replay attacks

# Typo fix

sec. 6.4.2.3
============
   the state change.  In general, peers send Update messages to all
   their adjacencies whenever they detect a topology shift.

# A hint to the Connection Table and Clients would clarify

sec. 6.4.2.4.
=============
old:
   X. A RouteQuery can also request that the receiving peer initiate an
new:
   X. A RouteQuery can also request that the receiving peer initiates an

old:
   One important use of the RouteQuery request is to support iterative
   routing.  The sender selects one of the peers in its routing table

# add reference to Section 3.3.

sec. 6.4.2.4.1
==============
   destination
      The destination which the requester is interested in.  This may be
      any valid destination object, including a Node-ID, opaque ID, or
      Resource-ID.

# Does opaque ID make sense here?

sec. 6.5.1
==========
   A node sends an Attach request when it wishes to establish a direct
   TCP or UDP connection to another node for the purpose of sending

# TCP/TLS or DTLS?

old:
   node A has Attached to node B, but not received any Updates from B,
new:
   node A has attached to node B, but not received any Updates from B,

# Typo

   channel but MUST NOT route messages through B to other peers via that
   channel.  The process of Attaching is separate from the process of
# Is that also true for clients?

sec. 6.5.1.1
============
old:
        } AttachReqAns;

   The values contained in AttachReqAns are:

new:
        } AttachReq;

   The values contained in AttachReq are:

old:
      A single AttachReqAns MUST NOT include both candidates whose
new:
      A single AttachReq MUST NOT include both candidates whose

# consistency!

sec. 6.5.1.2.
=============
old:
6.5.1.2.  Response Definition

new:
6.5.1.2.  Response Definition

The AttachAns message hast the same format as the AttachReq message.


#s/AttachReqAns/AttachAns/ in the whole paragraph.

sec. 6.5.1.3.
=============

   An agent follows the ICE specification as described in [RFC5245] with

# agent was not defined in the RELOAD context so far.

sec. 6.5.4.2.
=============
old:
   o  The configuration document is correctly digitally signed (see
      Section 11 for details on signatures.
new:
   o  The configuration document is correctly digitally signed (see
      Section 11 for details on signatures).

old:
   one listed in the current configuration file).  Details on kind-
   signer field in the configuration file is described in Section 11.1.
new:
   one listed in the current configuration file).  Details on kind-
   signer field in the configuration file are described in Section 11.1.

# typos

sec. 6.6.2
==========
old:
   Each connection has it own sequence number space.  Initially the
new:
   Each connection and direction has it own sequence number space.
Initially the

sec. 6.6.3.1
============
   A node MUST NOT have more than one unacknowledged message on the DTLS
   connection at a time.  Note that because retransmissions of the same
   message are given new sequence numbers, there may be multiple
   unacknowledged sequence numbers in use.

# Since retransmissions violate the first sentence, it may be better
# to use:
old:
   A node MUST NOT have more than one unacknowledged message on the DTLS
   connection at a time.  Note that because retransmissions of the same
new:
   A node MUST NOT have more than one unacknowledged message on the DTLS
   connection at a time, except for retransmissions. Note that because

   from the routing table.  The link MAY be restored to the routing
   table if ACKs resume before the connection is closed, as described

# this should read Connection Table twice?

sec. 6.7
=========
   and fragments it, each fragment has a full copy of the Forwarding

# 1.) clarification would be good that the length field is covering the
# total msg length and which field are different in the "copies"
# (at least the fragment field)
# 2.) what happens if overlapping fragments are received?

[**CONT**]

sec. 7
======
      data may be stored in a single transaction, rather than querying
      for the value of a counter before the actual store.

# a reader may wonder which counter is meant here - slightly confusing.
# I assume that the text assumes a hypothetical different design that
# uses counters as an alternative to timestamps?

      If a node attempting to store new data in response to a user
      request (rather than as an overlay maintenance operation such as
      occurs when healing the overlay from a partition) is rejected with
      an Error_Data_Too_Old error, the node MAY elect to perform its
      store using a storage_time that increments the value used with the
      previous store.

# I don't understand what "using a storage_time that increments the
# value used with the previous store" actually means. In this case
# it is assumed that the requesting node has already the storage_time
# of the previous store available or must it send a StatReq first?
# In the former case it could simply check if localtime > storage_time
# before sending the request. By what amount should the value be
incremented?


sec. 7.4
========
old:
   o  Store values in the overlay
   o  Fetch values from the overlay
   o  Stat:  get metadata about values in the overlay
   o  Find the values stored at an individual peer
new:
   o  Store: store values in the overlay
   o  Fetch: get values from the overlay
   o  Stat: get metadata about values in the overlay
   o  Find: get values stored at an individual peer

# just consistency

sec. 7.4.1.1
============
   which represents a sequence of stored values for a given Kind.  The
   same Kind-ID MUST NOT be used twice in a given store request.  Each

# what is the proper reaction if this condition is violated by a sending
node?

   replica_number
      The number of this replica.  When a storing peer saves replicas to
      other peers each peer is assigned a replica number starting from 1
      and sent in the Store message.  This field is set to 0 when a node
      is storing its own data.  This allows peers to distinguish replica
      writes from original writes.

# must it be "When a storing (responsible) peer saves replicas to"?
# IMHO this description is too vague, since
# as stated in 2) at the beginning of this mail it is not clear
# how to assign the replica numbers when storing replicas.
# Let's assume responsible peer X stores replicas at neighbors
# A and C: gets A replica number 1 and C replica number 2?
# if node B replaces neighbor C, does B get replica number 3 or number 2?
# if the responsible peer X is replaced by a new node Y, Y will get
# get the data from peer X, but how should X change its replica number?
# Can Y simply start sending replicas beginning at 1?
# I couldn't find any place in the document where it would make a
# difference between replica number 2 and replica number 42

old:
   kind
      The Kind-ID.  Implementations MUST reject requests corresponding
      to unknown Kinds.
new:
   kind
      The Kind-ID.  Implementations MUST reject (Error_Unknown_Kind)
      requests corresponding to unknown Kinds.

# hint on the correct error message for this check

old:
   values
      The value or values to be stored.  This may contain one or more
      stored_data values depending on the data model associated with
new:
   values
      The value or values to be stored.  This may contain one or more
      StoredData values depending on the data model associated with

# consistency

old:
   o  The signatures over each individual data element (if any) are
new:
   o  The signatures over each individual StoredData element (if any) are
# consistency

   o  For original (non-replica) stores, the peer MUST check that if the
      generation counter is non-zero, it equals the current value of the
      generation counter for this Kind.  This feature allows the
      generation counter to be used in a way similar to the HTTP Etag
      feature.

# what is the proper reaction in case the check fails? A reference to the
# mentioned HTTP Etag feature would also be nice.

   o  The storage time values are greater than that of any value which
      would be replaced by this Store.
   o  The size and number of the stored values is consistent with the
      limits specified in the overlay configuration.

# what is the reaction in case the checks fail?

   If all these checks succeed, the peer MUST attempt to store the data

# this seems to suggest that the list is complete, though more checks are
# describe in the following section 7.4.1.2. Maybe it's more consistent
# to have them described in one place.

sec.7.4.1.2.
=============

   replicas
      The list of other peers at which the data was/will be replicated.
      In overlays and applications where the responsible peer is
      intended to store redundant copies, this allows the storing node
      to independently verify that the replicas have in fact been
      stored.  It does this verification by using the Stat method (see
      Section 7.4.3).  Note that the storing node is not required to
      perform this verification.

# what is meant by the term "storing node"? Is this the responsible node
# who actually stores the data or is it the node who originated the
# StoreReq? This may be inconsistent with the rest of the document.
# Moreover, it is not clear that this implies a certain order. Must
# the responsible node having successfully finished storing the replicas
# before returning the replicas?

   If any type of request tries to access a data Kind that the peer does
   not know about, an Error_Unknown_Kind MUST be generated.  The

# The reaction is defined differently for StoreReq, FetchReq, StatReq
# and FindReq:
# - StoreReq: send back Error_Unknown_Kind
# - FetchReq: "Implementations SHOULD reject requests corresponding to
#              unknown Kinds unless specifically configured otherwise."
#   -> what means "reject" exactly?
# - StatReq: no hint given how to behave
# - FindReq: "If a Kind-ID is not known, then the corresponding
Resource-ID MUST be
   0." -> Cannot distinguish from the case where a Resource-ID is not known:
#      "The closest Resource-ID to the specified Resource-ID.  This is 0
#       if no Resource-ID is known."
#
# Is the intention to trigger Configuration update requests?
# cf. StoreReq: A node which
#   receives this error MUST generate a ConfigUpdate message which
#   contains the appropriate Kind definition (assuming that in fact a
#   Kind was used which was defined in the configuration document).

sec. 7.4.1.3
=============

   remove a value, the owner stores a new DataValue with "exists" set to
   False:

      exists = False
      value = {} (0 length)

# what happens if a value is given?
# Ignore the value or reject with an error invalid message?

sec. 7.4.2.2.
=============
old:
           uint64                 generation;
new:
           uint64                 generation_counter;
and
old:
   generation
      the generation counter for this Kind.
new:
   generation_counter
      the generation counter for this Kind.

# consistency

sec. 7.4.4.1:
=============

   kinds
      The desired Kind-IDs.  Each value MUST only appear once, and if
      not the request MUST be rejected with an error.

# What is the error code?

sec. 7.4.4.2:
=============
   The response is simply a series of FindKindData elements, one per
   Kind, concatenated end-to-end.  The contents of each element are:

# unclear what end-to-end means here?

sec. 10
=======
   protocol.  A short list of differences:

# A further difference is the fact that CHORD uses hashes of IP addresses
# while in RELOAD nodes get assigned Node-IDs and the routing table
# does not contain "underlay addresses"

sec. 10.1
=========
old:
   and n+1, with all arithmetic being done modulo 2^{k}, where k is the
   length of the Node-ID in bits, so that node 2^{k} - 1 is directly
new:
   and n+1, with all arithmetic being done modulo 2^(k), where k is the
   length of the Node-ID in bits, so that node 2^(k) - 1 is directly

# consistency with other formulas...

sec. 10.3
=========
   If a peer is not responsible for a Resource-ID k, but is directly
   connected to a node with Node-ID k, then it MUST route the message to

# what means "directly connected" here? Is it an entry in the
ConnectionTable
# or RoutingTable? I guess the former in order to cover routing to Clients

sec. 10.5
=========
   1.  JN MUST connect to its chosen bootstrap node.

# it is unclear what "connect" exactly means. This is actually
# described in Sec. 11.4:
#  "When contacting a bootstrap node, the joining node MUST first form
#   the DTLS or TLS connection to the bootstrap node and then sends an
#   Attach request over this connection with the destination Node-ID set
#   to the joining node's Node-ID."
# So my suggestion is to explain it a little but more or to put a forward
# reference to section 11.4.

   2.  JN SHOULD send an Attach request to the admitting peer (AP) for
       Node-ID n.  The "send_update" flag can be used to acquire the
       routing table for AP.

# This is inconsistent with Section 12 stating
# "JN then sends an Attach through that peer to a resource ID of itself
(JN)."
# (Node ID n vs. Resource-ID n)
#
# Moreover, it is not clear how the JN finds the AP. This is explained in
#Section 12:
#   to one of the bootstrap nodes.  JN then sends an Attach through that
#   peer to a resource ID of itself (JN).  It gets routed to the
#   admitting peer (AP) because JN is not yet part of the overlay.  When
# I suggest to change it to:
   2.  JN SHOULD send an Attach request to Resource-ID of itself (JN) in
       order to contact the Admitting peer (AP) for Node-ID n.
       The "send_update" flag can be used to acquire the
       routing table from AP.
# Question: should it be "acquire the routing table of AP" or "acquire the
# routing table from AP"?

   3.  JN SHOULD send Attach requests to initiate connections to each of
       the peers in the neighbor table as well as to the desired finger
       table entries.  Note that this does not populate their routing
       tables, but only their connection tables, so JN will not get
       messages that it is expected to route to other nodes.

# here it is unclear that the Attach requests are sent via the AP
# and what "desired" finger table entries means, e.g., in contrast to all?

# why are we having two SHOULDs instead of two MUSTs, since 10.7 states:
#    A peer MUST maintain an association (via Attach) to every member of
#   its neighbor set.  A peer MUST attempt to maintain at least three


   In order to set up its i'th finger table entry, JN MUST send an
   Attach to peer n+2^(128-i).  This will be routed to a peer in

# instead of peer n+2^(128-i) better resource-ID n+2^(128-i)?

# How should an AP react if it detects that a JN sent
# the Join to the wrong AP (not responsible at all or not
# anymore)?

sec. 10.7.
==========
   A peer MUST maintain an association (via Attach) to every member of
   its neighbor set.  A peer MUST attempt to maintain at least three

# why only to the neighbor set and not the whole routing table?

Sec. 10.7.1:
============
   Every time a connection to a peer in the neighbor table is lost (as
   determined by connectivity pings or the failure of some request), the

# I think that "connectivity pings" means periodically sent Pings to all
# members of the Connection Table by the Link Management Layer as
# mentioned in sec. 6.5. A reference may help here.
# However, I couldn't find out how often such "connectivity pings" should
# be sent (the chord-ping-interval is only related to pinging finger table
# entries).

# It is unclear to me why (sec. 10.7.1) the spec states:
#  "peer MUST remove the entry from its neighbor table and replace it
#   with the best match it has from the other peers in its routing table.
#   If using reactive recovery, it then sends an immediate Update to all
#   nodes in its Neighbor Table."
#  so the peer replaces the failed entry with one of its fingers,
#  which is maybe a poor choice, and sends this information to
#  its neighbors. IMHO it makes much more sense to request routing
#  information from the neighbors in order to get a better replacement
#  for the lost peer, instead of "confusing" the neighbors with more
#  inaccurate routing information.

   If the neighbor failure affects the peer's range of responsible IDs,
   then the Update MUST be sent to all nodes in its Connection Table.

# a short hint that it's the Connection Table instead of the Neighbor Table
# due to Clients would be nice

   If connectivity is lost to all successor peers in the neighbor table,
   then this peer SHOULD behave as if it is joining the network and MUST
   use Pings to find a peer and send it a Join.  If connectivity is lost

# Only Join or Attach and Join?

Sec. 10.7.2.:
=============
   If a finger table entry is found to have failed, all references to

# how is it determined to have "failed"? 10.7.1 is only
# related to neighbor failures...does the same definition
# apply here?

Sec. 10.7.3.:
=============
   If the neighbor failure affects the peer's range of responsible IDs,
   then the Update MUST be sent to all nodes in its Connection Table.

# add hint that it's Connection Table due to Clients

   interval" element (denominated in seconds.)  A peer SHOULD randomly
   offset these Update requests so they do not occur all at once.

# This is a little bit too imprecise.
# How should this be done? A typical mechanism
# is using a randomly chosen offset from an interval [0.5*Tp,1.5*Tp]
# where Tp is the target period, cf.
# S. Floyd, V. Jacobson, „The Synchronization of Periodic Routing
# Messages“, IEEE/ACM Transactions on Networking, Vol. 2, No. 2, April,
# 1994, http://portal.acm.org/citation.cfm?id=187045

Sec: 10.7.4.2.:
==============
   A peer SHOULD NOT send Ping requests looking for new finger table
   entries more often than the configuration element "chord-ping-
   interval", which defaults to 3600 seconds (one per hour).

# This paragraph should probably moved some paragraphs down as it
# has been stated yet that a peer should actually send Ping requests
# at all (this is explained in the subsequent paragraphs).

Sec 12:
========
# throughout the figures: "Update" should be "UpdateReq"
# and Attach should be AttachReq?
# Otherwise it's not consistent with the use of Attach vs.
AttachReq/AttachAns

Neighbor Table and Connection Table are written sometimes
in lower case and sometimes in upper case style. I'm sure that
the RFC editor will ask for consistency in writing...

That's all for now...

Regards,
 Roland

<Prev in Thread] Current Thread [Next in Thread>