ietf
[Top] [All Lists]

Gen-ART Last Call review of draft-ietf-avt-rtp-vorbis-06

2007-10-26 14:32:31
Hi, Luca,

I have been selected as the General Area Review Team (Gen-ART)
reviewer for this draft (for background on Gen-ART, please see
http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).

Please resolve these comments along with any other Last Call comments
you may receive.

Thanks,

Spencer

Document: draft-ietf-avt-rtp-vorbis-06
Reviewer: Spencer Dawkins
Review Date: 26 Oct 2007 (sorry, late!)
IETF LC End Date:   22 Oct 2007
IESG Telechat date:

Summary: This document is close to ready for publication as a Proposed Standard. I have a small number of questions, mostly involving clarity or 2119 language. Please take a look at the question in 7.1, especially.

Comments: I have included "nits" in this review for the convenience of authors and editors later in the process. Nits are not part of a Gen-ART review.

1.  Introduction

  Vorbis is a general purpose perceptual audio codec intended to allow
  maximum encoder flexibility, thus allowing it to scale competitively
  over an exceptionally wide range of bitrates.  At the high quality/
  bitrate end of the scale (CD or DAT rate stereo, 16/24 bits), it is
  in the same league as AAC.  Vorbis is also intended for lower and

Spencer (nit): AAC? has not been expanded previously.

  higher sample rates (from 8kHz telephony to 192kHz digital masters)
  and a range of channel representations (monaural, polyphonic, stereo,
  quadraphonic, 5.1, ambisonic, or up to 255 discrete channels).

2.2.  Payload Header

  Vorbis Data Type (VDT): 2 bits

  This field specifies the kind of Vorbis data stored in this RTP
  packet.  There are currently three different types of Vorbis
  payloads.  Each packet MUST contain only a single type of Vorbis
  payload (e.g. you MUST not aggregate configuration and comment

Spencer: this is close to a nit, but 2119 language is important. This is just restating the previous MUST. I'd suggest either "must" in lower case or "MUST NOT" - if there's a reason to have two 2119 requirements that say the same thing.

  payload in the same packet)

     0 = Raw Vorbis payload
     1 = Vorbis Packed Configuration payload
     2 = Legacy Vorbis Comment payload
     3 = Reserved

  The packets with a VDT of value 3 MUST be ignored

  The last 4 bits represent the number of complete packets in this
  payload.  This provides for a maximum number of 15 Vorbis packets in
  the payload.  If the packet contains fragmented data the number of
  packets MUST be set to 0.

Spencer (nit): what type of fragmentation is this? Please provide an adjective :-)

2.3.  Payload Data

  The Vorbis packet length header is the length of the Vorbis data
  block only and does not count the length field.

Spencer (nit): s/count/include/, I think.

  The payload packing of the Vorbis data packets MUST follow the
  guidelines set-out in [3] where the oldest packet occurs immediately

Spencer: again, adjectives are good. This is saying "the oldest Vorbis packet", right? It would be better if the specification doesn't use language like "the oldest packet in the packet" with no adjectives - that doesn't take us to a good place.

  after the RTP packet header.  Subsequent packets, if any, MUST follow
  in temporal order.

Spencer: "Subsequent Vorbis packets", right? what does the receiver do if the "follow in temporal order" MUST is violated?

3.1.1.  Packed Configuration

  A Vorbis Packed Configuration is indicated with the Vorbis Data Type
  field set to 1.  Of the three headers defined in the Vorbis I
  specification [10], the identification and the setup MUST be packed
  as they are, while the comment header MAY be replaced with a dummy
  one.  The packed configuration follows a generic way to store xiph
  codec configurations: The first field stores the number of the
  following packets minus one (count field), the next ones represent
  the size of the headers (length fields), the headers immediately
  follow the list of length fields.  The size of the last header is
  implicit.  The count and the length fields are encoded using the
  following logic: the data is in network order, every byte has the

Spencer (nit): c/network order/network byte order/, and there are multiple occurrences in the document...

  most significant bit used as flag and the following 7 used to store
  the value.  The first N bit are to be taken, where N is number of
  bits representing the value modulo 7, and stored in the first byte.
  If there are more bits, the flag bit is set to 1 and the subsequent
  7bit are stored in the following byte, if there are remaining bits
  set the flag to 1 and the same procedure is repeated.  The ending
  byte has the flag bit set to 0.  In order to decode it is enough to
  iterate over the bytes until the flag bit set to 0, for every byte
  the data is added to the accumulated value multiplied by 128.  The
  headers are packed in the same order they are present in ogg:
  identification, comment, setup.

3.2.  Out of Band Transmission

  This section, as stated above, does not cover all the possible out-
  of-band delivery methods since they rely on different protocols and
  are linked to specific applications.  The following packet definition
  SHOULD be used in out-of-band delivery and MUST be used when

Spencer: is there an obvious reason to violate the SHOULD?

  Configuration is inlined in the SDP.

5.1.  Example Fragmented Vorbis Packet

  The Fragment type field is set to 2 and the number of packets field
  is set to 0.  For large Vorbis fragments there can be several of
  these type of payload packets.  The maximum packet size SHOULD be no

Spencer (nit): s/these type/this type/?

Spencer: why is this a SHOULD?

  greater than the path MTU, including all RTP and payload headers.
  The sequence number has been incremented by one but the timestamp
  field remains the same as the initial packet.

5.2.  Packet Loss

  As there is no error correction within the Vorbis stream, packet loss
  will result in a loss of signal.  Packet loss is more of an issue for
  fragmented Vorbis packets as the client will have to cope with the
  handling of the Fragment Type.  In case of loss of fragments the
  client MUST discard all the remaining fragments and decode the

Spencer (nit) - "remaining Vorbis fragments" and "incomplete Vorbis packet"?

  incomplete packet.  If we use the fragmented Vorbis packet example
  above and the first packet is lost the client MUST detect that the

Spencer (nit) - "and the first RTP packet is lost"?

  next packet has the packet count field set to 0 and the Fragment type
  2 and MUST drop it.  The next packet, which is the final fragmented
  packet, MUST be dropped in the same manner.  If the missing packet is
  the last, the received two fragments will be kept and the incomplete
  vorbis packet decoded.

6.  IANA Considerations

     configuration-uri:  the URI [4] of the configuration headers in
        case of out of band transmission.  In the form of
        "protocol://path/to/resource/", depending on the specific

Spencer: isn't this "scheme://path/to/resource"?

        method, a single configuration packet could be retrived by its
        Ident number, or multiple packets could be aggregated in a
        single stream.  Non hierarchical protocols MAY point to a
        resource using their specific syntax.

7.1.  Mapping Media Type Parameters into SDP

  The information carried in the Media Type media type specification
  has a specific mapping to fields in the Session Description Protocol
  (SDP) [5], which is commonly used to describe RTP sessions.  When SDP
  is used to specify sessions the mapping are as follows:

Spencer: is there anything you can say about Receiver behavior when the Media Type and SDP don't map correctly?

7.1.1.  SDP Example

  The following example shows a basic SDP single stream.  The first
  configuration packet is inlined in the sdp, other configurations

Spencer (nit): it would be great if SDP, URI etc. were consistently capitalized.

  could be fetched at any time from the first provided uri using or all

Spencer: is this "the URI currently in use"? but the sentence doesn't parse as written.

  the known configuration could be downloaded using the second uri.
  The inline base64 [9] configuration string is trimmed because of the

Spencer: is this "is folded in this example due to RFC line length limitations"?

  length.
     c=IN IP4 192.0.2.1
     m=audio RTP/AVP 98
     a=rtpmap:98 vorbis/44100/2
     a=fmtp:98 delivery-method=inline;
     configuration=AAAAAZ2f4g9NAh4aAXZvcmJpcwA...; delivery-
     method=out_band; configuration-uri=rtsp://path/to/the/resource;
     delivery-method=out_band;
     configuration-uri=http://another/path/to/resource/;

  Note that the payload format (encoding) names are commonly shown in
  upper case.  Media Type subtypes are commonly shown in lower case.
  These names are case-insensitive in both places.  Similarly,
  parameter names are case-insensitive both in Media Type types and in
  the default mapping to the SDP a=fmtp attribute.  The exception
  regarding case sensitivity is the configuration-uri URI which MUST be
  regarded as being case sensitive.  The a=fmtp line is a single line

Spencer: "shown as multiple lines in this document for clarity"?

  even if it is presented broken because of clarity.

7.2.  Usage with the SDP Offer/Answer Model

  The only paramenter negotiable is the delivery method.  All the

Spencer (nit): c/paramenter negotiable/negotiable parameter/

  others are declarative: the offer, as described in An Offer/Answer
  Model Session Description Protocol [8], may contain a large number of
  delivery methods per single fmtp attribute, the answerer MUST remove
  every delivery-method and configuration-uri not supported.  All the
  parameters MUST not be altered on answer otherwise.

8.  Congestion Control

  Vorbis clients SHOULD send regular receiver reports detailing

Spencer: is there a well-understood definition of "regular" within this community?

  congestion.  A mechanism for dynamically downgrading the stream,
  known as bitrate peeling, will allow for a graceful backing off of
  the stream bitrate.  This feature is not available at present so an
  alternative would be to redirect the client to a lower bitrate stream
  if one is available.

9.1.  Stream Radio

  This is one of the most common situation: one single server streaming
  content in multicast, the clients may start a session at random time.
  The content itself could be a mix of live stream, as the wj's voice,

Spencer (nit): "wj's"? please spell this out (I'm guessing what this means)

and stored streams as the music she plays.


_______________________________________________
Ietf mailing list
Ietf(_at_)ietf(_dot_)org
https://www1.ietf.org/mailman/listinfo/ietf

<Prev in Thread] Current Thread [Next in Thread>