RE: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

2014-10-30 10:06:33
Michael,

Thank you for the comprehensive response.  I'll summarize things here,
as opposed to adding to the extensive inline discussion.  At a high level,
this is progress, as I'm fine with the proposals for issues [A], [E], [G]
and all the nits.  That leaves major issues [B], [C] and [D], plus
minor issue [F].

Minor issue [F] was intended solely as a request to add citations - it
looks like a lot more was read into it beyond what I originally
intended.

--- Major Issues ---

-- [A] Encoding algorithm

The proposed action looks reasonable; I'll review the text when it appears.

-- [B] G.711.0 Frame Format

[B] The G.711.0 frame format is not specified here, making it very difficult
to figure out what's going on when G.711.0 frames are concatenated.  A
specific example is that the concept of a "prefix code" that occurs at the
start of a G.711.0 frame is far too important to be hidden in step H5 of the
decoding algorithm in Section 4.2.3.

We welcome comments on how to improve this section, as it is complicated. We
did attempt to describe only what is necessary for understanding.

The problem is not located in 4.2.3 - I think that the G.711.0 frame format
should have been explained earlier, e.g., so that the notion of "prefix code"
is not a surprise to the reader.  Please add an overview of the frame
format generated by the G.711.0 encoder, and assumed by the G.711.0 decoder
(including the "prefix code") somewhere in section 3.

-- [C] SDP ptime

[C] The discussion of use of the SDP ptime parameter is spread out and
imprecise (is SDP REQUIRED?, when is ptime REQUIRED, RECOMMENDED, or
recommended? - it's not obvious).

Proposed Action: The discussion of ptime (and the channels parameter) in
this section is primarily for the purpose of a check. If it is any comfort,
that paragraph has had lots of input to it previously (so you responded to a
complicated issue). And since we have no need to describe "ptime issues" or
session negotiation issues prior to this point (Section 4.2.4) in the document
AND ptime isn't a required negotiation parameter AND we put a forward
reference to Section 5.1 for  "ptime" when SDP is used, I hesitate to mention
such an optional parameter here in Section 3.
Proposed action: No Change (the forward references are enough).

That does not address the concern.  The concern is "Text on ptime is spread
out and imprecise" and in particular the implementation requirements
are unclear. 

The explanation below of why the text is spread out, imprecise and unclear
will not help make the draft clearer to other readers, sorry :-).  I'm not
convinced that this aspect of the draft is interoperably implementable as
currently written.

[D] Backwards compatibility.

The problem here is that it's not clear that negotiation (e.g., via SDP) is
required.  This sentence in Section 3.1 is a particular problem:

   G.711.0, being both lossless and stateless, may also be employed as a
   lossless compression mechanism anywhere between end systems which
   have negotiated use of G.711.

That's definitely wrong.  Use of G.711.0 when only G.711 has been negotiated
will fail to interoperate correctly.

The passage you quote is in Section 3, which is "General Information and Use
of ITU-T G.711.0 Codec", and is: 1) prior to ANY discussion of the use of
G.711.0 in RTP (or even packet networks), and 2) prior to any discussion of
media negotiation when using RTP (e.g., SDP). Thus the context for this
sentence is at the codec bit stream (or packet payload) level of the ITU-T
codec. It stands on its own and is definitely correct.

If this draft were an ITU-T document or otherwise clearly identified as an
IETF Annex to G.711.0, I could agree with that rationale.  However ...

... this is an IETF document, therefore that statement applies to use of SDP
for negotiation as discussed elsewhere in the draft; as applied to SDP, that
statement is wrong and will result in interoperability problems.

There is also the broader concern of what sort of negotiation is required,
which is also unclear in the current draft text.

--- Minor Issues ---

-- [E] PT values 0 and 8
 
   The only significant difference is that the
   payload type (PT) RTP header field will have a value corresponding to
   the dynamic payload type assigned to the flow.  This is in contrast
   to most current uses of G.711 which typically use the static payload
   assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though
   the negotiation and use of dynamic payload types is allowed for
   G.711.

I would change "will have" to "MUST have" and add the following sentence:

   The existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0
   content.

I suspect that this is obvious to the authors, but it'll help a reader who's
not familiar with the importance of the difference between G.711 and G.711.0.

Proposed Action: Happy to fix both (for the reasons given). However, please
read my reply to [F] below, I believe the rules actually allow PT = [0|8] in a
specific corner case (result is: MUST NOT->SHOULD NOT in your suggestion).

I understand the concern, but I would suggest an alternate text structure for
clarity:

        The existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0
      content except when ...

followed by an explanation of the corner case.  A general "SHOULD NOT" invites
use of those values in situations outside that corner case.

-- [F] RTP profiles

      PT - The assignment of an RTP payload type for the format defined
      in this memo is outside the scope of this document.  The RTP
      profiles in use currently mandate binding the payload type
      dynamically for this payload format.

Good start, but not sufficient - cite the "RTP profiles currently in use" and
I would expect those citations to be normative references.

Would that be just RFC 3551 and RFC 4585 (both are already normative
references), or are there more RTP profiles?

It looks like entirely too much may have been read into this concern.
Rephrasing in the hope of resolving this one quickly:

When I read " The RTP profiles in use currently mandate ...", I want to know
which RTP profiles those are.  I think that's a reasonable request - please
cite the RTP profiles that are the foundation for that statement or delete the
sentence.

-- [G] Framing errors

The proposed action looks reasonable; I'll review the text when it appears.

--- Nits ---

All of the proposed actions on the nits are fine with me.

Thanks,
--David

-----Original Message-----
From: Michael Ramalho (mramalho) [mailto:mramalho(_at_)cisco(_dot_)com]
Sent: Wednesday, October 29, 2014 11:36 AM
To: Black, David; Paul E. Jones (paulej(_at_)packetizer(_dot_)com);
harada(_dot_)noboru(_at_)lab(_dot_)ntt(_dot_)co(_dot_)jp; 
muthu(_dot_)arul(_at_)gmail(_dot_)com; lei(_dot_)miao(_at_)huawei(_dot_)com;
General Area Review Team (gen-art(_at_)ietf(_dot_)org); 
ops-dir(_at_)ietf(_dot_)org
Cc: ietf(_at_)ietf(_dot_)org; payload(_at_)ietf(_dot_)org
Subject: RE: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

David,

The authors of the G.711.0 RTP Payload Draft thank you for the comments below.
It is clear from the caliber of your comments that you spent a lot of time on
this.

G.711.0, being a variable-length, stateless and lossless compression for G.711
(a sample-oriented encoding), causes a lot of confusion for those who
occasionally think of it as "a codec" instead of the lossless compression
mechanism it is.

Thus, this was a hard payload format to write due to some of the preconceived
notions of what G.711.0 is, and an even harder one for someone to review (as it
is not the sample-based or fixed-length frame-based encoding that the authors of
RFC 3550/3551 assumed/envisioned).

So, I really do thank you for the effort here, David. You must have drawn the
short straw.

My responses to your comments/questions are made in-line below (my comments
begin with "\begin {Reply to [issue]}" and my proposed fixes within these are
highlighted with ">>").

Regards,

Michael A. Ramalho, Ph.D.

-----Original Message-----
From: Black, David [mailto:david(_dot_)black(_at_)emc(_dot_)com]
Sent: Wednesday, October 22, 2014 11:44 AM
To: Michael Ramalho (mramalho); Paul E. Jones 
(paulej(_at_)packetizer(_dot_)com);
harada(_dot_)noboru(_at_)lab(_dot_)ntt(_dot_)co(_dot_)jp; 
muthu(_dot_)arul(_at_)gmail(_dot_)com; lei(_dot_)miao(_at_)huawei(_dot_)com;
General Area Review Team (gen-art(_at_)ietf(_dot_)org); 
ops-dir(_at_)ietf(_dot_)org
Cc: ietf(_at_)ietf(_dot_)org; payload(_at_)ietf(_dot_)org; Black, David
Subject: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

This is a combined Gen-ART and OPS-DIR review.  Boilerplate for both follows
...

I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART,
please see the FAQ at:

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you may
receive.

I have reviewed this document as part of the Operational directorate's ongoing
effort to review all IETF documents being processed by the IESG.  These
comments were written primarily for the benefit of the operational area
directors.
Document editors and WG chairs should treat these comments just like any other
last call comments.

Document: draft-ietf-payload-g7110-03
Reviewer: David Black
Review Date: October 22, 2014
IETF LC End Date: October 27, 2014
IESG Telechat date: October 30, 2014

Summary: This draft is on the right track, but has open issues
              described in the review.

Process note: This is the second draft that I've reviewed recently that has
been scheduled for an IESG telechat almost immediately following the end of
IETF Last Call.  The resulting overlap of IETF LC with IESG Evaluation can
result in significant last-minute changes to the draft when issues are
discovered during IETF LC.

This draft describes an RTP payload format for carrying G.711.0 compressed
G.711 voice.  The details of G.711.0 compression are left to the ITU-T G.711.0
spec (which is fine), and this draft focuses on how to carry the compressed
results in RTP and conversion to/from uncompressed G.711 voice at the
communication endpoints.
I found a few major issues and a couple of minor ones, although a couple of
the major issues depend on a meta-issue: the intended relationship of this
draft to the ITU-T G.711.0 spec.

In general, I expect IETF RFCs to be stand-alone documents that make sense on
their own, although one may need to read related documents to completely
understand what's going on.  For this draft, I would expect the actual
compression/decompression algorithms to be left to the ITU-T spec, and this
draft to stand on its own in explaining how to deploy G.711.0
compression/decompression with RTP.  If that expectation is incorrect, and
this draft is effectively an RTP Annex to G.711.0 that must be read in concert
with G.711.0, then the first two major issues below are not problems as they
should be obvious in the G.711.0 spec, although the fact that this draft is
effectively an Annex to G.711.0 should be stated.  Otherwise, those two major
issues need attention.

-- Major Issues (4):

[A] Section 4.2.3 specifies a detailed decoding algorithm covering how G.711.0
decompression interacts with received RTP G.711.0 payloads.
A corresponding encoding algorithm specification is needed on the sending side
for G.711.0 compression interaction with RTP sending.
The algorithm will have some decision points in it that cannot be fully
specified, e.g., time coverage of the generated G.711.0 frames.

\begin {Reply to [A]}

I believe you are correct. As with everything associated with G.711.0, a
longer answer is required.

At the sender end, the G.711.0 encoder itself has decided exactly how it
desires to send compressed G.711.0. As an example outlined earlier in Section
3.3.1 (Multiple G.711.0 Output Frame per RTP Payload Considerations), a given
G.711.0 encoder could choose to encode 20 ms of input G.711 symbols as: 1) a
single 20 ms G.711.0 frame, or 2) two 10 ms G.711.0 frames, or 3) any
combination of 5 ms or 10 ms G.711.0 frames. The decision criteria are NOT
SPECIFIED in the ITU-T G.711.0 standard; a G.711.0 encoder could choose based
on: 1) which encoding resulted in fewer bits, 2) simple operation
such as always using 20 ms G.711.0 frames, or 3) any other criteria of its
choosing. Thus the encoding process is NOT DETERMINISTIC in how many G.711.0
frames could represent a given ptime of G.711 symbols.

[Aside: Using a 20 ms ptime example, there could be 1, 2, 3 or 4 G.711.0
frames in an RTP payload, in any one of six combinations
([20ms], [10ms:10ms], [10ms:5ms:5ms], [5ms:10ms:5ms],
[5ms:5ms:10ms], [5ms:5ms:5ms:5ms]).]
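
The count of six in the aside can be checked with a short enumeration. This is only an illustrative sketch: it assumes the 8 kHz G.711 sampling rate, so that the frame lengths of 40, 80, 160, 240 and 320 symbols correspond to 5, 10, 20, 30 and 40 ms, and the function name frame_sequences is made up for this example.

```python
from typing import Iterator, Tuple

# Frame durations in ms for 40, 80, 160, 240 and 320 G.711 symbols,
# assuming the 8 kHz G.711 sampling rate.
FRAME_MS = (5, 10, 20, 30, 40)


def frame_sequences(ptime_ms: int,
                    prefix: Tuple[int, ...] = ()) -> Iterator[Tuple[int, ...]]:
    """Yield every ordered sequence of frame durations summing to ptime_ms."""
    if ptime_ms == 0:
        yield prefix
        return
    for size in FRAME_MS:
        if size <= ptime_ms:
            yield from frame_sequences(ptime_ms - size, prefix + (size,))


if __name__ == "__main__":
    # Print each possible payload layout for a 20 ms ptime.
    for seq in frame_sequences(20):
        print(":".join(f"{ms}ms" for ms in seq))
```

Which of these layouts an encoder actually picks is, as noted above, not specified by the G.711.0 standard.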

Thus, it is important to note that the >>G.711.0 STANDARD<< only specifies the
encoding of an individual input G.711 frame (which can only have lengths of
40, 80, 160, 240 or 320 G.711 symbols) to a valid G.711.0 frame.

The authors of this draft assumed that the G.711.0 compressor/encoder provider
has already made the encoding decision on the number of G.711.0 frames
INDEPENDENT of the decompressor/decoder and OUTSIDE any sender-side RTP
payload processing. That is, the G.711.0 encoder simply passes the result it
produced (any of the combinations above) to the G.711.0 RTP
layer at the sender to be incorporated into the G.711.0 payload. The RTP layer
could then choose to add padding octets (0x00) to form the final G.711.0
payload.

From that perspective, the co-authors of the draft believed what was important
for the draft was "what could be on-the-wire". However, since the ITU-T
G.711.0 standard only specifies the individual G.711 frame to G.711.0 mapping,
there is a benefit in explicitly calling out the possible "payload encoding
process" in this section (4) as well.

Proposed Action: If my co-authors agree, I could write a very small section
titled "G.711.0 RTP Payload Encoding Process" (inserted in-between the present
4.2.2 and 4.2.3). This paragraph-long section will reverse reference Section
3.3.1 and remind the implementer that they can - at their option - choose to
use any of the allowable encoding possibilities described in it. I think David
is correct, we assumed that some entity PURPOSELY NOT defined by the G.711.0
standard (the provider of the "G.711.0 compressor/encoder") already made those
decisions and that explicit definition of that decision is not specified
anywhere in any SDO document (so why not here?). Indeed, any "standard G.711.0
encoder" offered by a vendor would likely have that functionality within it
(so a RTP implementer wouldn't need to know it either). I could also remind
the reader that one could use a single G.711.0 frame per ptime (if a G.711.0
frame supported that ptime) for the least complicated encoding case. Would
that work David? Would that work co-authors?

\end {Reply to [A]}

[B] The G.711.0 frame format is not specified here, making it very difficult
to figure out what's going on when G.711.0 frames are concatenated.  A
specific example is that the concept of a "prefix code" that occurs at the
start of a G.711.0 frame is far too important to be hidden in step H5 of the
decoding algorithm in Section 4.2.3.

\begin {Reply to [B]}

We welcome comments on how to improve this section, as it is complicated. We
did attempt to describe only what is necessary for understanding.

At the beginning of Section 4.2.3 we IMMEDIATELY reference the ITU-T G.711.0
document - as it is that document that describes how to "decode a G.711.0 bit-
stream". We really want the reader needing to know the details to go there
first. Indeed, the entire G.711.0 payload could be provided to the G.711.0 bit
stream decoder in the ITU-T G.711.0 reference code and obtain all the
uncompressed G.711 samples in the RTP payload and be finished without knowing
anything in this section.

The bit-stream decoder in the ITU-T reference code was defined to parse the
individual compressed G.711.0 frames. However the G.711.0 >>STANDARD ITSELF<<
defines only the mapping between the 40, 80, 160, 240 or 320 G.711 symbols
presented to it and the G.711.0 frame produced from those 40, 80, 160, 240 or
320 samples (i.e., only Section 3.3).

In other words, someone designing a G.711.0 encoder could choose how to
partition the uncompressed G.711 symbols into groups of 40, 80, 160, 240 or
320 samples and then individually encode them into individual G.711.0 frames
as per my reply to [A].

Any arbitrary value corresponding to a valid "G.711.0 prefix code" is NOT
unique (or otherwise special) in that it can appear anywhere within a
G.711.0 frame; however a given value for a prefix code DOES have a unique
meaning >>TO THE G.711.0 DECODER<< (not the RTP machinery) when it is present
at the beginning of a G.711.0 frame.

The mention of the prefix code (with immediate reference back to  the ITU-T
specification I might add) was simply side information conveyed to the reader
for purposes of understanding. The G.711.0 decoder actually "reads it" and
then uses it to know how many source G.711 samples to produce (in this case exactly M
G.711 samples). The only thing the G.711.0 RTP implementer needs to know is
that the G.711 sample buffer returned by the G.711.0 decoder will contain
exactly M samples of G.711.

To be precise, the ITU-T specified G.711.0 decoder returns not only the
samples themselves, but also the number of samples, M, upon its exit (we were not
100% clear on this - fix proposed below). The value of M is important to the
RTP decoding process; the value, structure or meaning of "prefix code" isn't.
The only exception is that 0x00 has a special meaning when it appears where a
prefix code might otherwise be expected.

To accommodate padding, 0x00 may be placed anywhere between the encoded
G.711.0 frames (we only recommend that any desired padding be placed at the
end of the RTP payload). But to convey this "0x00" for padding, we needed to
describe that 0x00 could not be a valid prefix code. If it were not for the
desire for padding, we would not have even mentioned that a "prefix code"
existed in a G.711.0 frame.

In the text we mention that a "0x00" where a prefix code is expected in a
G.711.0 bit stream is "silently ignored" by a G.711.0 frame decoder.

The mention of the prefix code was only for general information of what the
G.711.0 decoder actually does (generally how it decodes the frame and that
"0x00" isn't a valid prefix code) and what is expected by the RTP machinery
when the G.711.0 decoder is finished decoding (the value of M and the M
individual G.711 symbols).
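
The division of labor described above (the RTP layer skips 0x00 padding where a prefix code is expected, and the G.711.0 decoder returns the M symbols) might be sketched as follows. This is not the ITU-T reference code: the FrameDecoder interface, the assumption that the decoder also reports how many payload octets it consumed, and the toy decoder used to exercise the loop are all hypothetical stand-ins for illustration.

```python
from typing import Callable, Tuple

# Hypothetical decoder interface (an assumption, not the ITU-T API):
# given the payload and the offset of a frame, return
# (decoded G.711 symbols, number of payload octets consumed).
FrameDecoder = Callable[[bytes, int], Tuple[bytes, int]]


def decode_payload(payload: bytes, decode_frame: FrameDecoder) -> bytes:
    """Concatenate the G.711 symbols of every frame found in the payload."""
    symbols = bytearray()
    offset = 0
    while offset < len(payload):
        if payload[offset] == 0x00:
            # 0x00 where a prefix code is expected is padding: ignore it.
            offset += 1
            continue
        frame_symbols, consumed = decode_frame(payload, offset)
        if consumed <= 0 or offset + consumed > len(payload):
            raise ValueError("decoder reported an impossible frame length")
        symbols += frame_symbols  # M == len(frame_symbols)
        offset += consumed
    return bytes(symbols)


def toy_decoder(payload: bytes, offset: int) -> Tuple[bytes, int]:
    """Stand-in only, NOT G.711.0: first octet n means n raw symbols follow."""
    n = payload[offset]
    return payload[offset + 1:offset + 1 + n], 1 + n
```

With the toy decoder, decode_payload(bytes([2, 0xAA, 0xBB, 0, 1, 0x11, 0]), toy_decoder) skips the two padding octets and returns the three symbols.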

Summary: The interested reader desiring knowledge of how to decode a  G.711.0
bit stream should really read the ITU-T document first; that is why we put the
reference to the "ITU-T G.711.0 Reference code" as the FIRST sentence in
Section 4.2.3. They don't need to know what a "prefix code" is other than it
is used by the G.711.0 decoder to know how many samples (M) it will produce
and that the value of M will be returned by the G.711.0 decoder.

Proposed Action: I would suggest the following change in H5 to make this
clearer:
From: The G.711.0 decoder will produce exactly M G.711 source symbols.
To: Then the ITU-T specified G.711.0 decoder will produce exactly M G.711
source symbols and return both the symbols (in a buffer up to 321 octets in
length if the in-place ITU-T reference code is used) and the value of M upon
exit.

That information - the samples and the value of M - is the only thing the
reader needs to know.

Does that work for you, David?

\end {Reply to [B]}

[C] The discussion of use of the SDP ptime parameter is spread out and
imprecise (is SDP REQUIRED?, when is ptime REQUIRED, RECOMMENDED, or
recommended? - it's not obvious).

A specific example is that this sentence in Section 4.2.4 is an invitation to
interoperability problems ("could infer" - how is that done and where do the
inputs to that inference come from?):

   Similarly, if the number of
   channels was not known, but the payload "ptime" was known, one could
   infer (knowing the sampling rate) how many G.711 symbols each channel
   contained; then with this knowledge determine how many channels of
   data were contained in the payload.

I would suggest that a subsection be added, possibly at the end of Section 3,
to gather/summarize all of the relevant ptime discussion in one place.  I
suspect that the contents of this draft are mostly correct wrt ptime, but it's
hard to figure out what's going on from the current spread-out text.  It looks
like "ptime" could provide a cross-check on correctness of G.711.0 decoding -
see minor issue [G] below.

This major issue [C] is independent of the relationship between this draft and
the G.711.0 spec.

\begin {Reply to [C]}

We underspecified the use of SDP  on purpose, but I also agree that some text
on why we wish to leave it underspecified could be useful. In Section 5 we
simply say "parameters that may be used to configure [G.711.0 RTP
transmission]". Perhaps the MAY should be capitalized? Or more text?

As you know and appreciate, one could put an arbitrary number of G.711.0
frames in a G.711.0 RTP payload and the decoder really won't know how many
G.711 samples were compressed in that payload until it decodes the entire
payload.

Point A: For systems that use SDP and have specified a ptime (IANA
registration for ptime is as an OPTIONAL parameter per WG agreement), a check
can be performed to see if the required number of G.711 samples is present.

Point B: For systems that use SDP and have not specified ptime - the payload
can still be decoded. In this case there is no a priori expectation on the
number of G.711 symbols contained within the G.711.0 RTP payload and thus no
check is possible.

Point C: For systems that use SDP we RECOMMEND that ptime SHOULD be used (see
IANA registration text). The reason is that such a check can be made!

All three points (A, B & C) have been agreed to during previous
meetings/discussions.
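
Points A and B amount to a simple receiver-side check, sketched below. The function names are hypothetical, and the sketch assumes the 8 kHz G.711 sampling rate and that the decoded symbol count is the total across all channels.

```python
from typing import Optional

G711_SAMPLE_RATE_HZ = 8000  # G.711 sampling rate (assumed throughout)


def expected_symbols(ptime_ms: int, channels: int = 1) -> int:
    """Number of G.711 symbols a payload covering ptime_ms should decode to."""
    return ptime_ms * G711_SAMPLE_RATE_HZ // 1000 * channels


def payload_matches_ptime(decoded_symbols: int,
                          ptime_ms: Optional[int],
                          channels: int = 1) -> bool:
    """Point A: compare against ptime when signaled. Point B: no check."""
    if ptime_ms is None:
        # No ptime was signaled, so there is no a priori expectation.
        return True
    return decoded_symbols == expected_symbols(ptime_ms, channels)
```

For example, with ptime=20 and one channel the receiver expects 160 symbols; what it does on a mismatch is left to the caller in this sketch.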

However, some USERS of the G.711.0 payload format may wish to use the RTP
format itself but NOT use SDP! A good example is an "in-the-middle" compression
of a G.711 flow (into a G.711.0 flow) and a corresponding decompression of the
G.711.0 flow back into a G.711 flow. This is possible in many network
arrangements (e.g., enterprise to enterprise) where the compression and
decompression endpoints know the PT corresponding to G.711.0 use within their
administrative domain.

[Aside: At one time this RTP Payload format had both the payload definition
(this draft) and G.711.0-specific use cases within it. Previous WG discussion
supported the splitting out of the use-cases into a separate draft (a "G.711.0
use case" draft). I have such an expired draft, but we agreed to defer work on
it until after the RTP payload format was complete. Thus some elements of uses
outside of G.711.0 running in the endpoints would be described in the other
use-case draft.]

The SDP discussion is a little wordy, but this is a result of G.711.0 not
being a codec, but rather a variable length, frame-based lossless
compression/decompression. That is G.711.0 is NOT a (sample-based or frame-
based) codec in the usual sense that RFC 3550/3551 anticipated, but does
require some "G.711 specific" information to be passed to it (e.g., complaw).

For the passage you quoted above, the FOLLOWING TWO SENTENCES in the draft
provide a forward reference in the document to when the "channels" and "ptime"
parameters are needed and referenced (Section 5.1); because we have had no
need prior to that point in the draft to discuss use of ANY particular session
negotiation protocol.

SDP is a dominant IETF protocol for media negotiation; but even RFC 3551
mentions H.245 and the fact that other mapping methods are possible (including
"no negotiation" methods). Indeed, the "in-the-middle" use case described in
this email (and at earlier IETF Payload meetings) may or may not have any a
priori negotiation of PT at all within an administrative domain (e.g., the
G.711.0 PT may be a network configured parameter specific to a company
network).

Proposed Action: The discussion of ptime (and the channels parameter) in
this section is primarily for the purpose of a check. If it is any comfort,
that paragraph has had lots of input to it previously (so you responded to a
complicated issue). And since we have no need to describe "ptime issues" or
session negotiation issues prior to this point (Section 4.2.4) in the document
AND ptime isn't a required negotiation parameter AND we put a forward
reference to Section 5.1 for  "ptime" when SDP is used, I hesitate to mention
such an optional parameter here in Section 3.
Proposed action: No Change (the forward references are enough).

\end {Reply to [C]}

[D] Backwards compatibility.

The problem here is that it's not clear that negotiation (e.g., via SDP) is
required.  This sentence in Section 3.1 is a particular problem:

   G.711.0, being both lossless and stateless, may also be employed as a
   lossless compression mechanism anywhere between end systems which
   have negotiated use of G.711.

That's definitely wrong.  Use of G.711.0 when only G.711 has been negotiated
will fail to interoperate correctly.

A subsection of section 3 on negotiation and SDP usage would help here.

This major issue [D] is independent of the relationship between this draft and
the G.711.0 spec.

\begin {Reply to [D]}

The passage you quote is in Section 3, which is "General Information and Use
of ITU-T G.711.0 Codec", and is: 1) prior to ANY discussion of the use of
G.711.0 in RTP (or even packet networks), and 2) prior to any discussion of
media negotiation when using RTP (e.g., SDP). Thus the context for this
sentence is at the codec bit stream (or packet payload) level of the ITU-T
codec. It stands on its own and is definitely correct.

When the compression of a G.711 payload to a G.711.0 payload occurs somewhere
on the end-to-end path and the corresponding decompression from a G.711.0
payload to a G.711 payload occurs prior to the receiving endpoint, the
receiving endpoint doesn't know that (lossless) compression occurred on the
PAYLOAD (the context in this section). As mentioned previously, this is
possible in many arrangements (in RTP) where the compression and decompression
endpoints know the PT corresponding to G.711.0 use within their administrative
domain (a reserved or not-used-in-their-domain PT) and desire to do this.

That is the beauty of lossless compression - the receiving endpoint doesn't
know (or need to know) that payload compression occurred. To imply otherwise
is to dismiss lossless compression mechanisms (e.g., CRTP, ECRTP, ROHC) that
losslessly compress and decompress arbitrary parts of packets (in the case of
CRTP/ECRTP/ROHC, the headers) in between the endpoints without the endpoints'
explicit knowledge of the compression.

Please note that this property isn't possible with lossy *CODECS*, as the
transcode will typically introduce some distortion which would be unknown to
the receiving endpoint but nevertheless present. This is one of the many
subtleties that people reading about G.711.0 have when considering it as if it
were a (lossy) codec - they ASSUME that G.711.0 is a TRANSCODE and not the
lossless, STATELESS compression of the MEDIA PAYLOAD that it is.

Again, we had working group agreement (I think in Quebec) that a use-case
document could follow this G.711.0 RTP payload format document to describe how
to do the mapping in RTP for these "compression-in-the-middle" cases. High
level summary is that you copy the G.711 RTP header verbatim into the G.711.0
RTP header except for the PT. I have a draft on the use case document which I
let expire until this RTP payload definition is finished.
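
The header mapping summarized above (copy everything except the PT) can be illustrated with a few lines operating on the fixed 12-octet RTP header of RFC 3550, where the second octet carries the marker bit and the 7-bit payload type. The function name and the PT value 96 in the usage note are hypothetical; how the two compression points learn the G.711.0 PT is outside this sketch.

```python
RTP_FIXED_HEADER_LEN = 12  # fixed RTP header length in octets (RFC 3550)


def rewrite_payload_type(rtp_header: bytes, new_pt: int) -> bytes:
    """Return a copy of the RTP header with only the 7-bit PT field changed."""
    if len(rtp_header) < RTP_FIXED_HEADER_LEN:
        raise ValueError("RTP header shorter than 12 octets")
    if not 0 <= new_pt <= 127:
        raise ValueError("payload type must fit in 7 bits")
    out = bytearray(rtp_header)
    # Octet 1: marker bit in the top bit, payload type in the low 7 bits.
    out[1] = (out[1] & 0x80) | new_pt
    return bytes(out)
```

For instance, rewrite_payload_type(header, 96) on a PCMU (PT = 0) header leaves the version, marker bit, sequence number, timestamp and SSRC untouched and changes only the PT.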

Proposed Action: No Change. We have more than enough words in the document
to describe all the attributes of G.711.0 (Section 3.2) in this section of the
document that discusses properties of the >>ITU-T specification<<.

\end {Reply to [D]}

-- Minor issues (3):

[E] Section 4.1:

   The only significant difference is that the
   payload type (PT) RTP header field will have a value corresponding to
   the dynamic payload type assigned to the flow.  This is in contrast
   to most current uses of G.711 which typically use the static payload
   assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though
   the negotiation and use of dynamic payload types is allowed for
   G.711.

I would change "will have" to "MUST have" and add the following sentence:

   The existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0
   content.

I suspect that this is obvious to the authors, but it'll help a reader who's
not familiar with the importance of the difference between G.711 and G.711.0.

\begin {Reply to [E]}

Proposed Action: Happy to fix both (for the reasons given). However, please
read my reply to [F] below, I believe the rules actually allow PT = [0|8] in a
specific corner case (result is: MUST NOT->SHOULD NOT in your suggestion).

\end {Reply to [E]}

[F] Section 4.1:

      PT - The assignment of an RTP payload type for the format defined
      in this memo is outside the scope of this document.  The RTP
      profiles in use currently mandate binding the payload type
      dynamically for this payload format.

Good start, but not sufficient - cite the "RTP profiles currently in use" and
I would expect those citations to be normative references.

Would that be just RFC 3551 and RFC 4585 (both are already normative
references), or are there more RTP profiles?

\begin {Reply to [F]}

I think that wording was suggested somewhere along the way, but I can't
remember who provided it. It is boilerplate on many RTP payload formats, but
others (such as recent RFC 7310) are as simple as " PT - A dynamic payload
type; MUST be used" (which appears to be incorrect use of the semicolon, but I
digress). In any event, major edits of the first paragraph of 4.1 were made to
include the possibility of G.711 not having PT = 0 or PT = 8 for exceptional
cases (so not even static payload types can be automatically assumed).

According to IANA
(http://www.iana.org/assignments/rtp-parameters/rtp-parameters.xhtml#rtp-parameters-2)
and RFC 3551, the FINAL set of static
payload assignments is contained in Tables 4 and 5 of RFC 3551.

And, according to RFC 3551, the PT assignment (for a new codec not having a
static type) SHOULD first attempt to use a dynamic PT - but there are
exceptions cited (e.g., dynamic PT exhaustion). Even codecs that have a static
PT assigned MAY negotiate a different PT (e.g., a dynamic PT). And new codecs
(after exhaustion of dynamic and other types) MAY actually use a static PT not
presently in use (at least I recall someone stated so in a meeting).  So it
appears there are a lot of exception cases that preclude knowing (with 100%
certainty) any particular PT mapping.

And, according to RFC 3551, dynamic payload types SHOULD NOT be used without a
well-defined mechanism to indicate the mapping - SDP or ITU-T H.323/H.245
negotiation or other pre-arrangement are cited (e.g., PT defined within a
certain scope or administrative domain) - and a well-defined RTP payload
format (this draft).

Thus, not much can be said about the assignment other than what was stated. I
could put (yet another) RFC 3551 reference in this paragraph but it would
provide no more guidance than already provided a few paragraphs earlier (which
references RFC 3551). At a minimum I think I should say that PT of 0 and 8
SHOULD NOT be used for G.711.0.

Re: "PTs currently in use". It is hard to differentiate the profiles
"currently IANA registered" and those "currently in use". That is, what is the
definition of "currently in use" when you don't have insight into the
registered-but-not-in-use profiles (e.g., historic codecs)?

Proposed Action: I think we should both defer to the Payload WG chairs on
this - as they can be expected to know all the exceptions AND the present
state of verbiage that goes on the "PT -" line of an IANA media registration
coming from the Payload WG. Ali and Roni: Please suggest alternate text if you
desire, I will accommodate; otherwise I will leave it as is.

\end {Reply to [F]}

[G] Framing errors

Section 4 generally assumes that the G.711.0 decoder gets handed frames
generated by the G.711.0 encoder and can't get disaligned.  I'm not convinced
that this "just works" based on the text in the draft - major issue [B] is a
significant reason why, and explaining that should help.

Some discussion should be added on why the G.711.0 decoder can't get
disaligned wrt frame boundaries, or on what the G.711.0 decoder will do when
it discovers that it wasn't handed a complete G.711.0 frame.  For example,
this error case and how to deal with it are not covered by the algorithm in
Section 4.2.3.

\begin {Reply to [G]}

The actual buffer handling to/from G.711.0 encoding/decoding logic is pretty
straightforward, so I really doubt that an encoder that has been exercised
sufficiently would pass the G.711.0 frame(s) to the RTP payload incorrectly,
or the converse.

However, you are correct in that we should always specify what happens when
things don't work as expected. Thanks for the catch.

Consistent with an "error condition catch" Richard Barnes made in 4.2.4 - we
do have some information for when an encoder and/or decoder error resulted in
an unexpected number of G.711 decoded symbols.

Assuming ptime was signaled, we expect the number of G.711 decoded symbols to
equal what we expect from the ptime value at the receiver/decoder. If it
doesn't then "we SHOULD discard the packet".

[Aside: We discussed the SHOULD vs MUST on the decoder; the SHOULD won. This
is because a given system design might temporarily send a packet inconsistent
with the ptime previously signaled but which is structurally correct (has the
correct decoded G.711). Such a system might not desire to discard such a
packet (as it might appear otherwise correct in the number of samples
decoded). However, lacking such a design the usual operational choice is to
discard the packet. Thus a SHOULD.]
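
As a rough illustration of the decoder-side check described above, here is a
minimal sketch. It assumes the default 8000 samples per second, one octet per
decoded G.711 symbol, and ptime expressed in milliseconds; the function names
are hypothetical and are not taken from the draft.

```python
G711_SAMPLE_RATE_HZ = 8000  # assumed default G.711 sampling rate


def expected_symbols(ptime_ms: int, channels: int = 1) -> int:
    """Number of G.711 symbols implied by the signaled ptime."""
    return ptime_ms * G711_SAMPLE_RATE_HZ // 1000 * channels


def should_discard(decoded: bytes, ptime_ms: int, channels: int = 1) -> bool:
    """True when the decoded symbol count disagrees with the signaled ptime.

    This mirrors the SHOULD: a system that deliberately sends packets
    inconsistent with the signaled ptime may choose to ignore the result.
    """
    return len(decoded) != expected_symbols(ptime_ms, channels)
```

For example, with ptime=20 the receiver would expect 160 decoded symbols per
channel, and a packet that decodes to 159 symbols would be flagged.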

For the encoder, the length of the G.711.0 RTP payload - excluding padding -
should never be greater than the number of input G.711 symbols plus the number
of G.711.0 frames (as a given G.711.0 frame can be no greater than one octet
more than the number of source symbols). If the number of frames is known to
the RTP layer (it may not be) and this constraint is not met, the source
packet MUST be discarded.

[Aside: We did NOT discuss the SHOULD vs MAY on the encoder. In my opinion,
the MUST is more appropriate - as if the constraint is violated, you KNOW
something is wrong.]
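
The encoder-side bound above can be sketched in the same spirit. This assumes
the RTP layer knows the number of G.711.0 frames in the payload (it may not),
and that the payload length is measured excluding padding; the function name
is hypothetical.

```python
def encoder_payload_is_plausible(payload_len: int,
                                 num_input_symbols: int,
                                 num_frames: int) -> bool:
    """Check the bound: each G.711.0 frame is at most one octet longer
    than the G.711 symbols it encodes, so the whole payload can be at
    most num_input_symbols + num_frames octets long."""
    return payload_len <= num_input_symbols + num_frames
```

For a single frame carrying 160 input symbols, a 161-octet payload passes the
check and a 162-octet payload fails it, in which case the source packet would
be discarded.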

Proposed Action: Add two sentences similar to the above to the end of
Section 4.2.2+ (proposed earlier new section on Encoding Process) and Section
4.2.3 (Decoding Process).

\end {Reply to [G]}

-- Nits/editorial comments:

Section 3.2:

   A6  Bounded expansion: Since attribute A2 above requires G.711.0 to
         be lossless for any payload, by definition there exists at
         least one potential G.711 payload which must be
         "uncompressible".

The "by definition" statement assumes that every possible bit string is a
valid G.711 input.  If that is correct, it should be explicitly stated.

\begin{nit}

Yes, because Attribute A2, referenced within the quoted sentence, says as much.

Every value of a G.711 symbol (2^8) corresponds to a discrete value. There is
no restriction assumed on a sample-to-sample (octet to next octet) basis in
the G.711 encoding (no "illegal transitions"). Lastly, some "DS0 channels"
assume that all the bits can be used for arbitrary digital data (so-called
ISDN 64kbps B-channel). Thus it is widely known that if something is random
and can take ANY value of ANY possible concatenation of octets, there is no
redundancy to be exploited in the concatenation for the purposes of
deterministic compression for all possible inputs - there must exist at least
one payload that is not compressible.
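
The counting argument behind this can be checked numerically. The sketch below
is only an illustration, with hypothetical function names: it compares the
number of distinct n-octet inputs with the number of distinct octet strings
that are strictly shorter than n octets.

```python
def count_inputs(n_octets: int) -> int:
    """Distinct G.711 payloads of exactly n octets (every value legal)."""
    return 256 ** n_octets


def count_shorter_outputs(n_octets: int) -> int:
    """Distinct octet strings shorter than n octets (lengths 0..n-1)."""
    return sum(256 ** k for k in range(n_octets))
```

Because count_shorter_outputs(n) is always smaller than count_inputs(n), a
lossless (one-to-one) encoder cannot map every n-octet input to a shorter
output, so at least one input cannot be compressed.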

This is an assertion from the G.711.0 ITU-T document that anyone who cares to
verify can go to the ITU-T, look up G.711 and instantly know that all the
values are "assigned" and there are no illegal transitions specified; thus
there is no redundancy to be exploited. I hesitate to insult my readers by
giving them any more detail than Attribute A2 says.

Proposed Change: None needed. However if you really feel strongly on this, I
could agree to something like the following ... or anything of your choosing
that reads better and is accurate. Let me know what you want.

   A6  Bounded expansion: Since attribute A2 above requires G.711.0 to
         be lossless for any payload (which could consist of any
         concatenation of octets, each octet spanning the entire space
         of 2^8 values), by definition there exists at least one
         potential G.711 payload which must be "uncompressible".

\end {nit}

   A8  Low Complexity: Less than 1.0 WMOPS average and low memory
         footprint (~5k octets RAM, ~5.7k octets ROM and ~3.6 basic
         operations) [ICASSP] [G.711.0].

Expand WMOPS on first use, and check for other acronyms that need to be
expanded on first use.

\begin{nit}

Note: The references define what a WMOPS is.

Recommended Action: Since this is the only use of WMOPS, I will expand it
there (Weighted Million Operations Per Second) and skip the abbreviation
entirely.

RAM and ROM are the only other acronyms not expanded. I trust that these don't
qualify as "needed", as not even the ITU-T document expands them.

Recommended Action: No change to RAM and ROM. It is a reasonable expectation
that anyone reading this document will know those two based on context.

\end {nit}

Section 3.3:

   Since the G.711.0 output frame is "self-describing", a G.711.0
   decoder (process "B") can losslessly reproduce the original G.711
   input frame with only the knowledge of which companding law was used
   (A-law or mu-law).

"companding law"?  The term "compression law" is used elsewhere in this draft,
including two paragraphs earlier in this section - I suggest using
"compression law" consistently.

\begin{nit}

Good catch.

The law that both forms of G.711 use (mu or A) is that of an input-to-output
compander (http://en.wikipedia.org/wiki/Companding ), where the output format
is discretized.
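
For readers unfamiliar with the term, here is a minimal sketch of the
continuous mu-law companding curve with mu = 255. It is an illustration only:
G.711 itself specifies a segmented, 8-bit discretized approximation of this
curve, which is not reproduced here, and the function names are hypothetical.

```python
import math

MU = 255  # mu-law parameter for the continuous curve


def mu_law_compress(x: float) -> float:
    """Map a sample in [-1, 1] through the continuous mu-law curve."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress for y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)
```

Small amplitudes are stretched and large amplitudes are squeezed on the way
in, and the expander undoes that mapping on the way out.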

I will change the one use of "compression law" to "companding law" in its
singular use in Section 3.3 (due to G.711 being a companding, sample-based
codec).

\end {nit}

Section 6:

   We note that something must be stored for any G.711.0 frames that not
   received at the receiving endpoint, no matter what the cause.

"that not" -> "that are not"

\begin{nit}
Thanks. Will do.
\end {nit}

Section 6.2:

   An entire frame of value 0++ or 0-- is expected to be
   extraordinarily rare when the frame was in fact generated by a
   natural signal (on the order of one in 2^{ptime in samples, minus
   one}), as analog inputs such as speech and music are zero-mean and
   are typically acoustically coupled to digital sampling systems.

This doesn't explain where the 2^{ptime in samples, minus one} order of
magnitude estimation came from.  What assumption(s) is(are) being made about
randomness and distribution thereof in the analog input?
It might be simpler to delete the parenthesized text.

\begin{nit}
Agreed. Consider the parenthetical deleted.
\end {nit}

Section 11: Congestion Control

This section is mis-named, as it basically (correctly) says that there is
nothing useful that can be done in G.711.0 compression to respond to
congestion.  I would retitle this to "Congestion Considerations".

\begin{nit}
I would, but the requirements for new RTP payload formats say that there MUST
be a section named "Congestion Control" in all newly approved RTP Payload
formats!

You are, of course, correct - as the text in this section basically says there
is no explicit way to regulate the bit-rate for the purposes of congestion
control.
\end {nit}

Are there opportunities to respond to congestion elsewhere, e.g.
dynamically change the sampling rate?  If so, a sentence mentioning them would
be good to add.

\begin{nit}
I know of no use of G.711 that changes the sampling frequency from the default
- although that is allowed in the SDP (as G.711 is a sample-based codec). The
8000 samples per second is hard-coded in many voice implementations.

Since the whole purpose of G.711.0 is to send G.711 losslessly with lower
bandwidth, the use of G.711.0 could be triggered by G.711 negotiated sessions
looking for a lower bandwidth solution. Although we could mention this
(obvious) fact, the guidelines for this section instruct me to discuss things
that can be done with the "codec" this payload format describes for the
purposes of congestion control. This is yet another artifact that the new RTP
guidelines did not anticipate the use of a lossless and stateless compression
technique being defined for RTP. We broke a lot of new ground here, thanks for
wading through it!

Proposed Action: None. I would not have this section in the document except
that the new rules for RTP Payload definitions mandate such a section exist.
\end {nit}

idnits 2.13.01 didn't find anything to complain about ;-).

--- Selected RFC 5706 Appendix A Q&A for OPS-Dir review ---

Most of these questions are N/A as this draft specifies a payload format for
RTP, so most of the operations and management concerns are wrt RTP and SDP.

A.1.3.  Has the migration path been discussed?

No, see major issue [D] above.

A.1.4   Have the Requirements on other protocols and functional
       components been discussed?

Only in part - major issues [C] and [D] call out shortcomings in the
discussion of SDP interactions.

A.1.8   Are there fault or threshold conditions that should be reported?

Yes, the likelihood and consequences of framing problems at the G.711.0
decoder (decoder is handed octet strings that are not G.711.0 frames generated
by the encoder) should be discussed.  Major issue [B] needs to be resolved
first, and then see minor issue [G].

A.2.  Management Considerations

I would expect that the media type registration (Section 5.1 of this draft)
results in this new G.711.0 media type being usable in any relevant management
model and/or framework that has some notion of media type.

A.3 Documentation

By itself, this compressed payload format does not look like a likely source
of significant operational impacts on the Internet.

The shepherd's writeup indicates that an implementation exists.

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david(_dot_)black(_at_)emc(_dot_)com        Mobile: +1 (978) 394-7754
----------------------------------------------------


