ietf

RE: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

2014-10-30 10:08:20
David,

The authors of the G.711.0 RTP Payload Draft thank you for the comments below. 
It is clear from the caliber of your comments that you spent a lot of time on 
this.

G.711.0 is a variable-length, stateless and lossless compression for G.711 (a 
sample-oriented encoding). This causes a lot of confusion for those who 
occasionally think of it as "a codec" instead of the lossless compression 
mechanism it is.

Thus, this was a hard payload format to write due to some of the pre-conceived 
notions of what G.711.0 is, and an even harder one for someone to review (as it 
is not the sample-based or fixed-length frame-based encoding that the authors 
of RFC 3550/3551 assumed/envisioned).

So, I really do thank you for the effort here, David. You must have drawn the 
short straw.

My responses to your comments/questions are made in-line below (my comments 
begin with "\begin {Reply to [issue]}" and my proposed fixes within these are 
highlighted with ">>").

Regards,

Michael A. Ramalho, Ph.D.

-----Original Message-----
From: Black, David [mailto:david(_dot_)black(_at_)emc(_dot_)com] 
Sent: Wednesday, October 22, 2014 11:44 AM
To: Michael Ramalho (mramalho); Paul E. Jones 
(paulej(_at_)packetizer(_dot_)com); 
harada(_dot_)noboru(_at_)lab(_dot_)ntt(_dot_)co(_dot_)jp; 
muthu(_dot_)arul(_at_)gmail(_dot_)com; lei(_dot_)miao(_at_)huawei(_dot_)com; 
General Area Review Team (gen-art(_at_)ietf(_dot_)org); 
ops-dir(_at_)ietf(_dot_)org
Cc: ietf(_at_)ietf(_dot_)org; payload(_at_)ietf(_dot_)org; Black, David
Subject: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03

This is a combined Gen-ART and OPS-DIR review.  Boilerplate for both follows ...

I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, 
please see the FAQ at:

<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Please resolve these comments along with any other Last Call comments you may 
receive.

I have reviewed this document as part of the Operational directorate's ongoing 
effort to review all IETF documents being processed by the IESG.  These 
comments were written primarily for the benefit of the operational area 
directors.
Document editors and WG chairs should treat these comments just like any other 
last call comments.

Document: draft-ietf-payload-g7110-03
Reviewer: David Black
Review Date: October 22, 2014
IETF LC End Date: October 27, 2014
IESG Telechat date: October 30, 2014

Summary: This draft is on the right track, but has open issues
                described in the review.

Process note: This is the second draft that I've reviewed recently that has 
been scheduled for an IESG telechat almost immediately following the end of 
IETF Last Call.  The resulting overlap of IETF LC with IESG Evaluation can 
result in significant last-minute changes to the draft when issues are 
discovered during IETF LC.

This draft describes an RTP payload format for carrying G.711.0 compressed 
G.711 voice.  The details of G.711.0 compression are left to the ITU-T G.711.0 
spec (which is fine), and this draft focuses on how to carry the compressed 
results in RTP and conversion to/from uncompressed G.711 voice at the 
communication endpoints.
I found a few major issues and a couple of minor ones, although a couple of the 
major issues depend on a meta-issue: the intended relationship of this draft 
to the ITU-T G.711.0 spec.

In general, I expect IETF RFCs to be stand-alone documents that make sense on 
their own, although one may need to read related documents to completely 
understand what's going on.  For this draft, I would expect the actual 
compression/decompression algorithms to be left to the ITU-T spec, and this 
draft to stand on its own in explaining how to deploy G.711.0 
compression/decompression with RTP.  If that expectation is incorrect, and this 
draft is effectively an RTP Annex to G.711.0 that must be read in concert with 
G.711.0, then the first two major issues below are not problems as they should 
be obvious in the G.711.0 spec, although the fact that this draft is 
effectively an Annex to G.711.0 should be stated.  Otherwise, those two major 
issues need attention.

-- Major Issues (4):

[A] Section 4.2.3 specifies a detailed decoding algorithm covering how G.711.0 
decompression interacts with received RTP G.711.0 payloads.
A corresponding encoding algorithm specification is needed on the sending side 
for G.711.0 compression interaction with RTP sending.
The algorithm will have some decision points in it that cannot be fully 
specified, e.g., time coverage of the generated G.711.0 frames.

\begin {Reply to [A]}

I believe you are correct. As with everything associated with G.711.0 , a 
longer answer is required.

At the sender end, the G.711.0 encoder itself has decided exactly how it 
desires to send compressed G.711.0. As an example outlined earlier in Section 
3.3.1 (Multiple G.711.0 Output Frame per RTP Payload Considerations), a given 
G.711.0 encoder could choose to encode 20 ms of input G.711 symbols as: 1) a 
single 20 ms G.711.0 frame, 2) two 10 ms G.711.0 frames, or 3) any 
combination of 5 ms or 10 ms G.711.0 frames. The decision criteria are NOT 
SPECIFIED in the ITU-T G.711.0 standard; a G.711.0 encoder could choose based 
on: 1) which encoding resulted in fewer bits, 2) simple operation such 
as always using 20 ms G.711.0 frames, or 3) any other criteria of its choosing. 
Thus the encoding process is NOT DETERMINISTIC in how many G.711.0 frames 
represent a given ptime of G.711 symbols.

[Aside: Using a 20 ms ptime example, there could be 1, 2, 3 or 4 G.711.0 frames 
in an RTP payload, in any one of six combinations: [20ms], [10ms:10ms], 
[10ms:5ms:5ms], [5ms:10ms:5ms], [5ms:5ms:10ms], [5ms:5ms:5ms:5ms].]
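
The six combinations in the aside can be reproduced with a short enumeration. 
This is only an illustrative sketch: it assumes 8000 samples per second, so 
that the frame lengths of 40, 80, 160, 240 and 320 G.711 symbols correspond to 
5, 10, 20, 30 and 40 ms, and the function name is hypothetical.

```python
# Frame durations in ms for G.711.0 frames of 40, 80, 160, 240 and 320
# G.711 symbols, ASSUMING a sampling rate of 8000 samples per second.
FRAME_MS = (5, 10, 20, 30, 40)


def frame_combinations(ptime_ms, frame_ms=FRAME_MS):
    """Return every ordered sequence of frame durations summing to ptime_ms."""
    if ptime_ms == 0:
        return [()]
    combos = []
    for duration in frame_ms:
        if duration <= ptime_ms:
            for rest in frame_combinations(ptime_ms - duration, frame_ms):
                combos.append((duration,) + rest)
    return combos


combos_20ms = frame_combinations(20)
for combo in combos_20ms:
    print(":".join(f"{d}ms" for d in combo))
```

The ordering of frames matters here because each ordering is a different 
on-the-wire payload, which is why three of the six entries use the same frame 
sizes.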

Thus, it is important to note that the >>G.711.0 STANDARD<< only specifies the 
encoding of an individual input G.711 frame (which can only have lengths of 40, 
80, 160, 240 or 320 G.711 symbols) to a valid G.711.0 frame.

The authors of this draft assumed that the G.711.0 compressor/encoder provider 
has already made the encoding decision on the number of G.711.0 frames 
INDEPENDENT of the decompressor/decoder and OUTSIDE any sender-side RTP payload 
processing. That is, the G.711.0 compressor/encoder simply passes its result 
(any of the combinations above) to the G.711.0 RTP layer at the sender to be 
incorporated into the G.711.0 payload. The RTP layer could then choose to add 
padding octets (0x00) to form the final G.711.0 payload.

From that perspective, the co-authors of the draft believed what was important 
for the draft was "what could be on-the-wire". However, since the ITU-T 
G.711.0 standard only specifies the individual G.711 frame to G.711.0 mapping, 
there is a benefit in explicitly calling out the possible "payload encoding 
process" in this section (4) as well.

Proposed Action: If my co-authors agree, I could write a very small section 
titled "G.711.0 RTP Payload Encoding Process" (inserted between the 
present 4.2.2 and 4.2.3). This paragraph-long section would refer back to 
Section 3.3.1 and remind the implementer that they can - at their option - 
choose to use any of the allowable encoding possibilities described in it. I 
think David is correct: we assumed that some entity PURPOSELY NOT defined by 
the G.711.0 standard (the provider of the "G.711.0 compressor/encoder") 
already made those decisions, and an explicit definition of that decision is 
not specified anywhere in any SDO document (so why not here?). Indeed, any 
"standard G.711.0 encoder" offered by a vendor would likely have that 
functionality within it (so an RTP implementer wouldn't need to know it 
either). I could also remind the reader that one could use a single G.711.0 
frame per ptime (if a G.711.0 frame supported that ptime) for the least 
complicated encoding case. Would that work David? Would that work co-authors?

\end {Reply to [A]}

[B] The G.711.0 frame format is not specified here, making it very difficult to 
figure out what's going on when G.711.0 frames are concatenated.  A specific 
example is that the concept of a "prefix code" that occurs at the start of a 
G.711.0 frame is far too important to be hidden in step H5 of the decoding 
algorithm in Section 4.2.3.

\begin {Reply to [B]}

We welcome comments on how to improve this section, as it is complicated. We 
did attempt to describe only what is necessary for understanding.

At the beginning of Section 4.2.3 we IMMEDIATELY reference the ITU-T G.711.0 
document - as it is that document that describes how to "decode a G.711.0 
bit-stream". We really want the reader needing to know the details to go there 
first. Indeed, one could provide the entire G.711.0 payload to the G.711.0 bit 
stream decoder in the ITU-T G.711.0 reference code, obtain all the 
uncompressed G.711 samples in the RTP payload, and be finished without knowing 
anything in this section.

The bit-stream decoder in the ITU-T reference code was defined to parse the 
individual compressed G.711.0 frames. However the G.711.0 >>STANDARD ITSELF<< 
defines only the mapping between the 40, 80, 160, 240 or 320 G.711 symbols 
presented to it and the G.711.0 frame produced from those 40, 80, 160, 240 or 
320 samples (i.e., only Section 3.3).

In other words, someone designing a G.711.0 encoder could choose how to 
partition the uncompressed G.711 symbols into groups of 40, 80, 160, 240 or 320 
samples and then individually encode them into individual G.711.0 frames as per 
my reply to [A].

Any arbitrary value corresponding to a valid "G.711.0 prefix code" is NOT 
unique (or otherwise special) in that it can appear anywhere within a 
G.711.0 frame; however, a given value for a prefix code DOES have a unique 
meaning >>TO THE G.711.0 DECODER<< (not the RTP machinery) when it is present 
at the beginning of a G.711.0 frame.

The mention of the prefix code (with immediate reference back to the ITU-T 
specification, I might add) was simply side information conveyed to the reader 
for purposes of understanding. The G.711.0 decoder actually "reads it" and then 
uses it to know how many source G.711 symbols to produce (in this case exactly 
M G.711 samples). The only thing the G.711.0 RTP implementer needs to know is 
that the G.711 sample buffer returned by the G.711.0 decoder will contain 
exactly M samples of G.711.

To be precise, the ITU-T specified G.711.0 decoder returns not only the samples 
themselves but also the number of samples, M, upon its exit (we were not 100% 
clear on this - fix proposed below). The value of M is important to the RTP 
decoding process; the value, structure or meaning of "prefix code" isn't. The 
only exception is that 0x00 has a special meaning when it appears where a 
prefix code might otherwise be expected.

To accommodate padding, 0x00 may be placed anywhere between the encoded G.711.0 
frames (we only recommend that any desired padding be placed at the end of the 
RTP payload). But to convey this "0x00" for padding, we needed to describe that 
0x00 could not be a valid prefix code. If it were not for the desire for 
padding, we would not have even mentioned that a "prefix code" existed in a 
G.711.0 frame.

In the text we mention that a "0x00" where a prefix code is expected in a 
G.711.0 bit stream is "silently ignored" by a G.711.0 frame decoder.

The mention of the prefix code was only for general information of what the 
G.711.0 decoder actually does (generally how it decodes the frame and that 
"0x00" isn't a valid prefix code) and what is expected by the RTP machinery 
when the G.711.0 decoder is finished decoding (the value of M and the M 
individual G.711 symbols). 
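
The division of labor described above (the RTP layer skips 0x00 padding and 
otherwise only consumes the samples and the value of M that the frame decoder 
returns) can be sketched as follows. This is an illustration under stated 
assumptions, not the ITU-T reference code: decode_frame is a hypothetical 
callable standing in for a real G.711.0 frame decoder, and toy_decode_frame 
is a made-up format used only so that the sketch runs.

```python
def decode_payload(payload, decode_frame):
    """Walk an RTP payload, skipping 0x00 padding and decoding each frame.

    decode_frame(payload, offset) is a HYPOTHETICAL stand-in for a G.711.0
    frame decoder. It must return (samples, m, octets_consumed), where m is
    the number of G.711 symbols the frame produced.
    """
    samples = bytearray()
    offset = 0
    while offset < len(payload):
        if payload[offset] == 0x00:
            # 0x00 is not a valid prefix code, so it is treated as padding.
            offset += 1
            continue
        frame_samples, m, consumed = decode_frame(payload, offset)
        if len(frame_samples) != m or consumed <= 0:
            raise ValueError("frame decoder returned an inconsistent result")
        samples += frame_samples
        offset += consumed
    return bytes(samples)


def toy_decode_frame(payload, offset):
    """Toy format (NOT G.711.0): one length octet n, then n literal samples."""
    m = payload[offset]
    body = payload[offset + 1 : offset + 1 + m]
    if len(body) != m:
        raise ValueError("truncated frame")
    return body, m, 1 + m


# Two toy frames with padding between them and at the end of the payload.
payload = bytes([2, 0xAA, 0xBB, 0x00, 1, 0xCC, 0x00, 0x00])
decoded = decode_payload(payload, toy_decode_frame)
```

Note that the loop never interprets a prefix code itself; it only tests for 
0x00 where a frame would otherwise start.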

Summary: The interested reader desiring knowledge of how to decode a  G.711.0 
bit stream should really read the ITU-T document first; that is why we put the 
reference to the "ITU-T G.711.0 Reference code" as the FIRST sentence in 
Section 4.2.3. They don't need to know what a "prefix code" is other than it is 
used by the G.711.0 decoder to know how many samples (M) it will produce and 
that the value of M will be returned by the G.711.0 decoder.

Proposed Action: I would suggest the following change in H5 to make this 
clearer:
From: The G.711.0 decoder will produce exactly M G.711 source symbols.
To: Then the ITU-T specified G.711.0 decoder will produce exactly M G.711 
source symbols and return both the symbols (in a buffer up to 321 octets in 
length if the in-place ITU-T reference code is used) and the value of M upon 
exit.

That information - the samples and the value of M - is the only thing the 
reader needs to know.

Does that work for you, David?

\end {Reply to [B]}

[C] The discussion of use of the SDP ptime parameter is spread out and 
imprecise (is SDP REQUIRED?, when is ptime REQUIRED, RECOMMENDED, or 
recommended? - it's not obvious).

A specific example is that this sentence in Section 4.2.4 is an invitation to 
interoperability problems ("could infer" - how is that done and where do the 
inputs to that inference come from?):

   Similarly, if the number of
   channels was not known, but the payload "ptime" was known, one could
   infer (knowing the sampling rate) how many G.711 symbols each channel
   contained; then with this knowledge determine how many channels of
   data were contained in the payload.

I would suggest that a subsection be added, possibly at the end of Section 3, 
to gather/summarize all of the relevant ptime discussion in one place.  I 
suspect that the contents of this draft are mostly correct wrt ptime, but it's 
hard to figure out what's going on from the current spread-out text.  It looks 
like "ptime" could provide a cross-check on correctness of G.711.0 decoding - 
see minor issue [G] below.

This major issue [C] is independent of the relationship between this draft and 
the G.711.0 spec.

\begin {Reply to [C]}

We underspecified the use of SDP  on purpose, but I also agree that some text 
on why we wish to leave it underspecified could be useful. In Section 5 we 
simply say "parameters that may be used to configure [G.711.0 RTP 
transmission]". Perhaps the MAY should be capitalized? Or more text?

As you know and appreciate, one could put an arbitrary number of G.711.0 frames 
in a G.711.0 RTP payload and the decoder really won't know how many G.711 
samples were compressed in that payload until it decodes the entire payload.

Point A: For systems that use SDP and have specified a ptime (IANA registration 
for ptime is as an OPTIONAL parameter per WG agreement), a check can be 
performed to see if the required number of G.711 samples is present.

Point B: For systems that use SDP and have not specified ptime - the payload 
can still be decoded. In this case there is no a priori expectation on the 
number of G.711 symbols contained within the G.711.0 RTP payload and thus no 
check is possible.

Point C: For systems that use SDP we RECOMMEND that ptime SHOULD be used (see 
IANA registration text). The reason is that such a check can be made!

All three points (A, B & C) have been agreed to during previous 
meetings/discussions.
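
Points A and B, together with the channel inference in the Section 4.2.4 
passage quoted above, amount to simple arithmetic on the decoded symbol count. 
A minimal sketch follows, assuming 8000 samples per second and that every 
channel covers the same ptime; all function names are hypothetical.

```python
def expected_symbols(ptime_ms, channels=1, sample_rate=8000):
    """Number of G.711 symbols that ptime_ms of audio should decode to."""
    total = ptime_ms * sample_rate * channels
    if total % 1000:
        raise ValueError("ptime does not map to a whole number of symbols")
    return total // 1000


def payload_consistent(decoded_count, ptime_ms=None, channels=1,
                       sample_rate=8000):
    """Point A: compare against ptime. Point B: no ptime, so no check."""
    if ptime_ms is None:
        return True
    return decoded_count == expected_symbols(ptime_ms, channels, sample_rate)


def infer_channels(decoded_count, ptime_ms, sample_rate=8000):
    """Infer the channel count when ptime is known but channels is not."""
    per_channel = expected_symbols(ptime_ms, 1, sample_rate)
    if per_channel == 0 or decoded_count % per_channel:
        raise ValueError("symbol count is not a multiple of one channel")
    return decoded_count // per_channel


print(expected_symbols(20))         # symbols expected for a 20 ms ptime
print(payload_consistent(80, 20))   # mismatch: candidate for discard
print(infer_channels(320, 20))      # channels implied by 320 symbols
```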

However, some USERS of the G.711.0 payload format may wish to use the RTP 
format itself but NOT use SDP! A good example is an "in-the-middle" compression 
of a G.711 flow (into a G.711.0 flow) and a corresponding decompression of the 
G.711.0 flow back into a G.711 flow. This is possible in many network 
arrangements (e.g., enterprise to enterprise) where the compression and 
decompression endpoints know the PT corresponding to G.711.0 use within their 
administrative domain.

[Aside: At one time this RTP Payload format had both the payload definition 
(this draft) and G.711.0-specific use cases within it. Previous WG discussion 
supported the splitting out of the use-cases into a separate draft (a "G.711.0 
use case" draft). I have such an expired draft, but we agreed to defer work on 
it until after the RTP payload format was complete. Thus some elements of uses 
outside of G.711.0 running in the endpoints would be described in the other 
use-case draft.]

The SDP discussion is a little wordy, but this is a result of G.711.0 not being 
a codec, but rather a variable length, frame-based lossless 
compression/decompression. That is, G.711.0 is NOT a (sample-based or 
frame-based) codec in the usual sense that RFC 3550/3551 anticipated, but does 
require some "G.711 specific" information to be passed to it (e.g., complaw).

For the passage you quoted above, the FOLLOWING TWO SENTENCES in the draft 
provide a forward reference in the document to where the "channels" and "ptime" 
parameters are needed and referenced (Section 5.1), because we have had no need 
prior to that point in the draft to discuss use of ANY particular session 
negotiation protocol.

SDP is a dominant IETF protocol for media negotiation; but even RFC 3551 
mentions H.245 and the fact that other mapping methods are possible (including 
"no negotiation" methods). Indeed, the "in-the-middle" use case described in 
this email (and at earlier IETF Payload meetings) may or may not have any a 
priori negotiation of PT at all within an administrative domain (e.g., the 
G.711.0 PT may be a network configured parameter specific to a company network).

Proposed Action: The discussion of ptime (and the channels parameter) in this 
section is primarily for the purpose of a check. If it is any comfort, that 
paragraph has had lots of input to it previously (so you responded to a 
complicated issue). And since we have no need to describe "ptime issues" or 
session negotiation issues prior to this point (Section 4.2.4) in the 
document AND ptime isn't a required negotiation parameter AND we put a 
forward reference to Section 5.1 for  "ptime" when SDP is used, I hesitate to 
mention such an optional parameter here in Section 3.
Proposed action: No Change (the forward references are enough).

\end {Reply to [C]}

[D] Backwards compatibility.

The problem here is that it's not clear that negotiation (e.g., via SDP) is 
required.  This sentence in Section 3.1 is a particular problem:

   G.711.0, being both lossless and stateless, may also be employed as a
   lossless compression mechanism anywhere between end systems which
   have negotiated use of G.711.

That's definitely wrong.  Use of G.711.0 when only G.711 has been negotiated 
will fail to interoperate correctly.

A subsection of section 3 on negotiation and SDP usage would help here.

This major issue [D] is independent of the relationship between this draft and 
the G.711.0 spec.

\begin {Reply to [D]}

The passage you quote is in Section 3, which is "General Information and Use of 
ITU-T G.711.0 Codec", and is: 1) prior to ANY discussion of the use of G.711.0 
in RTP (or even packet networks), and 2) prior to any discussion of media 
negotiation when using RTP (e.g., SDP). Thus the context for this sentence is 
at the codec bit stream (or packet payload) level of the ITU-T codec. It stands 
on its own and is definitely correct.

When the compression of a G.711 payload to a G.711.0 payload occurs somewhere 
on the end-to-end path and the corresponding decompression from a G.711.0 
payload to a G.711 payload occurs prior to the receiving endpoint, the 
receiving endpoint doesn't know that the (lossless) compression occurred on the 
PAYLOAD (the context in this section). As mentioned previously, this is 
possible in many arrangements (in RTP) where the compression and decompression 
endpoints know the PT corresponding to G.711.0 use within their administrative 
domain (a reserved or not-used-in-their-domain PT) and desire to do this.

That is the beauty of lossless compression - the receiving endpoint doesn't 
know (or need to know) that payload compression occurred. To imply otherwise is 
to dismiss lossless compression schemes (e.g., CRTP, ECRTP, ROHC) that 
losslessly compress and decompress arbitrary parts of packets (in the case of 
CRTP/ECRTP/ROHC, the headers) in between the endpoints without the endpoints' 
explicit knowledge of the compression.

Please note that this property isn't possible with lossy *CODECS*, as the 
transcode will typically introduce some distortion which would be unknown to 
the receiving endpoint but nevertheless present. This is one of the many 
subtleties that people reading about G.711.0 miss when considering it as if it 
were a (lossy) codec - they ASSUME that G.711.0 is a TRANSCODE and not the 
lossless, STATELESS compression of the MEDIA PAYLOAD that it is.

Again, we had working group agreement (I think in Quebec) that a use-case 
document could follow this G.711.0 RTP payload format document to describe how 
to do the mapping in RTP for these "compression-in-the-middle" cases. The 
high-level summary is that you copy the G.711 RTP header verbatim into the 
G.711.0 RTP header except for the PT. I have a draft of the use-case document, 
which I let expire until this RTP payload definition is finished.
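
A rough sketch of that header mapping is below. It relies only on the RTP 
fixed-header layout from RFC 3550 (CSRC count in the low four bits of the first 
octet, X bit 0x10, marker bit plus 7-bit PT in the second octet). The function 
names, the dynamic PT value 96 and the transform callable are illustrative 
assumptions, and the sketch assumes the RTP padding (P) bit is clear.

```python
import struct


def rtp_header_length(packet):
    """Length of the RTP header, including CSRC list and header extension."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the fixed RTP header")
    length = 12 + 4 * (packet[0] & 0x0F)      # fixed header + CSRC entries
    if packet[0] & 0x10:                      # X bit: header extension present
        if len(packet) < length + 4:
            raise ValueError("truncated header extension")
        (ext_words,) = struct.unpack_from("!H", packet, length + 2)
        length += 4 + 4 * ext_words
    if len(packet) < length:
        raise ValueError("truncated RTP header")
    return length


def swap_payload(packet, new_pt, transform):
    """Copy the RTP header verbatim except for the PT; transform the payload.

    transform is a HYPOTHETICAL callable, e.g. a G.711 -> G.711.0 compressor.
    """
    if not 0 <= new_pt <= 127:
        raise ValueError("payload type must fit in 7 bits")
    header_len = rtp_header_length(packet)
    header = bytearray(packet[:header_len])
    header[1] = (header[1] & 0x80) | new_pt   # keep marker bit, replace PT
    return bytes(header) + transform(packet[header_len:])


# Example: a PCMU (PT = 0) packet with the marker bit set, re-labelled with an
# assumed dynamic PT of 96. The identity function stands in for compression.
g711_packet = bytes([0x80, 0x80, 0x00, 0x01, 0, 0, 0, 0xA0, 0, 0, 0, 1])
g711_packet += b"\xff" * 4
forwarded = swap_payload(g711_packet, 96, lambda payload: payload)
```

The decompressing side would apply the same function with the original PT and 
a decompressor as the transform.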

Proposed Action: No Change. We have more than enough words in the document to 
describe all the attributes of G.711.0 (Section 3.2) in this section of the 
document that discusses properties of the >>ITU-T specification<<.

\end {Reply to [D]}

-- Minor issues (3):

[E] Section 4.1:

   The only significant difference is that the
   payload type (PT) RTP header field will have a value corresponding to
   the dynamic payload type assigned to the flow.  This is in contrast
   to most current uses of G.711 which typically use the static payload
   assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though
   the negotiation and use of dynamic payload types is allowed for
   G.711.
 
I would change "will have" to "MUST have" and add the following sentence:

   The existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0
   content.

I suspect that this is obvious to the authors, but it'll help a reader who's 
not familiar with the importance of the difference between G.711 and G.711.0.

\begin {Reply to [E]}

Proposed Action: Happy to fix both (for the reasons given). However, please 
read my reply to [F] below, I believe the rules actually allow PT = [0|8] in 
a specific corner case (result is: MUST NOT->SHOULD NOT in your suggestion).

\end {Reply to [E]}

[F] Section 4.1:

      PT - The assignment of an RTP payload type for the format defined
      in this memo is outside the scope of this document.  The RTP
      profiles in use currently mandate binding the payload type
      dynamically for this payload format.

Good start, but not sufficient - cite the "RTP profiles currently in use" and I 
would expect those citations to be normative references.

Would that be just RFC 3551 and RFC 4585 (both are already normative 
references), or are there more RTP profiles?

\begin {Reply to [F]}

I think that wording was suggested somewhere along the way, but I can't 
remember who provided it. It is boilerplate on many RTP payload formats, but 
others (such as recent RFC 7310) are as simple as " PT - A dynamic payload 
type; MUST be used" (which appears to be incorrect use of the semicolon, but I 
digress). In any event, major edits of the first paragraph of 4.1 were made to 
include the possibility of G.711 not having PT = 0 or PT =8 for exceptional 
cases (so not even static payload types can be automatically assumed).

According to IANA 
(http://www.iana.org/assignments/rtp-parameters/rtp-parameters.xhtml#rtp-parameters-2) 
and RFC 3551, the FINAL set of static payload assignments is contained in 
Tables 4 and 5 of RFC 3551.

And, according to RFC 3551, a new codec that does not have a static type 
SHOULD first attempt to use a dynamic PT - but there are exceptions cited 
(e.g., dynamic PT exhaustion). Even codecs that have a static PT assigned MAY 
negotiate a different PT (e.g., a dynamic PT). And new codecs (after exhaustion 
of dynamic and other types) MAY actually use a static PT not presently in use 
(at least I recall someone stated so in a meeting). So it appears there are a 
lot of exception cases that preclude knowing (with 100% certainty) any 
particular PT mapping.

And, according to RFC 3551, dynamic payload types SHOULD NOT be used without a 
well-defined mechanism to indicate the mapping - SDP or ITU-T H.323/H.245 
negotiation or other pre-arrangement are cited (e.g., PT defined within a 
certain scope or administrative domain) - and a well-defined RTP payload format 
(this draft).

Thus, not much can be said about the assignment other than what was stated. I 
could put (yet another) RFC 3551 reference in this paragraph but it would 
provide no more guidance than already provided a few paragraphs earlier (which 
references RFC 3551). At a minimum I think I should say that PT of 0 and 8 
SHOULD NOT be used for G.711.0.

Re: "PTs currently in use". It is hard to differentiate the profiles "currently 
IANA registered" from those "currently in use". That is, what is the definition 
of "currently in use" when you don't have insight into the 
registered-but-not-in-use profiles (e.g., historic codecs)?

Proposed Action: I think we should both defer to the Payload WG chairs on 
this - as they can be expected to know all the exceptions AND the present 
state of verbiage that goes on the "PT -" line of an IANA media registration 
coming from the Payload WG. Ali and Roni: Please suggest alternate text if 
you desire, I will accommodate; otherwise I will leave it as is.

\end {Reply to [F]}

[G] Framing errors

Section 4 generally assumes that the G.711.0 decoder gets handed frames 
generated by the G.711.0 encoder and can't get disaligned.  I'm not convinced 
that this "just works" based on the text in the draft - major issue [B] is a 
significant reason why, and explaining that should help.

Some discussion should be added on why the G.711.0 decoder can't get disaligned 
wrt frame boundaries, or what the G.711.0 decoder will do when it discovers 
that it wasn't handed a complete G.711.0 frame.  For example, this error case 
and how to deal with it are not covered by the algorithm in Section 4.2.3.

\begin {Reply to [G]}

The actual buffer handling to/from the G.711.0 encoding/decoding logic is 
pretty straightforward, so I really doubt that an encoder that has been 
exercised sufficiently would pass the G.711.0 frame(s) to the RTP payload 
incorrectly, or the converse.

However, you are correct in that we should always specify what happens when 
things don't work as expected. Thanks for the catch.

Consistent with an "error condition catch" Richard Barnes made in 4.2.4 - we do 
have some information for when an encoder and/or decoder error resulted in an 
unexpected number of G.711 decoded symbols.

Assuming ptime was signaled, we expect the number of G.711 decoded symbols to 
equal what we expect from the ptime value at the receiver/decoder. If it 
doesn't then "we SHOULD discard the packet".

[Aside: We discussed the SHOULD vs MUST on the decoder, the SHOULD won. This is 
because a given system design might temporarily send a packet inconsistent with 
the ptime previously signaled but which is structurally correct (has the 
correct decoded G.711). Such a system might not desire to discard such a packet 
(as it might appear otherwise correct in the number of samples decoded). 
However, lacking such a design the usual operational choice is to discard the 
packet. Thus a SHOULD.]

For the encoder, the length of the G.711.0 RTP payload - excluding padding - 
should never be greater than the number of input G.711 symbols plus the number 
of G.711.0 frames (as a given G.711.0 frame can be no greater than one octet 
more than the number of source symbols). If the number of frames is known to 
the RTP layer (it may not be) and this constraint is not met, the source packet 
MUST be discarded.

[Aside: We did NOT discuss the SHOULD vs MAY on the encoder. In my opinion, the 
MUST is more appropriate - as if the constraint is violated, you KNOW something 
is wrong.]
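
The encoder-side constraint above is a one-line bound. The sketch below is 
only an illustration with a hypothetical function name; it assumes the RTP 
layer knows the number of frames and that the payload length is measured with 
padding excluded.

```python
def encoder_output_plausible(payload_octets, input_symbols, frame_count):
    """Check the bound: each frame is at most one octet longer than its input.

    payload_octets excludes any 0x00 padding; input_symbols is the number of
    G.711 symbols handed to the encoder; frame_count is the number of
    G.711.0 frames placed in the payload.
    """
    return payload_octets <= input_symbols + frame_count


# 160 input symbols (20 ms at an assumed 8000 samples per second):
print(encoder_output_plausible(161, 160, 1))  # one frame, worst-case size
print(encoder_output_plausible(162, 160, 2))  # two frames, worst-case size
print(encoder_output_plausible(162, 160, 1))  # too long: something is wrong
```

The same bound is consistent with the 321-octet buffer mentioned in the reply 
to [B] for a single 320-symbol frame.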

Proposed Action: Add two sentences similar to the above to the end of Section 
4.2.2+ (proposed earlier new section on Encoding Process) and Section 4.2.3 
(Decoding Process).

\end {Reply to [G]}

-- Nits/editorial comments:

Section 3.2:

   A6  Bounded expansion: Since attribute A2 above requires G.711.0 to
         be lossless for any payload, by definition there exists at
         least one potential G.711 payload which must be
         "uncompressible".

The "by definition" statement assumes that every possible bit string is a valid 
G.711 input.  If that is correct, it should be explicitly stated.

\begin{nit}

Yes, because Attribute A2, referenced within the quoted sentence, says as much.

Every value of a G.711 symbol (2^8) corresponds to a discrete value. There is 
no restriction on a sample-to-sample (octet to next octet) basis assumed in 
the G.711 encoding (no "illegal transitions"). Lastly, some "DS0 channels" 
assume that all the bits can be used for arbitrary digital data (the so-called 
ISDN 64 kbps B-channel). Thus it is widely known that if something is random 
and can take ANY value of ANY possible concatenation of octets, there is no 
redundancy to be exploited in the concatenation for the purposes of 
deterministic compression for all possible inputs - there must exist at least 
one payload that is not compressible.

This is an assertion from the G.711.0 ITU-T document that anyone who cares to 
verify can go to the ITU-T, look up G.711 and instantly know that all the 
values are "assigned" and there are no illegal transitions specified; thus 
there is no redundancy to be exploited. I hesitate to insult my readers by 
giving them any more detail than Attribute A2 says.
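
For readers who do want the counting argument spelled out, here is a small 
sketch, assuming (per the paragraphs above) that every one of the 256^n octet 
strings of length n is a valid G.711 input. A lossless mapping must give each 
input a distinct output, and there are fewer strings shorter than n octets than 
there are inputs, so at least one output cannot be shorter than its input. The 
function name is hypothetical.

```python
def strings_shorter_than(n):
    """Count the distinct octet strings of length 0, 1, ..., n - 1."""
    return sum(256 ** k for k in range(n))


# For frame lengths of 40 to 320 symbols, compare the number of possible
# inputs with the number of strictly shorter candidate outputs.
for n in (40, 80, 160, 240, 320):
    inputs = 256 ** n
    shorter_outputs = strings_shorter_than(n)
    # Fewer shorter strings than inputs: not every input can shrink.
    assert shorter_outputs < inputs
```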

Proposed Change: None needed. However if you really feel strongly on this, I 
could agree to something like the following ... or anything of your choosing 
that reads better and is accurate. Let me know what you want.

   A6  Bounded expansion: Since attribute A2 above requires G.711.0 to
         be lossless for any payload (which could consist of any
         concatenation of octets, each octet spanning the entire space
         of 2^8 values), by definition there exists at least one
         potential G.711 payload which must be "uncompressible".

\end {nit}

   A8  Low Complexity: Less than 1.0 WMOPS average and low memory
         footprint (~5k octets RAM, ~5.7k octets ROM and ~3.6 basic
         operations) [ICASSP] [G.711.0].

Expand WMOPS on first use, and check for other acronyms that need to be 
expanded on first use.

\begin{nit}

Note: The references define what a WMOPS is.

Recommended Action: Since this is the only use of WMOPS, I will expand it 
there (Weighted Million Operations Per Second) and skip the abbreviation 
entirely.

RAM and ROM are the only other unexpanded abbreviations. I trust that these 
don't qualify as "needed", as not even the ITU-T document expands them.

Recommended Action: No change to RAM and ROM. It is a reasonable expectation 
that anyone reading this document will know those two based on context.

\end {nit}

Section 3.3:

   Since the G.711.0 output frame is "self-describing", a G.711.0
   decoder (process "B") can losslessly reproduce the original G.711
   input frame with only the knowledge of which companding law was used
   (A-law or mu-law).

"companding law"?  The term "compression law" is used elsewhere in this draft, 
including two paragraphs earlier in this section - I suggest using "compression 
law" consistently.

\begin{nit}

Good catch.

The law that both forms of G.711 use (mu or A) is that of an input-to-output 
compander (http://en.wikipedia.org/wiki/Companding), where the output format 
is discretized.

I will change the one use of "compression law" to "companding law" in its 
singular use in Section 3.3 (due to G.711 being a companding, sample-based 
codec).

\end {nit}
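
To illustrate what "companding" means in the reply above, here is a minimal sketch of the continuous mu-law curve in Python. This is an assumption-laden illustration, not G.711 itself: G.711 uses a segmented, piecewise-linear approximation of this curve and emits 8-bit codes, whereas the hypothetical functions below work on floating-point values in [-1, 1] and are not bit-exact with any codec.

```python
import math

MU = 255.0  # mu-law parameter used for 8-bit mu-law telephony


def mu_law_compress(x: float) -> float:
    """Continuous mu-law compression of a sample x in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress for y in [-1, 1]."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


if __name__ == "__main__":
    for sample in (0.001, 0.01, 0.1, 1.0):
        compressed = mu_law_compress(sample)
        # Small amplitudes are boosted before quantization; expansion undoes it.
        print(sample, compressed, mu_law_expand(compressed))
```

The compress-then-expand pair is what makes the law a "compander"; in a real codec the compressed value is quantized (discretized) between the two steps.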

Section 6:

   We note that something must be stored for any G.711.0 frames that not
   received at the receiving endpoint, no matter what the cause.

"that not" -> "that are not"

\begin{nit}
Thanks. Will do.
\end {nit}

Section 6.2:

   An entire frame of value 0++ or 0-- is expected to be
   extraordinarily rare when the frame was in fact generated by a
   natural signal (on the order of one in 2^{ptime in samples, minus
   one}), as analog inputs such as speech and music are zero-mean and
   are typically acoustically coupled to digital sampling systems.

This doesn't explain where the 2^{ptime in samples, minus one} order of 
magnitude estimation came from.  What assumption(s) is(are) being made about 
randomness and distribution thereof in the analog input?
It might be simpler to delete the parenthesized text.

\begin{nit}
Agreed. Consider the parenthetical deleted.
\end {nit}

Section 11: Congestion Control

This section is mis-named, as it basically (correctly) says that there is 
nothing useful that can be done in G.711.0 compression to respond to 
congestion.  I would retitle this to "Congestion Considerations".

\begin{nit}
I would, but the requirements for new RTP payload formats say that there MUST 
be a section named "Congestion Control" in all newly approved RTP Payload 
formats!

You are, of course, correct - as the text in this section basically says there 
is no explicit way to regulate the bit-rate for the purposes of congestion 
control.
\end {nit}

Are there opportunities to respond to congestion elsewhere, e.g.
dynamically change the sampling rate?  If so, a sentence mentioning them would 
be good to add.

\begin{nit}
I know of no use of G.711 that changes the sampling frequency from the default 
- although that is allowed in the SDP (as G.711 is a sample-based codec). The 
8000 samples per second is hard-coded in many voice implementations.

Since the whole purpose of G.711.0 is to send G.711 losslessly with lower 
bandwidth, the use of G.711.0 could be triggered by G.711 negotiated sessions 
looking for a lower bandwidth solution. Although we could mention this 
(obvious) fact, the guidelines for this section instruct me to discuss things 
that can be done with the "codec" this payload format describes for the 
purposes of congestion control. This is yet another artifact of the new RTP 
guidelines not anticipating a lossless and stateless compression technique 
being defined for RTP. We broke a lot of new ground here, thanks for 
wading through it!

Proposed Action: None. I would not have this section in the document except 
that the new rules for RTP Payload definitions mandate such a section exist.
\end {nit}

idnits 2.13.01 didn't find anything to complain about ;-).

--- Selected RFC 5706 Appendix A Q&A for OPS-Dir review ---

Most of these questions are N/A as this draft specifies a payload format for 
RTP, so most of the operations and management concerns are wrt RTP and SDP.

A.1.3.  Has the migration path been discussed?

No, see major issue [D] above.

A.1.4   Have the Requirements on other protocols and functional
       components been discussed?

Only in part - major issues [C] and [D] call out shortcomings in the discussion 
of SDP interactions.

A.1.8   Are there fault or threshold conditions that should be reported?

Yes, the likelihood and consequences of framing problems at the G.711.0 decoder 
(decoder is handed octet strings that are not G.711.0 frames generated by the 
encoder) should be discussed.  Major issue [B] needs to be resolved first, and 
then see minor issue [G].

A.2.  Management Considerations

I would expect that the media type registration (Section 5.1 of this draft) 
results in this new G.711.0 media type being usable in any relevant management 
model and/or framework that has some notion of media type.

A.3 Documentation

By itself, this compressed payload format does not look like a likely source of 
significant operational impacts on the Internet.

The shepherd's writeup indicates that an implementation exists.

Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA  01748
+1 (508) 293-7953             FAX: +1 (508) 293-7786
david(_dot_)black(_at_)emc(_dot_)com        Mobile: +1 (978) 394-7754
----------------------------------------------------


