David,
The authors of the G.711.0 RTP Payload Draft thank you for the comments below.
It is clear from the caliber of your comments that you spent a lot of time on
this.
G.711.0 being a variable length stateless and lossless compression for G.711 (a
sample-oriented encoding) causes a lot of confusion to those who occasionally
think of it as "a codec" instead of the lossless compression mechanism it is.
Thus, this was a hard payload format to write due to some of the pre-conceived
notions of what G.711.0 is and an even harder one for someone to review (as it
is not the sample-based or fixed-length frame-based encoding that the authors of
RFC 3550/3551 assumed/envisioned).
So, I really do thank you for the effort here, David. You must have drawn the
short straw.
My responses to your comments/questions are made in-line below (my comments with
"\begin {Reply to [issue]}" and my proposed fixes within these are highlighted
with ">>").
Regards,
Michael A. Ramalho, Ph.D.
-----Original Message-----
From: Black, David [mailto:david(_dot_)black(_at_)emc(_dot_)com]
Sent: Wednesday, October 22, 2014 11:44 AM
To: Michael Ramalho (mramalho); Paul E. Jones
(paulej(_at_)packetizer(_dot_)com);
harada(_dot_)noboru(_at_)lab(_dot_)ntt(_dot_)co(_dot_)jp;
muthu(_dot_)arul(_at_)gmail(_dot_)com; lei(_dot_)miao(_at_)huawei(_dot_)com;
General Area Review Team (gen-art(_at_)ietf(_dot_)org);
ops-dir(_at_)ietf(_dot_)org
Cc: ietf(_at_)ietf(_dot_)org; payload(_at_)ietf(_dot_)org; Black, David
Subject: Gen-ART and OPS-Dir review of draft-ietf-payload-g7110-03
This is a combined Gen-ART and OPS-DIR review. Boilerplate for both follows ...
I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART,
please see the FAQ at:
<http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.
Please resolve these comments along with any other Last Call comments you may
receive.
I have reviewed this document as part of the Operational directorate's ongoing
effort to review all IETF documents being processed by the IESG. These
comments were written primarily for the benefit of the operational area
directors.
Document editors and WG chairs should treat these comments just like any other
last call comments.
Document: draft-ietf-payload-g7110-03
Reviewer: David Black
Review Date: October 22, 2014
IETF LC End Date: October 27, 2014
IESG Telechat date: October 30, 2014
Summary: This draft is on the right track, but has open issues
described in the review.
Process note: This is the second draft that I've reviewed recently that has
been scheduled for an IESG telechat almost immediately following the end of
IETF Last Call. The resulting overlap of IETF LC with IESG Evaluation can
result in significant last-minute changes to the draft when issues are
discovered during IETF LC.
This draft describes an RTP payload format for carrying G.711.0 compressed
G.711 voice. The details of G.711.0 compression are left to the ITU-T G.711.0
spec (which is fine), and this draft focuses on how to carry the compressed
results in RTP and conversion to/from uncompressed G.711 voice at the
communication endpoints.
I found a few major issues and a couple of minor ones, although a couple of the
major issues depend on a meta-issue: what the intended relationship of this
draft should be to the ITU-T G.711.0 spec.
In general, I expect IETF RFCs to be stand-alone documents that make sense on
their own, although one may need to read related documents to completely
understand what's going on. For this draft, I would expect the actual
compression/decompression algorithms to be left to the ITU-T spec, and this
draft to stand on its own in explaining how to deploy G.711.0
compression/decompression with RTP. If that expectation is incorrect, and this
draft is effectively an RTP Annex to G.711.0 that must be read in concert with
G.711.0, then the first two major issues below are not problems as they should
be obvious in the G.711.0 spec, although the fact that this draft is
effectively an Annex to G.711.0 should be stated. Otherwise, those two major
issues need attention.
-- Major Issues (4):
[A] Section 4.2.3 specifies a detailed decoding algorithm covering how G.711.0
decompression interacts with received RTP G.711.0 payloads.
A corresponding encoding algorithm specification is needed on the sending side
for G.711.0 compression interaction with RTP sending.
The algorithm will have some decision points in it that cannot be fully
specified, e.g., time coverage of the generated G.711.0 frames.
\begin {Reply to [A]}
I believe you are correct. As with everything associated with G.711.0, a
longer answer is required.
At the sender end, the G.711.0 encoder itself has decided exactly how it
desires to send compressed G.711.0. As an example outlined earlier in Section
3.3.1 (Multiple G.711.0 Output Frame per RTP Payload Considerations), a given
G.711.0 encoder could choose to encode 20ms of input G.711 symbols as: 1) a
single 20ms G.711.0 frame, or 2) as two 10 ms G.711.0 frames, or 3) any
combination of 5 ms or 10 ms G.711.0 frames. The decision criteria are NOT
SPECIFIED in the ITU-T G.711.0 standard; a G.711.0 encoder could choose based
on: 1) which encoding resulted in fewer bits, 2) simple operation such
as always using 20 ms G.711.0 frames, or 3) any other criteria of its choosing.
Thus the encoding process is NOT DETERMINISTIC in how many G.711.0 frames could
represent a given ptime of G.711 symbols.
[Aside: Using a 20 ms ptime example, there could be 1, 2, 3 or 4 G.711.0 frames
in a RTP payload in any one of six combinations in a G.711.0 payload ([20ms],[
10ms:10ms],[10ms:5ms:5ms], [5ms:10ms:5ms], [5ms:5ms:10ms],[5ms:5ms:5ms:5ms]).]
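The six combinations in the aside above can be reproduced mechanically. The following is only a sketch, assuming the default 8000 samples per second (so that the allowed input frames of 40, 80, 160, 240 and 320 G.711 symbols correspond to 5, 10, 20, 30 and 40 ms); the function name frame_partitions is a hypothetical name for illustration and is not taken from the draft or the ITU-T reference code.

```python
from typing import List, Tuple

# Frame durations in ms, ASSUMING 8000 samples/s, for the allowed G.711.0
# input frame sizes of 40, 80, 160, 240 and 320 G.711 symbols.
FRAME_DURATIONS_MS = (5, 10, 20, 30, 40)


def frame_partitions(ptime_ms: int) -> List[Tuple[int, ...]]:
    """Return every ordered sequence of frame durations summing to ptime_ms."""
    if ptime_ms == 0:
        return [()]  # exactly one way to fill nothing: no frames
    results = []
    for duration in FRAME_DURATIONS_MS:
        if duration <= ptime_ms:
            # Choose the first frame, then fill the remaining time recursively.
            for rest in frame_partitions(ptime_ms - duration):
                results.append((duration,) + rest)
    return results


if __name__ == "__main__":
    for combo in frame_partitions(20):
        print(":".join(f"{d}ms" for d in combo))
```

An encoder is free to pick any one of these per payload; the receiver does not need to know which one was chosen in advance.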
Thus, it is important to note that the >>G.711.0 STANDARD<< only specifies the
encoding of an individual input G.711 frame (which can only have lengths of 40,
80, 160, 240 or 320 G.711 symbols) to a valid G.711.0 frame.
The authors of this draft assumed that the G.711.0 compressor/encoder provider
has already made the encoding decision on the number of G.711.0 frames
INDEPENDENT of the decompressor/decoder and OUTSIDE any sender-side RTP payload
processing. That is, the G.711.0 encoder just passed the result (any of the
combinations above) the compressor/encoder made to the G.711.0 RTP layer at the
sender to be incorporated into the G.711.0 payload. The RTP layer could then
choose to add padding octets (0x00) to form the final G.711.0 payload.
From that perspective, the co-authors of the draft believed what was important
for the draft was "what could be on-the-wire". However, since the ITU-T
G.711.0 standard only specifies the individual G.711 frame to G.711.0 mapping,
there is a benefit in explicitly calling out the possible "payload encoding
process" in this section (4) as well.
Proposed Action: If my co-authors agree, I could write a very small section
titled "G.711.0 RTP Payload Encoding Process" (inserted in-between the
present 4.2.2 and 4.2.3). This paragraph-long section will reverse reference
Section 3.3.1 and remind the implementer that they can - at their option -
choose to use any of the allowable encoding possibilities described in it. I
think David is correct, we assumed that some entity PURPOSELY NOT defined by
the G.711.0 standard (the provider of the "G.711.0 compressor/encoder")
already made those decisions and that explicit definition of that decision is
not specified anywhere in any SDO document (so why not here?). Indeed, any
"standard G.711.0 encoder" offered by a vendor would likely have that
functionality within it (so a RTP implementer wouldn't need to know it
either). I could also remind the reader that one could use a single G.711.0
frame per ptime (if a G.711.0 frame supported that ptime) for the least
complicated encoding case. Would that work David? Would that work co-authors?
\end {Reply to [A]}
[B] The G.711.0 frame format is not specified here, making it very difficult to
figure out what's going on when G.711.0 frames are concatenated. A specific
example is that the concept of a "prefix code" that occurs at the start of a
G.711.0 frame is far too important to be hidden in step H5 of the decoding
algorithm in Section 4.2.3.
\begin {Reply to [B]}
We welcome comments on how to improve this section, as it is complicated. We
did attempt to describe only what is necessary for understanding.
At the beginning of Section 4.2.3 we IMMEDIATELY reference the ITU-T G.711.0
document - as it is that document that describes how to "decode a G.711.0
bit-stream". We really want the reader needing to know the details to go there
first. Indeed, the entire G.711.0 payload could be provided to the G.711.0 bit
stream decoder in the ITU-T G.711.0 reference code to obtain all the
uncompressed G.711 samples in the RTP payload, and one would be finished
without knowing anything in this section.
The bit-stream decoder in the ITU-T reference code was defined to parse the
individual compressed G.711.0 frames. However the G.711.0 >>STANDARD ITSELF<<
defines only the mapping between the 40, 80, 160, 240 or 320 G.711 symbols
presented to it and the G.711.0 frame produced from those 40, 80, 160, 240 or
320 samples (i.e., only Section 3.3).
In other words, someone designing a G.711.0 encoder could choose how to
partition the uncompressed G.711 symbols into groups of 40, 80, 160, 240 or 320
samples and then individually encode them into individual G.711.0 frames as per
my reply to [A].
Any arbitrary value corresponding to a valid "G.711.0 prefix code" is NOT
unique (or otherwise special) in that it can appear anywhere within a
G.711.0 frame; however a given value for a prefix code DOES have a unique
meaning >>TO THE G.711.0 DECODER<< (not the RTP machinery) when it is present
at the beginning of a G.711.0 frame.
The mention of the prefix code (with immediate reference back to the ITU-T
specification I might add) was simply side information conveyed to the reader
for purposes of understanding. The G.711.0 decoder actually "reads it" and then
uses it to know how many source G.711 samples to produce (in this case exactly
M G.711 samples). The only thing the G.711.0 RTP implementer needs to know is
that the G.711 sample buffer returned by the G.711.0 decoder will contain
exactly M samples of G.711.
To be precise, the ITU-T specified G.711.0 decoder returns not only the samples
themselves, but the number of samples, M, upon its exit (we were not 100% clear
on this - fix proposed below). The value of M is important to the RTP decoding
process; the value, structure or meaning of "prefix code" isn't. The only
exception is that 0x00 has a special meaning when it appears where a prefix
code might otherwise be expected.
To accommodate padding, 0x00 may be placed anywhere between the encoded G.711.0
frames (we only recommend that any desired padding be placed at the end of the
RTP payload). But to convey this "0x00" for padding, we needed to describe that
0x00 could not be a valid prefix code. If it were not for the desire for
padding, we would not have even mentioned that a "prefix code" existed in a
G.711.0 frame.
In the text we mention that a "0x00" where a prefix code is expected in a
G.711.0 bit stream is "silently ignored" by a G.711.0 frame decoder.
The mention of the prefix code was only for general information of what the
G.711.0 decoder actually does (generally how it decodes the frame and that
"0x00" isn't a valid prefix code) and what is expected by the RTP machinery
when the G.711.0 decoder is finished decoding (the value of M and the M
individual G.711 symbols).
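To make the division of labor concrete, here is a minimal sketch of the receive-side loop, under stated assumptions: decode_frame is a hypothetical stand-in for a G.711.0 frame decoder (it is NOT the ITU-T reference code API) that returns the M decoded symbols and the number of payload octets the frame occupied. The error raised on an implausible frame length is likewise my assumption, not text from the draft.

```python
from typing import Callable, Tuple

# HYPOTHETICAL decoder interface (not the ITU-T reference API): given the
# payload and the offset of a frame's first octet, return the decoded G.711
# symbols and the number of payload octets that frame occupied.
FrameDecoder = Callable[[bytes, int], Tuple[bytes, int]]


def decode_payload(payload: bytes, decode_frame: FrameDecoder) -> bytes:
    """Decode every G.711.0 frame in a payload, ignoring 0x00 padding."""
    symbols = bytearray()
    offset = 0
    while offset < len(payload):
        if payload[offset] == 0x00:
            # 0x00 where a prefix code is expected is padding: skip it.
            offset += 1
            continue
        frame_symbols, consumed = decode_frame(payload, offset)
        if consumed <= 0 or offset + consumed > len(payload):
            # Assumption: treat an impossible frame length as a framing error.
            raise ValueError("decoder reported an invalid frame length")
        symbols += frame_symbols  # exactly M symbols for this frame
        offset += consumed
    return bytes(symbols)
```

Note that the loop never inspects the prefix code beyond testing for 0x00; everything else about the frame is the decoder's business.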
Summary: The interested reader desiring knowledge of how to decode a G.711.0
bit stream should really read the ITU-T document first; that is why we put the
reference to the "ITU-T G.711.0 Reference code" as the FIRST sentence in
Section 4.2.3. They don't need to know what a "prefix code" is other than it is
used by the G.711.0 decoder to know how many samples (M) it will produce and
that the value of M will be returned by the G.711.0 decoder.
Proposed Action: I would suggest the following change in H5 to make this
clearer:
From: The G.711.0 decoder will produce exactly M G.711 source symbols.
To: Then the ITU-T specified G.711.0 decoder will produce exactly M G.711
source symbols and return both the symbols (in a buffer up to 321 octets in
length if the in-place ITU-T reference code is used) and the value of M upon
exit.
That information - the samples and the value of M - is the only thing the
reader needs to know.
Does that work for you, David?
\end {Reply to [B]}
[C] The discussion of use of the SDP ptime parameter is spread out and
imprecise (is SDP REQUIRED?, when is ptime REQUIRED, RECOMMENDED, or
recommended? - it's not obvious).
A specific example is that this sentence in Section 4.2.4 is an invitation to
interoperability problems ("could infer" - how is that done and where do the
inputs to that inference come from?):
Similarly, if the number of
channels was not known, but the payload "ptime" was known, one could
infer (knowing the sampling rate) how many G.711 symbols each channel
contained; then with this knowledge determine how many channels of
data were contained in the payload.
I would suggest that a subsection be added, possibly at the end of Section 3,
to gather/summarize all of the relevant ptime discussion in one place. I
suspect that the contents of this draft are mostly correct wrt ptime, but it's
hard to figure out what's going on from the current spread-out text. It looks
like "ptime" could provide a cross-check on correctness of G.711.0 decoding -
see minor issue [G] below.
This major issue [C] is independent of the relationship between this draft and
the G.711.0 spec.
\begin {Reply to [C]}
We underspecified the use of SDP on purpose, but I also agree that some text
on why we wish to leave it underspecified could be useful. In Section 5 we
simply say "parameters that may be used to configure [G.711.0 RTP
transmission]". Perhaps the MAY should be capitalized? Or more text?
As you know and appreciate, one could put an arbitrary number of G.711.0 frames
in a G.711.0 RTP payload and the decoder really won't know how many G.711
samples were compressed in that payload until it decodes the entire payload.
Point A: For systems that use SDP and have specified a ptime (IANA registration
for ptime is as an OPTIONAL parameter per WG agreement), a check can be
performed to see if the required number of G.711 samples is present.
Point B: For systems that use SDP and have not specified ptime - the payload
can still be decoded. In this case there is no a priori expectation on the
number of G.711 symbols contained within the G.711.0 RTP payload and thus no
check is possible.
Point C: For systems that use SDP we RECOMMEND that ptime SHOULD be used (see
IANA registration text). The reason is that such a check can be made!
All three points (A, B & C) have been agreed to during previous
meetings/discussions.
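Points A and B can be sketched in a few lines. This is an illustration only, assuming the default 8000 samples per second and that the symbol count scales linearly with the number of channels; the function names are hypothetical.

```python
from typing import Optional

SAMPLE_RATE_HZ = 8000  # ASSUMED default G.711 sampling rate


def expected_symbols(ptime_ms: Optional[int], channels: int = 1) -> Optional[int]:
    """Number of G.711 symbols a payload should decode to, if ptime is known."""
    if ptime_ms is None:
        return None  # Point B: no ptime signaled, so no a priori expectation
    return SAMPLE_RATE_HZ * ptime_ms // 1000 * channels


def payload_passes_ptime_check(decoded_count: int,
                               ptime_ms: Optional[int],
                               channels: int = 1) -> bool:
    """Point A: compare the decoded count against ptime when it was signaled."""
    expected = expected_symbols(ptime_ms, channels)
    return expected is None or decoded_count == expected
```

When ptime is absent the check trivially passes, which is exactly why the payload can still be decoded but cannot be cross-checked.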
However, some USERS of the G.711.0 payload format may wish to use the RTP
format itself but NOT use SDP! A good example is an "in-the-middle" compression
of a G.711 flow (into a G.711.0 flow) and a corresponding decompression of the
G.711.0 flow back into a G.711 flow. This is possible in many network
arrangements (e.g., enterprise to enterprise) where the compression and
decompression endpoints know the PT corresponding to G.711.0 use within their
administrative domain.
[Aside: At one time this RTP Payload format had both the payload definition
(this draft) and G.711.0-specific use cases within it. Previous WG discussion
supported the splitting out of the use-cases into a separate draft (a "G.711.0
use case" draft). I have such an expired draft, but we agreed to defer work on
it until after the RTP payload format was complete. Thus some elements of uses
outside of G.711.0 running in the endpoints would be described in the other
use-case draft.]
The SDP discussion is a little wordy, but this is a result of G.711.0 not being
a codec, but rather a variable length, frame-based lossless
compression/decompression. That is, G.711.0 is NOT a (sample-based or
frame-based) codec in the usual sense that RFC 3550/3551 anticipated, but does
require some "G.711 specific" information to be passed to it (e.g., complaw).
For the passage you quoted above, the FOLLOWING TWO SENTENCES in the draft
provide a forward reference in the document to when the "channels" and "ptime"
parameters are needed and referenced (Section 5.1); because we have had no need
prior to that point in the draft to discuss use of ANY particular session
negotiation protocol.
SDP is a dominant IETF protocol for media negotiation; but even RFC 3551
mentions H.245 and the fact that other mapping methods are possible (including
"no negotiation" methods). Indeed, the "in-the-middle" use case described in
this email (and at earlier IETF Payload meetings) may or may not have any a
priori negotiation of PT at all within an administrative domain (e.g., the
G.711.0 PT may be a network configured parameter specific to a company network).
Proposed Action: The discussion of ptime (and the channels parameter) in this
section is primarily for the purpose of a check. If it is any comfort, that
paragraph has had lots of input to it previously (so you responded to a
complicated issue). And since we have no need to describe "ptime issues" or
session negotiation issues prior to this point (Section 4.2.4) in the
document AND ptime isn't a required negotiation parameter AND we put a
forward reference to Section 5.1 for "ptime" when SDP is used, I hesitate to
mention such an optional parameter here in Section 3.
Proposed action: No Change (the forward references are enough).
\end {Reply to [C]}
[D] Backwards compatibility.
The problem here is that it's not clear that negotiation (e.g., via SDP) is
required. This sentence in Section 3.1 is a particular problem:
G.711.0, being both lossless and stateless, may also be employed as a
lossless compression mechanism anywhere between end systems which
have negotiated use of G.711.
That's definitely wrong. Use of G.711.0 when only G.711 has been negotiated
will fail to interoperate correctly.
A subsection of section 3 on negotiation and SDP usage would help here.
This major issue [D] is independent of the relationship between this draft and
the G.711.0 spec.
\begin {Reply to [D]}
The passage you quote is in Section 3 which is "General Information and Use of
ITU-T G.711.0 Codec" and is: 1) prior to ANY discussion of the use of G.711.0
in RTP (or even packet networks), and 2) prior to any discussion of media
negotiation when using RTP (e.g., SDP). Thus the context for this sentence is
at the codec bit stream (or packet payload) level of the ITU-T codec. It stands
on its own and is definitely correct.
When the compression of a G.711 payload to a G.711.0 payload occurs somewhere
on the end-to-end path and the corresponding decompression from a G.711.0
payload to a G.711 payload occurs prior to the receiving endpoint, the
receiving endpoint doesn't know the (lossless) compression occurred on the
PAYLOAD (the context in this section). As mentioned previously, this is
possible in many
arrangements (in RTP) where the compression and decompression endpoints know
the PT corresponding to G.711.0 use within their administrative domain (a
reserved or not-used-in-their-domain PT) and desire to do this.
That is the beauty of lossless compression - the receiving endpoint doesn't
know (or need to know) that payload compression occurred. To imply otherwise is
to dismiss lossless compression (e.g., CRTP, ECRTP, ROHC) that losslessly
compress and decompress arbitrary parts of packets (in the case of
CRTP/ECRTP/ROHC, the headers) in between the endpoints without the endpoints'
explicit knowledge of the compression.
Please note that this property isn't possible with lossy *CODECS*, as the
transcode will typically introduce some distortion which would be unknown to
the receiving endpoint but nevertheless present. This is one of the many
subtleties that people reading about G.711.0 have when considering it as if it
were a (lossy) codec - they ASSUME that G.711.0 is a TRANSCODE and not the
lossless, STATELESS compression of the MEDIA PAYLOAD that it is.
Again, we had working group agreement (I think in Quebec) that a use-case
document could follow this G.711.0 RTP payload format document to describe how
to do the mapping in RTP for these "compression-in-the-middle" cases. High
level summary is that you copy the G.711 RTP header verbatim into the G.711.0
RTP header except for the PT. I have a draft on the use case document which I
let expire until this RTP payload definition is finished.
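As a rough illustration of the "copy the header verbatim except for the PT" summary, the sketch below rewrites only the 7-bit PT field in the second octet of the fixed 12-octet RTP header (per the RFC 3550 header layout) and leaves the marker bit and every other field untouched. The function name and the choice of PT value are assumptions for illustration; the use-case draft would be the place for the actual procedure.

```python
def rewrite_payload_type(rtp_header: bytes, new_pt: int) -> bytes:
    """Return a copy of an RTP header with only the payload type replaced."""
    if len(rtp_header) < 12:
        raise ValueError("RTP fixed header is at least 12 octets")
    if not 0 <= new_pt <= 127:
        raise ValueError("PT is a 7-bit field")
    out = bytearray(rtp_header)
    # Second octet: marker bit (MSB) followed by the 7-bit payload type.
    out[1] = (out[1] & 0x80) | new_pt
    return bytes(out)
```

For example, a compressor in the middle might map PT 0 (PCMU) to whatever PT its administrative domain has reserved for G.711.0, and the decompressor would apply the reverse mapping.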
Proposed Action: No Change. We have more than enough words in the document to
describe all the attributes of G.711.0 (Section 3.2) in this section of the
document that discusses properties of the >>ITU-T specification<<.
\end {Reply to [D]}
-- Minor issues (3):
[E] Section 4.1:
The only significant difference is that the
payload type (PT) RTP header field will have a value corresponding to
the dynamic payload type assigned to the flow. This is in contrast
to most current uses of G.711 which typically use the static payload
assignment of PT = 0 (PCMU) or PT = 8 (PCMA) [RFC3551] even though
the negotiation and use of dynamic payload types is allowed for
G.711.
I would change "will have" to "MUST have" and add the following sentence:
The existing G.711 PT values of 0 and 8 MUST NOT be used for G.711.0
content.
I suspect that this is obvious to the authors, but it'll help a reader who's
not familiar with the importance of the difference between G.711 and G.711.0.
\begin {Reply to [E]}
Proposed Action: Happy to fix both (for the reasons given). However, please
read my reply to [F] below, I believe the rules actually allow PT = [0|8] in
a specific corner case (result is: MUST NOT->SHOULD NOT in your suggestion).
\end {Reply to [E]}
[F] Section 4.1:
PT - The assignment of an RTP payload type for the format defined
in this memo is outside the scope of this document. The RTP
profiles in use currently mandate binding the payload type
dynamically for this payload format.
Good start, but not sufficient - cite the "RTP profiles currently in use" and I
would expect those citations to be normative references.
Would that be just RFC 3551 and RFC 4585 (both are already normative
references), or are there more RTP profiles?
\begin {Reply to [F]}
I think that wording was suggested somewhere along the way, but I can't
remember who provided it. It is boilerplate on many RTP payload formats, but
others (such as recent RFC 7310) are as simple as " PT - A dynamic payload
type; MUST be used" (which appears to be incorrect use of the semicolon, but I
digress). In any event, major edits of the first paragraph of 4.1 were made to
include the possibility of G.711 not having PT = 0 or PT = 8 for exceptional
cases (so not even static payload types can be automatically assumed).
According to IANA
(http://www.iana.org/assignments/rtp-parameters/rtp-parameters.xhtml#rtp-parameters-2)
and RFC 3551, the FINAL set of static payload assignments is contained in
Tables 4 and 5 of RFC 3551.
And, according to RFC 3551, the PT assigned (for a new codec not having a
static type) chosen SHOULD first attempt to use a dynamic PT - but there are
exceptions cited (e.g., dynamic PT exhaustion). Even codecs that have a static
PT assigned MAY negotiate a different PT (e.g., a dynamic PT). And new codecs
(after exhaustion of dynamic and other types) MAY actually use a static PT not
presently in use (at least I recall someone stated so in a meeting). So it
appears there are a lot of exception cases that preclude knowing (with 100%
certainty) any particular PT mapping.
And, according to RFC 3551, dynamic payload types SHOULD NOT be used without a
well-defined mechanism to indicate the mapping - SDP or ITU-T H.323/H.245
negotiation or other pre-arrangement are cited (e.g., PT defined within a
certain scope or administrative domain) - and a well-defined RTP payload format
(this draft).
Thus, not much can be said about the assignment other than what was stated. I
could put (yet another) RFC 3551 reference in this paragraph but it would
provide no more guidance than already provided a few paragraphs earlier (which
references RFC 3551). At a minimum I think I should say that PT of 0 and 8
SHOULD NOT be used for G.711.0.
Re: "PTs currently in use". It is hard to differentiate the profiles "currently
IANA registered" and those "currently in use". That is, what is the definition
of "currently in use" when you don't have insight into the
registered-but-not-in-use profiles (e.g., historic codecs).
Proposed Action: I think we should both defer to the Payload WG chairs on
this - as they can be expected to know all the exceptions AND the present
state of verbiage that goes on "PT -" line of an IANA media registration
coming from the Payload WG. Ali and Roni: Please suggest alternate text if
you desire, I will accommodate; otherwise I will leave it as is.
\end {Reply to [F]}
[G] Framing errors
Section 4 generally assumes that the G.711.0 decoder gets handed frames
generated by the G.711.0 encoder and can't get disaligned. I'm not convinced
that this "just works" based on the text in the draft - major issue [B] is a
significant reason why, and explaining that should help.
Some discussion should be added on why the G.711.0 decoder can't get disaligned
wrt frame boundaries, or what the G.711.0 decoder will do
when it discovers that it wasn't handed a complete G.711.0 frame. For example,
this error case and how to deal with it are not covered by the algorithm in
Section 4.2.3.
\begin {Reply to [G]}
The actual buffer handling to/from G.711.0 encoding/decoding logic is pretty
straightforward so I really doubt that an encoder that has been exercised
sufficiently would pass the G.711.0 frame(s) to the RTP payload incorrectly
or the converse.
However, you are correct in that we should always specify what happens when
things don't work as expected. Thanks for the catch.
Consistent with an "error condition catch" Richard Barnes made in 4.2.4 - we do
have some information for when an encoder and/or decoder error resulted in an
unexpected number of G.711 decoded symbols.
Assuming ptime was signaled, we expect the number of G.711 decoded symbols to
equal what we expect from the ptime value at the receiver/decoder. If it
doesn't then "we SHOULD discard the packet".
[Aside: We discussed the SHOULD vs MUST on the decoder, the SHOULD won. This is
because a given system design might temporarily send a packet inconsistent with
the ptime previously signaled but which is structurally correct (has the
correct decoded G.711). Such a system might not desire to discard such a packet
(as it might appear otherwise correct in the number of samples decoded).
However, lacking such a design the usual operational choice is to discard the
packet. Thus a SHOULD.]
For the encoder, the length of the G.711.0 RTP payload - excluding padding -
should never be greater than the number of input G.711 symbols plus the number
of G.711.0 frames (as a given G.711.0 frame can be no greater than one octet
more than the number of source symbols). If the number of frames is known to
the RTP layer (it may not be) and this constraint is not met, the source packet
MUST be discarded.
[Aside: We did NOT discuss the SHOULD vs MUST on the encoder. In my opinion, the
MUST is more appropriate - as if the constraint is violated, you KNOW something
is wrong.]
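The encoder-side sanity check above amounts to a single comparison. A minimal sketch, assuming the RTP layer knows the frame count and that the payload length is measured excluding any 0x00 padding (the function name is hypothetical):

```python
def payload_length_is_plausible(payload_octets: int,
                                input_symbols: int,
                                frame_count: int) -> bool:
    """Check the bound: each frame is at most one octet longer than its input.

    payload_octets must EXCLUDE padding octets.
    """
    return payload_octets <= input_symbols + frame_count
```

For a 20 ms payload (160 input symbols) sent as a single frame, anything above 161 octets would indicate an encoder error; sent as four 5 ms frames, the limit would be 164 octets.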
Proposed Action: Add two sentences similar to the above to the end of Section
4.2.2+ (proposed earlier new section on Encoding Process) and Section 4.2.3
(Decoding Process).
\end {Reply to [G]}
-- Nits/editorial comments:
Section 3.2:
A6 Bounded expansion: Since attribute A2 above requires G.711.0 to
be lossless for any payload, by definition there exists at
least one potential G.711 payload which must be
"uncompressible".
The "by definition" statement assumes that every possible bit string is a valid
G.711 input. If that is correct, it should be explicitly stated.
\begin{nit}
Yes, because Attribute A2 referenced within this sentence quoted says as much.
Every value of a G.711 symbol (2^8) corresponds to a discrete value. There is
no restriction from a sample-to-sample (octet to next octet) basis assumed in
the G.711 encoding (no "illegal transitions"). Lastly, some "DS0 channels"
assume that all the bits can be used for arbitrary digital data (so-called ISDN
64kbps B-channel). Thus it is widely known, by definition, that if
something is random and can take ANY value of ANY possible concatenation of
octets, there is no redundancy to be exploited in the concatenation for the
purposes of deterministic compression for all possible inputs - there must
exist at least one combination payload that is not compressible.
This is an assertion from the G.711.0 ITU-T document that anyone who cares to
verify can go to the ITU-T, look up G.711 and instantly know that all the
values are "assigned" and there are no illegal transitions specified; thus
there is no redundancy to be exploited. I hesitate to insult my readers by
giving them any more detail than Attribute A2 says.
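For anyone who does want the counting argument spelled out, here is a short sketch. It assumes (as argued above) that every one of the 256^n octet strings of length n is a valid G.711 input and that the compressed output is octet-aligned; the symbol n is introduced here only for illustration.

```latex
% Number of distinct G.711 inputs of exactly n octets:
N_{\mathrm{in}}(n) = 256^{n}
% Number of distinct octet strings strictly shorter than n octets:
N_{\mathrm{short}}(n) = \sum_{k=0}^{n-1} 256^{k} = \frac{256^{n}-1}{255}
% Since
N_{\mathrm{short}}(n) < N_{\mathrm{in}}(n) \quad \text{for all } n \ge 1,
% no lossless (one-to-one) mapping can send every n-octet input to a
% strictly shorter output; at least one input must be "uncompressible".
```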
Proposed Change: None needed. However if you really feel strongly on this, I
could agree to something like the following ... or anything of your choosing
that reads better and is accurate. Let me know what you want.
A6 Bounded expansion: Since attribute A2 above requires G.711.0 to
be lossless for any payload (which could consist of any concatenation
of octets each octet spanning the entire space of 2^8 values), by
definition
there exists at least one potential G.711 payload which must be
"uncompressible".
\end {nit}
A8 Low Complexity: Less than 1.0 WMOPS average and low memory
footprint (~5k octets RAM, ~5.7k octets ROM and ~3.6 basic
operations) [ICASSP] [G.711.0].
Expand WMOPS on first use, and check for other acronyms that need to be
expanded on first use.
\begin{nit}
Note: The references define what a WMOPS is.
Recommended Action: Since this is the only use of WMOPS, I will expand it
there (Weighted Million Operations Per Second) and skip the abbreviation
entirely.
RAM and ROM are the only other non-expansions. I trust that these don't qualify
as "needed" as not even the ITU-T document expands these.
Recommended Action: No change to RAM and ROM. It is a reasonable expectation
that anyone reading this document will know those two based on context.
\end {nit}
Section 3.3:
Since the G.711.0 output frame is "self-describing", a G.711.0
decoder (process "B") can losslessly reproduce the original G.711
input frame with only the knowledge of which companding law was used
(A-law or mu-law).
"companding law"? The term "compression law" is used elsewhere in this draft,
including two paragraphs earlier in this section - I suggest using "compression
law" consistently.
\begin{nit}
Good catch.
The law that both forms of G.711 use (mu or A) is that of an input-to-output
compander (http://en.wikipedia.org/wiki/Companding), where the output format
is discretized.
I will change the one use of "compression law" to "companding law" in its
singular use in Section 3.3 (due to G.711 being a companding, sample-based
codec).
\end {nit}
Section 6:
We note that something must be stored for any G.711.0 frames that not
received at the receiving endpoint, no matter what the cause.
"that not" -> "that are not"
\begin{nit}
Thanks. Will do.
\end {nit}
Section 6.2:
An entire frame of value 0++ or 0-- is expected to be
extraordinarily rare when the frame was in fact generated by a
natural signal (on the order of one in 2^{ptime in samples, minus
one}), as analog inputs such as speech and music are zero-mean and
are typically acoustically coupled to digital sampling systems.
This doesn't explain where the 2^{ptime in samples, minus one} order of
magnitude estimation came from. What assumption(s) is(are) being made about
randomness and distribution thereof in the analog input?
It might be simpler to delete the parenthesized text.
\begin{nit}
Agreed. Consider the parenthetical deleted.
\end {nit}
Section 11: Congestion Control
This section is mis-named, as it basically (correctly) says that there is
nothing useful that can be done in G.711.0 compression to respond to
congestion. I would retitle this to "Congestion Considerations".
\begin{nit}
I would, but the requirements for new RTP payload formats say that there MUST
be a section named "Congestion Control" in all newly approved RTP Payload
formats!
You are, of course, correct - as the text in this section basically says there
is no explicit way to regulate the bit-rate for the purposes of congestion
control.
\end {nit}
Are there opportunities to respond to congestion elsewhere, e.g.
dynamically change the sampling rate? If so, a sentence mentioning them would
be good to add.
\begin{nit}
I know of no use of G.711 that changes the sampling frequency from the default
- although that is allowed in the SDP (as G.711 is a sample-based codec). The
8000 samples per second is hard-coded in many voice implementations.
Since the whole purpose of G.711.0 is to send G.711 losslessly with lower
bandwidth, the use of G.711.0 could be triggered by G.711 negotiated sessions
looking for a lower bandwidth solution. Although we could mention this
(obvious) fact, the guidelines for this section instruct me to discuss things
that can be done with the "codec" this payload format describes for the
purposes of congestion control. This is yet another artifact that the new RTP
guidelines did not anticipate the use of a lossless and stateless compression
technique being defined for RTP. We broke a lot of new ground here, thanks for
wading through it!
Proposed Action: None. I would not have this section in the document except
that the new rules for RTP Payload definitions mandate such a section exist.
\end {nit}
idnits 2.13.01 didn't find anything to complain about ;-).
--- Selected RFC 5706 Appendix A Q&A for OPS-Dir review ---
Most of these questions are N/A as this draft specifies a payload format for
RTP, so most of the operations and management concerns are wrt RTP and SDP.
A.1.3. Has the migration path been discussed?
No, see major issue [D] above.
A.1.4 Have the Requirements on other protocols and functional
components been discussed?
Only in part - major issues [C] and [D] call out shortcomings in the discussion
of SDP interactions.
A.1.8 Are there fault or threshold conditions that should be reported?
Yes, the likelihood and consequences of framing problems at the G.711.0 decoder
(decoder is handed octet strings that are not G.711.0 frames generated by the
encoder) should be discussed. Major issue [B] needs to be resolved first, and
then see minor issue [G].
A.2. Management Considerations
I would expect that the media type registration (Section 5.1 of this draft)
results in this new G.711.0 media type being usable in any relevant management
model and/or framework that has some notion of media type.
A.3 Documentation
By itself, this compressed payload format does not look like a likely source of
significant operational impacts on the Internet.
The shepherd's writeup indicates that an implementation exists.
Thanks,
--David
----------------------------------------------------
David L. Black, Distinguished Engineer
EMC Corporation, 176 South St., Hopkinton, MA 01748
+1 (508) 293-7953 FAX: +1 (508) 293-7786
david(_dot_)black(_at_)emc(_dot_)com Mobile: +1 (978) 394-7754
----------------------------------------------------