This definition for Audio looks good, 'cept for a couple of things.
For some reason, when reading it, I started wondering how the data
is laid out in the body part. Since the text describes it as "8 bit",
it is presumably a time sequence with one byte (octet) per sample.
How does it get laid out if channels != 1? Does the word "interleaved"
mean that if channels != 1 then you have stereo? Or rather, does
"interleaved" mean that all the channels are to be played at the same
moment?
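
To make the layout question concrete, here is a minimal sketch of one
possible reading of "interleaved": the body is a sequence of frames, and
each frame holds one 8-bit sample per channel, all meant to be played at
the same moment. This reading is an assumption for illustration only; the
draft text does not confirm it, and the function names are hypothetical.

```python
def interleave(channels):
    """Merge per-channel byte strings into one frame-by-frame byte string.

    Assumed layout (not confirmed by the draft): for N channels the body is
    c0[0], c1[0], ..., cN-1[0], c0[1], c1[1], ... with one octet per sample.
    """
    if len({len(c) for c in channels}) != 1:
        raise ValueError("all channels must have the same number of samples")
    # zip(*channels) yields one frame (one sample per channel) at a time.
    return bytes(sample for frame in zip(*channels) for sample in frame)


def deinterleave(body, num_channels):
    """Split an interleaved body back into one byte string per channel."""
    if num_channels < 1 or len(body) % num_channels != 0:
        raise ValueError("body length must be a multiple of the channel count")
    # Channel i owns every num_channels-th octet, starting at offset i.
    return [body[i::num_channels] for i in range(num_channels)]


left = bytes([1, 2, 3])
right = bytes([101, 102, 103])
body = interleave([left, right])
print(list(body))  # [1, 101, 2, 102, 3, 103]
```

Under this reading, channels == 2 would give the usual stereo layout, and
channels == 1 degenerates to a plain time sequence of octets. The
alternative reading (all of channel 0, then all of channel 1) would put
the same octets in a different order, which is why the text probably needs
to say which one is meant.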
Also, "samples/second" seems like an almost needlessly low-level detail,
one that a lot of "end users" will be annoyed to have to provide. I do
understand where samples/second comes from and why it matters: it is
very important when you're operating at that low level. But since the
stereotypical end user doesn't operate at that level, having to supply
such details will tend to annoy them.
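
As a small illustration of why the rate cannot simply be dropped, the
sketch below computes how long a body would play for. The same octets
last twice as long at half the rate, so a player that guesses wrong
plays the sound at the wrong speed. The function name and the numbers
(8000 and 16000 samples/second, a 16000-octet body) are made up for the
example and are not taken from the draft.

```python
def duration_seconds(body_length, channels, samples_per_second,
                     bytes_per_sample=1):
    """Return the playing time of an audio body, in seconds.

    Assumes fixed-size samples (one octet each by default) and that every
    frame carries exactly one sample per channel.
    """
    frame_size = channels * bytes_per_sample
    if frame_size < 1 or samples_per_second <= 0:
        raise ValueError("channels, sample size, and rate must be positive")
    if body_length % frame_size != 0:
        raise ValueError("body does not contain a whole number of frames")
    frames = body_length // frame_size
    return frames / samples_per_second


# The same 16000 octets, interpreted three different ways.
print(duration_seconds(16000, channels=1, samples_per_second=8000))   # 2.0
print(duration_seconds(16000, channels=1, samples_per_second=16000))  # 1.0
print(duration_seconds(16000, channels=2, samples_per_second=8000))   # 1.0
```

One way to keep end users out of this would be for the composing software
to fill in the value itself, which is a design choice for the
implementation rather than something the definition has to change.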
I am not asking for a change; I am only pointing out things which
may need a bit more explanation in the RFC.
David