ietf-822

Audio type

1991-10-22 01:01:21
Although I am happy with the new draft (I already have an implementation
in alpha), I find that the audio type, as specified, is insufficient to
achieve interoperability in the real world.

One might argue that it is too early to have "standardized" audio.  I
disagree.  We need some level of standardization for basic capabilities.
In the future, we might agree on a better format for standardized audio,
but we need something in XXXX now.  If we don't have something real in
XXXX now, then each implementor is going to go off and do their own
favorite format.  This will not be a good thing.

So, after doing a modest survey of things for the last month, I propose
the following changes be made to the latest draft.  The Audio/Basic
subtype is rich enough to deal with most kinds of sounds being
captured/exchanged today and is straightforward to implement.  The fact
that I am leveraging off an existing format is merely gravy.

/mtr

                       -- Editing Instructions --


Replace section 7.7 with the text below:
///////
7.7     The Audio Content-Type

A Content-type of "audio" indicates that the body or body part contains
audio data.  Although there is not yet a consensus on an "ideal" audio
format for use with computers, there is a pressing need for a format
capable of providing interoperable behavior.

The initial subtype of "basic" is specified to meet this requirement.

The BNF for the audio type is as follows:

audio-content-type := "audio" "/" audio-subtype
                        *[ ";" attribute "=" value ]

audio-subtype := "basic" / x-token

7.7.1   The Audio/Basic subtype

No parameters are used with this subtype.

Audio data is encoded in three parts: a header, containing fields that
describe the audio encoding format; a variable-length information field,
in which, for instance, ASCII annotation may be stored; and the actual
encoded audio.  The header and data fields are written using big-endian
ordering.

The header part consists of six 32-bit quantities, in this order:

longword        field           description
--------        -----           -----------
    0           magic number    the value 0x2e736e64 (ASCII ".snd")

    1           data offset     the offset, in octets, to the data part.
                                The minimum valid number is 24 (decimal).

    2           data size       the size, in octets, of the data part.
                                If unknown, the value 0xffffffff should
                                be used.

    3           encoding        the data encoding format:

                                    value       format
                                      1         8-bit ISDN u-law
                                      2         8-bit linear PCM [REF-PCM]
                                      3         16-bit linear PCM
                                      4         24-bit linear PCM
                                      5         32-bit linear PCM
                                      6         32-bit IEEE floating point
                                      7         64-bit IEEE floating point
                                     23         8-bit ISDN u-law compressed
                                                using the CCITT G.721 ADPCM
                                                voice data encoding scheme.

    4           sample rate     the number of samples/second (e.g., 8000)

    5           channels        the number of interleaved channels (e.g., 1)


The information part consists of zero or more octets and starts 24 octets
after the beginning of the header part.  The length of the information
part is calculated by subtracting 24 (decimal) from the data offset
field in the header part.

In the interests of interoperability, senders are encouraged to use,
whenever possible, the 8-bit u-law encoding format.  Further, receivers
are encouraged to support, whenever possible, all of the encoding formats.
See paragraph 6 of Appendix A.

    NOTE: This format is a subset of the one described in the NeXT
    Publication "NeXTStep Programmers Guide", in the chapter on "Basic
    Sound Concepts", and also in the SMI manual entry for audio_intro(3).
///////
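
For illustration only (this is not part of the proposed text), the
header layout and the information-part arithmetic in 7.7.1 could be
sketched in Python roughly as follows.  The names parse_basic_audio and
BasicAudio are hypothetical, and reading to the end of the body when the
data size is 0xffffffff is an assumption; the proposed text does not say
how a receiver should handle that case.

```python
import struct
from typing import NamedTuple

MAGIC = 0x2E736E64         # ASCII ".snd"
HEADER_SIZE = 24           # six 32-bit big-endian quantities
UNKNOWN_SIZE = 0xFFFFFFFF  # "data size unknown" marker


class BasicAudio(NamedTuple):
    """Hypothetical container for a parsed Audio/Basic body."""
    encoding: int
    sample_rate: int
    channels: int
    info: bytes   # information part (e.g. ASCII annotation)
    data: bytes   # encoded audio


def parse_basic_audio(body: bytes) -> BasicAudio:
    """Split an Audio/Basic body into header fields, info part and data part."""
    if len(body) < HEADER_SIZE:
        raise ValueError("body shorter than the 24-octet header")

    # ">6I" = big-endian, six unsigned 32-bit integers.
    magic, offset, size, encoding, rate, channels = struct.unpack(
        ">6I", body[:HEADER_SIZE]
    )
    if magic != MAGIC:
        raise ValueError("magic number is not 0x2e736e64")
    if offset < HEADER_SIZE or offset > len(body):
        raise ValueError("data offset out of range")

    # The information part is offset - 24 octets long.
    info = body[HEADER_SIZE:offset]

    # Assumption: an unknown size means "everything up to the end of the body".
    if size == UNKNOWN_SIZE:
        data = body[offset:]
    else:
        data = body[offset:offset + size]

    return BasicAudio(encoding, rate, channels, info, data)
```

A receiver would then dispatch on the encoding field, for example
treating the value 1 as 8-bit ISDN u-law.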

Add to Appendix A -- Minimal Conformance with This Memo
///////
6. An implementation need not support the Audio content-type value in
order to be conformant to this memo.  However, if conformance to the
Audio content-type value is claimed, then the implementation must be
able to recognize the Audio/Basic subtype, and to render audio encoded
in the 8-bit ISDN u-law format.  Support for other audio encodings is
optional. 
///////
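
Again for illustration only: the one encoding a conforming receiver
must render, 8-bit u-law, can be expanded to 16-bit linear samples with
the usual G.711 bit manipulation.  This is a minimal sketch, assuming
the common bias-of-132 expansion; the names ulaw_to_linear and
decode_ulaw are hypothetical and do not come from the draft.

```python
BIAS = 0x84  # 132, the bias the u-law encoder adds before compressing


def ulaw_to_linear(octet: int) -> int:
    """Expand one 8-bit u-law octet to a signed linear sample."""
    u = ~octet & 0xFF              # u-law octets are stored complemented
    mantissa = u & 0x0F            # low 4 bits
    exponent = (u & 0x70) >> 4     # 3-bit segment number
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if u & 0x80 else magnitude


def decode_ulaw(data: bytes) -> list:
    """Expand a whole data part encoded with format value 1."""
    return [ulaw_to_linear(b) for b in data]
```

The resulting samples fit in 16 bits and, for a single channel, can be
handed to the audio device at the sample rate given in the header.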
