
Re: audio, checksums, and trojan horses

1991-11-13 08:33:02
A. 1:  Audio formats - I'd prefer the body part to be self-identifying,
    otherwise there's a question of the proper storage format.

I agree here; the data should be self-describing, especially when it's
likely to be saved to a file and treated as an opaque datatype by
some UAs that are not directly sound-capable.

I find this nearly incomprehensible.  In general, if all our formats
were self-identifying, we wouldn't need a content-type, right?  You'd
just look at the data and say, "Oh, this must be
[richtext/postscript/u-law/whatever]."  But that's not the model we're
using, because in the most general case it can't be done.

The real question, to my mind, is this:  what are we mailing around for
audio?  Are we mailing around audio data, or audio files?  I would
contend that audio data is a much more sensible thing to be mailing
around in general, AND it has the virtue of being much more
standardized.  U-law, for example, has been a stable and
rigorously-defined format since before I was born.  

The only argument I can see for "self-identifying formats" is the
assumption that audio data is going to be written to files which will be
passed on to applications that expect audio files.  This may be what
many people intend to do in certain applications, but it is an extremely
oversimplified assumption.  Try for a moment to look at this from a
telephony perspective, for example.  North America is covered, end to
end, with a network for which u-law is the native data format.  If a
phone company ever wanted to offer a multimedia mail service, it would
start with a whole bunch of machines that are set up to handle, in
essence, a pipeline of u-law format audio data.  Not audio files, with a
short header followed by u-law, but raw u-law.  It would be perfectly
natural to interchange data between multimedia mail and the existing
voice network.  Such a service would be served very naturally by u-law
data.  Of course, it could set up a little "gateway" program that read
audio file headers and discarded them as appropriate, but this would
seem very much a wart from the telephony perspective.
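
To make the "gateway" idea concrete, here is a minimal sketch of the
conversion in both directions.  It assumes the Sun/NeXT ".au" layout
(six big-endian 32-bit header fields ahead of the samples) purely as
one example of an audio file header; the function names are
hypothetical and not part of any proposal in this thread.

```python
import struct

AU_MAGIC = 0x2E736E64      # the four bytes ".snd"
AU_ENCODING_ULAW = 1       # .au encoding code for 8-bit u-law
AU_MIN_HEADER = 24         # six 32-bit fields


def strip_au_header(blob: bytes) -> bytes:
    """Return the raw u-law samples from an in-memory .au file image."""
    if len(blob) < AU_MIN_HEADER:
        raise ValueError("too short to hold an .au header")
    magic, data_offset, _size, encoding, _rate, _channels = struct.unpack(
        ">6I", blob[:AU_MIN_HEADER]
    )
    if magic != AU_MAGIC:
        raise ValueError("not an .au file")
    if encoding != AU_ENCODING_ULAW:
        raise ValueError("payload is not 8-bit u-law")
    if data_offset < AU_MIN_HEADER:
        raise ValueError("bad data offset")
    # Everything from data_offset onward is the raw audio stream.
    return blob[data_offset:]


def add_au_header(samples: bytes, rate: int = 8000, channels: int = 1) -> bytes:
    """Wrap raw u-law samples in a minimal .au header (the reverse trip)."""
    header = struct.pack(
        ">6I", AU_MAGIC, AU_MIN_HEADER, len(samples),
        AU_ENCODING_ULAW, rate, channels,
    )
    return header + samples
```

Either direction is a few lines, which is the point: the conversion is
cheap for both camps, so ease of conversion does not settle which form
should go on the wire.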

So, we're in a classic situation where there are two applications, each
of which can easily convert from one format to another, but where each
sees a different one of the two formats as being "more natural".  For
telephony, raw audio streams are natural.  For some computer
applications, audio files are natural.  For other computer applications,
however, raw audio streams are also natural -- witness the ability to
just pipe u-law data to or from /dev/audio on Suns.  How do we choose?

The bottom line, for me, is this:  THERE IS NO TECHNICAL REASON TO
PREFER ONE FORMAT TO ANOTHER.  I therefore prefer to go with the most
stable standards.  U-law is MUCH better defined and much more
widely-used than any audio header format.  Given that all other things
appear equal, and that both formats are readily translated into the
other, I see no better criterion on which to base the choice.  Make the
audio body parts raw audio, with information on the content-type line
declaring the representation format and any other relevant information.  
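
As a rough illustration of what "information on the content-type line"
could look like, the sketch below labels a raw u-law body part using
Python's standard email package.  The subtype "x-example" and the
parameter names "encoding" and "rate" are placeholders I made up for
the example; nothing here is a proposed registration.

```python
import base64
from email.message import Message


def label_raw_audio(samples: bytes, rate: int = 8000) -> Message:
    """Build a body part that carries raw audio, described only by headers."""
    part = Message()
    # Hypothetical type and parameter names, for illustration only.
    part.add_header(
        "Content-Type", "audio/x-example", encoding="u-law", rate=str(rate)
    )
    part.add_header("Content-Transfer-Encoding", "base64")
    # The body is the bare sample stream, with no file header in front.
    part.set_payload(base64.encodebytes(samples).decode("ascii"))
    return part


def read_raw_audio(part: Message) -> tuple[str, int, bytes]:
    """Recover the representation, sample rate, and samples from a part."""
    encoding = part.get_param("encoding")
    rate = int(part.get_param("rate"))
    samples = part.get_payload(decode=True)
    return encoding, rate, samples
```

A receiving UA that cannot play sound still learns everything it needs
from the header, and one that wants a file can prepend whatever header
its local tools expect.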

Once again, this is not a show-stopper for me, but I think the view that
favors a file format is short-sighted and myopically focused on file
systems rather than the wider world of data interchange.  -- Nathaniel
