libst - Sound Tools : sound sample file and effects libraries.
cc file.c -o file libst.a
Sound Tools is a library of sound sample file format readers/writers and sound effects
Sound Tools includes skeleton C files to assist you in writing new formats and effects.
The full skeleton driver, skel.c, helps you write drivers for a new format which has data
structures. The simple skeleton drivers help you write a new driver for raw (headerless)
formats, or for formats which just have a simple header followed by raw data.
Most sound sample formats are fairly simple: they are just a string of bytes or words and
are presumed to be sampled at a known data rate. Most of them have a short data structure
at the beginning of the file.
The Sound Tools formats and effects operate on an internal buffer format of signed 32-bit
longs. The data processing routines are called with buffers of these samples, and buffer
sizes which refer to the number of samples processed, not the number of bytes. File read-
ers translate the input samples to signed longs and return the number of longs read. For
example, data in linear signed byte format is left-shifted 24 bits.
This does cause problems in processing the data. For example:
*obuf++ = (*ibuf++ + *ibuf++)/2;
would not mix down left and right channels into one monophonic channel, because the
resulting samples would overflow 32 bits. Instead, the ``avg'' effects must use:
*obuf++ = *ibuf++/2 + *ibuf++/2;
Stereo data is stored with the left and right speaker data in successive samples. Quadra-
phonic data is stored in this order: left front, right front, left rear, right rear.
A format is responsible for translating between sound sample files and an internal buffer.
The internal buffer is store in signed longs with a fixed sampling rate. The format oper-
ates from two data structures: a format structure, and a private structure.
The format structure contains a list of control parameters for the sample: sampling rate,
data size (bytes, words, floats, etc.), encoding (unsigned, signed, logarithmic), number
of sound channels. It also contains other state information: whether the sample file
needs to be byte-swapped, whether fseek() will work, its suffix, its file stream pointer,
its format pointer, and the private structure for the format .
The private area is just a preallocated data array for the format to use however it
wishes. It should have a defined data structure and cast the array to that structure.
See voc.c for the use of a private data area. Voc.c has to track the number of samples it
writes and when finishing, seek back to the beginning of the file and write it out. The
private area is not very large. The ``echo'' effect has to malloc() a much larger area
for its delay line buffers.
A format has 6 routines:
startread Set up the format parameters, or read in a data header, or do what
needs to be done.
read Given a buffer and a length: read up to that many samples, transform
them into signed long integers, and copy them into the buffer. Return
the number of samples actually read.
stopread Do what needs to be done.
startwrite Set up the format parameters, or write out a data header, or do what
needs to be done.
write Given a buffer and a length: copy that many samples out of the buffer,
convert them from signed longs to the appropriate data, and write them
to the file. If it can't write out all the samples, fail.
stopwrite Fix up any file header, or do what needs to be done.
An effects loop has one input and one output stream. It has 5 routines.
getopts is called with a character string argument list for the effect.
start is called with the signal parameters for the input and output streams.
flow is called with input and output data buffers, and (by reference) the
input and output data buffer sizes. It processes the input buffer
into the output buffer, and sets the size variables to the numbers of
samples actually processed. It is under no obligation to read from
the input buffer or write to the output buffer during the same call.
If the call returns ST_EOF then this should be used as an indication
that this effect will no longer read any data and can be used to
switch to drain mode sooner.
drain is called after there are no more input data samples. If the effect
wishes to generate more data samples it copies the generated data into
a given buffer and returns the number of samples generated. If it
fills the buffer, it will be called again, etc. The echo effect uses
this to fade away.
stop is called when there are no more input samples to process. stop may
generate output samples on its own. See echo.c for how to do this,
and see that what it does is absolutely bogus.
Theoretically, formats can be used to manipulate several files inside one program. Multi-
sample files, for example the download for a sampling keyboard, can be handled cleanly
with this feature.
Many computers don't supply arithmetic shifting, so do multiplies and divides instead of
<< and >>. The compiler will do the right thing if the CPU supplies arithmetic shifting.
Do all arithmetic conversions one stage at a time. I've had too many problems with "obvi-
ously clean" combinations.
In general, don't worry about "efficiency". The sox.c base translator is disk-bound on
any machine (other than a 8088 PC with an SMD disk controller). Just comment your code
and make sure it's clean and simple. You'll find that DSP code is extremely painful to
write as it is.
The HCOM format is not re-entrant; it can only be used once in a program.
The program/library interface is pretty weak. There's too much ad-hoc information which a
program is supposed to gather up. Sound Tools wants to be an object-oriented dataflow
October 15 1996 ST(3)