Sound programming

kid joe · Jul 7, 2008

Hello

I've got interested in learning some basic sound programming bits in
C... mainly I want to know how to go about accessing the sound devices -
reading from them mainly - in windows and linux... I'd kind of like to be
able to do it without a whole bunch of extra garbage added in there - by
this I mean that I know in windows there are a million sound programming
packages that make the whole process "easier" - there are also a few in
linux but I think the raw stuff I'm interested in understanding is a bit
more simple in linux b/c of the way devices work in it.

So if anyone can point me at a place to start - maybe some really raw
source code for linux and windows - I would really appreciate it.

Thanks!

Walter Roberson · Jul 7, 2008

kid joe said:
I've got interested in learning some basic sound programming bits in
C... mainly I want to know how to go about accessing the sound devices -
reading from them mainly - in windows and linux...

There is no mechanism provided by the standard C language to handle
sound devices (or graphics, or mice, or printers, or any other kind of
device.) All sound handling is system and operating-system dependent.

So if anyone can point me at a place to start - maybe some really raw
source code for linux and windows - I would really appreciate it.

You should check with newsgroups that are specific to your
operating systems, and also with information sources specific
to the brand and model of sound hardware. "raw" sound processing
might require that you program at the device driver level in order
to have the privilege to access the necessary hardware registers.

cr88192 · Jul 8, 2008

Malcolm McLean said:
It is rather more involved than you think.

The problem is that audio devices need to be fed a continuous stream of
raw bits, whilst generally you want the processor to spend most of its
time dealing with the rest of the program, like moving space invaders
about the screen.

yes, and sadly, getting this right without blocking the app (or causing
annoying auditory artifacts) is a little harder than it may seem (or at
least for single-threaded apps).

So unless you want to do difficult multi-tasking programming at the device
level, you need a certain layer of abstraction. The question then becomes
"which one?". For space invaders you can probably get away with an
interface that says "play sound". It puts a bleep or an explosion into the
audio queue, returns control to you almost immediately, and a millisecond
or so later you'll hear the sound on the speakers.
For a more advanced use of audio, this isn't sufficient. You'll want to be
able to cancel jobs, to submit long sequences instead of tiny clips, to
change the volume, to stream sound in from a backing store, maybe to
synthesise samples on the fly.

So it becomes difficult to know what level of abstraction to use. Too low
and you're doing messy parallel programming, too high and you're calling
Midi instruments and the like when you just want to say "play this".

a generally workable approach I had found was to implement a mixer, which
created temporary "mix streams". these streams basically just provided a
means for the mixer to demand a certain number of samples. the streams
themselves carried various info (current origin, spatial velocity, ...)
allowing for effects like doppler shifting (as well as just the "things are
quieter when far away" effect).

these were typically structs making use of callbacks.

playing a sound typically involved creating a stream with the right
properties (handled automatically by various "play a sound" functions), the
stream typically automatically destroying itself when done.

the interface also worked fairly well with playing audio from videos, and
from songs in the form of mp3s (typically, sound effects are just buffered
into ram, but songs are better streamed since they can take a decent-sized
chunk of memory to store).

anything else can be played as well, so long as the right callbacks are
provided.

note: callbacks may be passed a chunk of "user data" (as well as a stream
id), which is another useful trick here, and is put in the struct when
creating the stream. this is usually a pointer holding whatever it is the
stream-specific functions feel is important (GTK does similar...).
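
A minimal sketch of the mixer/stream idea described above, in C. Every name here (mix_stream, mix_read_fn, ram_clip, mixer_render) is hypothetical and not taken from the poster's code; it only illustrates a struct with a callback plus a "user data" pointer, and a mixer that demands samples from each stream.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mix_stream mix_stream;

/* Callback: write up to count samples into out and return how many were
   written. Returning fewer than count means the stream is finished. */
typedef size_t (*mix_read_fn)(mix_stream *s, float *out, size_t count);

struct mix_stream {
    int id;             /* stream id, visible to the callback through s */
    float gain;         /* per-stream volume, e.g. derived from distance */
    mix_read_fn read;   /* where the samples come from */
    void *user_data;    /* opaque pointer owned by the callback */
    int done;           /* set by the mixer once read() runs dry */
};

/* Example callback: play a clip that is already buffered in RAM. */
typedef struct { const float *data; size_t len, pos; } ram_clip;

static size_t ram_clip_read(mix_stream *s, float *out, size_t count)
{
    ram_clip *clip = s->user_data;
    size_t n = 0;
    while (n < count && clip->pos < clip->len)
        out[n++] = clip->data[clip->pos++];
    return n;
}

/* Mixer: demand count samples from every live stream and sum them. */
static void mixer_render(mix_stream *streams, size_t nstreams,
                         float *out, size_t count)
{
    float tmp[256];
    for (size_t i = 0; i < count; i++)
        out[i] = 0.0f;
    for (size_t k = 0; k < nstreams; k++) {
        mix_stream *s = &streams[k];
        size_t filled = 0;
        assert(s->read != NULL);
        while (!s->done && filled < count) {
            size_t want = count - filled;
            if (want > 256) want = 256;
            size_t got = s->read(s, tmp, want);
            for (size_t i = 0; i < got; i++)
                out[filled + i] += s->gain * tmp[i];
            filled += got;
            if (got < want)
                s->done = 1;    /* ran dry: the stream retires itself */
        }
    }
}
```

A streamed song or a video's audio track would presumably differ only in the callback and in what user_data points at (a decoder state instead of a RAM buffer).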

also note:
as an interesting effect of having doppler shifting and other effects, not
all of the streams may be strictly temporally in-sync, since moving away
from a stream causes it to be played slower and moving towards it makes it
play faster (I make sounds just "cut out" near mach-1, since otherwise there
are annoying zero-division issues).
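
A rough sketch of the rate change and the Mach-1 cutoff, assuming the textbook moving-source Doppler formula (the post does not say which formula was actually used). The function name and the 5% safety margin are my own assumptions.

```c
#include <assert.h>

/* closing_speed > 0: source approaches the listener; < 0: it recedes.
   Returns the playback-rate multiplier, or 0.0 meaning "cut the sound". */
static double doppler_rate(double closing_speed, double speed_of_sound)
{
    assert(speed_of_sound > 0.0);
    double denom = speed_of_sound - closing_speed;
    /* Close to Mach 1 the denominator heads for zero, so go silent
       instead of dividing. The 5% margin is an arbitrary choice. */
    if (denom < 0.05 * speed_of_sound)
        return 0.0;
    return speed_of_sound / denom;
}
```

With roughly 343 m/s for the speed of sound in air, a source receding at that speed plays at half rate, and one approaching at 340 m/s is simply muted.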

a related trick was to add a delay calculated from the distance of the sound
from the camera, such that, say, a distant explosion will take a little
while for the sound to hit (first we see the explosion, and then the sound
hits a short time later).
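
The distance delay can be sketched as a one-line conversion from metres to samples. The function name is hypothetical, and 343 m/s is an assumed speed of sound (air at about 20 C), not a figure from the post.

```c
#include <assert.h>
#include <stddef.h>

/* How many samples to hold a sound back so that it arrives "late",
   as if it had travelled distance_m through air. */
static size_t sound_delay_samples(double distance_m, unsigned sample_rate)
{
    const double speed_of_sound = 343.0;    /* m/s, assumed */
    assert(sample_rate > 0);
    if (distance_m <= 0.0)
        return 0;
    return (size_t)(distance_m / speed_of_sound * sample_rate + 0.5);
}
```

At 44100 Hz, an explosion 343 m away would start one second (44100 samples) after it is seen.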

note that as an effect of the geometry: when one is far away from an audio
source they are out of sync with it (temporally and possibly also in terms
of rate), but as they move closer the sync is regained, such that upon
reaching the source it is playing more or less in realtime (and other
sources they were nearby originally have moved out of sync).

....

one notable lacking effect though is echo-modeling (or dealing with sounds
being otherwise blocked or distorted by geometry), since this is
computationally expensive (an "echo effect", "dampen effect", ... being much
cheaper).

Sigmund Lappegård Lahn · Jul 8, 2008

Sound programming in C is involved and highly system dependent. A cross
platform helper library would _be_ "a bunch of garbage added in there", and
would not (necessarily) reflect the way the sound hardware works in
practice.

The least crufty library I know of only does sound output --
http://xiph.org/ao/

If you are interested in sound synthesis or analysis I would recommend ChucK
instead -- http://chuck.cs.princeton.edu/

Of course, there is always Pure Data (http://puredata.info/) or its
commercial sibling, Max/MSP (http://www.cycling74.com/)

-Sigmund

kid joe · Jun 8, 2009

On Sun, 07 Jun 2009 21:42:55 +0100, kid joe wrote:
<snip>

Hi all,

I'd like to point out that the above message was a forgery not sent by me.
I know that sound programming is OT here.

Cheers,
Joe

Guest · Jun 8, 2009

I'd like to point out that the above message was a forgery not sent by me.
I know that sound programming is OT here.

Cheers,
Joe

--
...................... o _______________ _,
` Good Morning! , /\_ _| | .-'_|
`................, _\__`[_______________| _| (_|
] [ \, ][ ][ (_|

I *thought* it was odd that the ascii art was missing!

BartC · Jun 8, 2009

kid joe said:
On Sun, 07 Jun 2009 21:42:55 +0100, kid joe wrote:
<snip>

Hi all,

I'd like to point out that the above message was a forgery not sent by me.
I know that sound programming is OT here.

I was thinking of offering him this program to help him get started. I'm
glad I didn't bother now.

#include <stdio.h>

int main(void)
{
    printf("\a");   /* '\a' is the alert (BEL) character */
    return 0;
}

cr88192 · Jun 8, 2009

BartC said:
I was thinking of offering him this program to help him get started. I'm
glad I didn't bother now.

#include <stdio.h>

int main(void)
{
    printf("\a");
}

would be amusing though if there were an OS where one could do, for
example, midi via printf statements...

all one has to do is embed each midi command into a form vaguely resembling
ANSI codes (but with no timing info, so the app would have to implement its
own time delays and send midi in real-time, or the delays would be
optional...).

ok, going OT:

(actually, I am almost tempted now to make a textual midi serialization...
unlike traditional midi it would be "human readable...", and potentially
even "human editable", and FWIW we need not care much if it is 2x-5x the
original size...).

T8 C0,1 N0,69,127 T64 M0,69

delay 8 ticks; program-change chan=0 to instrument=1; note-on chan=0,
note=69, vel=127;
delay 64 ticks; note-off chan=0, note=69, vel=127 (default value here).

or, in summary: play a 440Hz note on a piano, with the whole sequence taking
0.6s (default: rate=120, q_note=1s).
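
The arithmetic behind that summary can be checked with a small sketch. It assumes that "rate" means ticks per second (which is what makes 8 + 64 ticks come out as 0.6 s) and uses the standard equal-temperament convention that MIDI note 69 is A4 at 440 Hz. The function names are hypothetical.

```c
#include <assert.h>

/* Assumption: rate is expressed in ticks per second. */
static double ticks_to_seconds(unsigned ticks, unsigned rate)
{
    assert(rate > 0);
    return (double)ticks / (double)rate;
}

/* Equal temperament, A4 = MIDI note 69 = 440 Hz. Steps one semitone at a
   time so that no math library is needed. */
static double midi_note_to_hz(int note)
{
    const double semitone = 1.0594630943592953;   /* 2^(1/12) */
    double hz = 440.0;
    for (; note > 69; note--) hz *= semitone;
    for (; note < 69; note++) hz /= semitone;
    return hz;
}
```

So ticks_to_seconds(8 + 64, 120) gives 0.6, and notes 57 and 81 land an octave below and above A4 (about 220 Hz and 880 Hz).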

a more verbose notation could be easier to follow (for example, command
mnemonics in place of letters), but could make sequences intolerably long,
and a learning curve of associating letters with commands may well be
reasonable (after all, whoever uses it would still have to deal with the
"rest" of midi terribleness...).

or such...

Nate Eldredge · Jun 8, 2009

cr88192 said:
(actually, I am almost tempted now to make a textual midi serialization...
unlike traditional midi it would be "human readable...", and potentially
even "human editable", and FWIW we need not care much if it is 2x-5x the
original size...).

T8 C0,1 N0,69,127 T64 M0,69

delay 8 ticks; program-change chan=0 to instrument=1; note-on chan=0,
note=69, vel=127;
delay 64 ticks; note-off chan=0, note=69, vel=127 (default value here).

or, in summary: play a 440Hz note on a piano, with the whole sequence taking
0.6s (default: rate=120, q_note=1s).

GW-BASIC's PLAY statement did something like this. Simpler, because it
was for a PC speaker instead of MIDI, and with a more music-like
notation (you wrote notes like "a4" rather than giving frequency and
duration), but certainly fun.

luserXtrog · Jun 9, 2009

I wrote a midi "compiler" years ago to generate .mid files (too cheap
to buy a drum machine). It read a line-oriented sequence of notes,
that was something like this:
<measure>.<beat> <note> <duration>
1.1 A4 .4
1.2 C5 .3
1.3 E5 .2
1.4 A5 .1

Using the duration number, it would add the corresponding note-off
later. IIRC, the midi file format requires a sequence number for
each event, so "compiling" amounted to simply sorting the entries
by number.
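
A hedged sketch of that "expand, then sort" step, not a reconstruction of the lost program. It assumes the measure.beat positions and durations have already been converted to absolute ticks; note_event and compile_notes are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { long tick; int on; int note; } note_event;

/* Order by time; at the same tick a note-off sorts before a note-on. */
static int event_cmp(const void *a, const void *b)
{
    const note_event *x = a, *y = b;
    if (x->tick != y->tick)
        return x->tick < y->tick ? -1 : 1;
    return x->on - y->on;   /* off (0) sorts before on (1) */
}

/* Each input note becomes a note-on at start[i] and a note-off at
   start[i] + len[i]. out needs room for 2 * n events. */
static size_t compile_notes(const long *start, const long *len,
                            const int *note, size_t n, note_event *out)
{
    for (size_t i = 0; i < n; i++) {
        assert(len[i] > 0);
        out[2 * i]     = (note_event){ start[i],          1, note[i] };
        out[2 * i + 1] = (note_event){ start[i] + len[i], 0, note[i] };
    }
    qsort(out, 2 * n, sizeof *out, event_cmp);
    return 2 * n;
}
```

Writing an actual .mid file would additionally need each sorted event's time turned into a delta from the previous event.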

But note-by-note quickly became tedious, so it eventually got a
macro-processor, too. It's safely on a dead disk that I've yet
to get looked at.

cr88192 · Jun 9, 2009

I wrote a midi synth a few months ago (takes midi, produces waveform...).

my idea was to make a more-or-less textual transcription of the MIDI opcode
stream (although I would change delays into commands). this way, I would
have the full capabilities of the synth at-hand.

it is not clear if some "gloss" would be in-line here...

luserXtrog · Jun 9, 2009

[overdue snippage]

cr88192 said:
I wrote a midi synth a few months ago (takes midi, produces waveform...).

my idea was to make a more-or-less textual transcription of the MIDI opcode
stream (although I would change delays into commands). this way, I would
have the full capabilities of the synth at-hand.

That sounds cool. But hopefully you'll have some niceties
like referring to notes by name, velocities by dynamic
(pp,p,mp,mf,f,ff), and those variable-length numbers.
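
For anyone who hasn't met the "variable-length numbers": Standard MIDI Files store delta times as variable-length quantities, 7 bits per byte with the high bit set on every byte except the last. A small encoder sketch (vlq_encode is a hypothetical name):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode value (below 2^28) as a MIDI variable-length quantity.
   Most significant 7-bit group first. Returns bytes written (1..4). */
static size_t vlq_encode(uint32_t value, uint8_t out[4])
{
    uint8_t tmp[4];
    size_t n = 0;
    assert(value < 0x10000000u);
    do {
        tmp[n++] = (uint8_t)(value & 0x7F);  /* collect groups, low first */
        value >>= 7;
    } while (value != 0);
    for (size_t i = 0; i < n; i++) {
        out[i] = tmp[n - 1 - i];             /* emit high group first */
        if (i + 1 < n)
            out[i] |= 0x80;                  /* continuation bit */
    }
    return n;
}
```

A textual notation gets to skip all of this, since a delay is just written as a decimal number.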

cr88192 said:
it is not clear if some "gloss" would be in-line here...

I don't understand that fragment. Do you mean we shouldn't
hijack this thread for midi stuff? Do you mean my post
should have led-in to boasting about some old program
with some sort of segue? Do you mean the glissando effect
could be dealt with via some sort of macro-expansion?

cr88192 · Jun 9, 2009

this combines both points.
by "gloss" I meant what you meant by "niceties"...

basically, I am not sure if such niceties would be appropriate, but I guess
some could be added...

possible features being:
note names;
instrument names;
....

so, one could type:
"C0,synbass" vs "C0,38"

but, doing this almost demands command mnemonics, ...
"PC 0,synbass NtOn 0,A4,127"

and it is no longer clear exactly how far to go...

but, then again, I guess it could also be argued that if tools were going to
have to rely on lots of knowledge of MIDI, they may almost just as well
craft the opcode stream manually, ... but, at the same time, lots of
knowledge of MIDI would still be necessary, for example, to know how to
operate the various controllers, ...

as well, with so many names and mnemonics, there could easily be a 10x-20x
inflation vs the binary form, ...

so, this is why my original idea had leaned more towards minimalism, so
that it would be intended mostly for machine processing and for limited human
usage, rather than something aiming more for human readability... (nonetheless,
it could still be read and written, even if possibly with the help of
a few text-files containing tables...).

in any case, it will still be far more readable than a hexdump, which would
help with debugging, ... as well as being conveniently representable in C
strings, easier to process and craft than the raw opcodes, ...

as is, spaces will be optional...
"C0,1T8N0,69,127T64M0,64"
would do the same as:
"C0,1 T8 N0,69,127 T64 M0,64"
and the same as:
"C 0,1 T 8 N 0,69,127 T 64 M 0,64"
....

of course, I could use channel numbers 1-16 vs 0-15, so that at least
channel 10 is percussion, vs 9...

"C1,1T8N1,69,127T64M1,64"

or such...

luserXtrog · Jun 10, 2009

cr88192 said:
this combines both points.
by "gloss" I meant what you meant by "niceties"...

basically, I am not sure if such niceties would be appropriate, but I guess
some could be added...

Perhaps two phases? Like an assembler and compiler. The niceties
would work at a higher level (perhaps little more than macro
expansion), and the assembler listing would use the thinnest
set of mnemonics necessary not to require instrument numbers and
"note on" to be looked up in order to read.

cr88192 said:
possible features being:
note names;
instrument names;
...

so, one could type:
"C0,synbass" vs "C0,38"

but, doing this almost demands command mnemonics, ...
"PC 0,synbass NtOn 0,A4,127"

Not very hard with an enum, a char *[], and an X-macro.

#define instruments \
X(synbass) \
X(guitar) \
X(organ4)

#define X(a) a,
enum { instruments } einst;
#undef X

#define X(a) #a,
char *sinst[] = { instruments };
#undef X

A for loop with strcmp to turn a string into an enum,
and a simple indexing to turn the enum back to a string.
Quick. Painless.
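
Spelled out, the two lookups might look like the sketch below. The INST_ prefix, INST_COUNT, and the function names are my additions for illustration; note that the enum values are just positions in the list, not General MIDI program numbers.

```c
#include <assert.h>
#include <string.h>

#define INSTRUMENTS \
    X(synbass) \
    X(guitar)  \
    X(organ4)

/* First expansion: one enum constant per instrument. */
#define X(a) INST_##a,
enum instrument { INSTRUMENTS INST_COUNT };
#undef X

/* Second expansion: the matching table of names. */
#define X(a) #a,
static const char *const inst_names[] = { INSTRUMENTS };
#undef X

/* String -> enum: a for loop with strcmp. Returns -1 if unknown. */
static int inst_from_name(const char *name)
{
    assert(name != NULL);
    for (int i = 0; i < INST_COUNT; i++)
        if (strcmp(name, inst_names[i]) == 0)
            return i;
    return -1;
}

/* Enum -> string: simple indexing, with a bounds check. */
static const char *inst_to_name(int inst)
{
    return (inst >= 0 && inst < INST_COUNT) ? inst_names[inst] : NULL;
}
```

Adding an instrument then means adding one X(...) line; the enum and the name table cannot drift apart.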

cr88192 said:
and it is no longer clear exactly how far to go...

but, then again, I guess it could also be argued that if tools were going to
have to rely on lots of knowledge of MIDI, they may almost just as well
craft the opcode stream manually, ... but, at the same time, lots of
knowledge of MIDI would still be necessary, for example, to know how to
operate the various controllers, ...

as well, with so many names and mnemonics, there could easily be a 10x-20x
inflation vs the binary form, ...

so, this is why my original idea had leaned more towards minimalism, mostly
so it would be mostly intended for machine processing, and for limited human
usage, rather than something aiming more for human readability... (none the
less, it could still be read and written, even if possibly with the help of
a few text-files containing tables...).

in any case, it will still be far more readable than a hexdump, which would
help with debugging, ... as well as being conveniently representable in C
strings, easier to process and craft than the raw opcodes, ...

Totally.

cr88192 said:
as is, spaces will be optional...
"C0,1T8N0,69,127T64M0,64"
would do the same as:
"C0,1 T8 N0,69,127 T64 M0,64"
and the same as:
"C 0,1 T 8 N 0,69,127 T 64 M 0,64"
...

All spaces are optional? That's a little wacko jacko.

cr88192 said:
of course, I could use channel numbers 1-16 vs 0-15, so that at least
channel 10 is percussion, vs 9...

It's nice when the interface uses the same terminology as the
documentation. If a midi-knowledgeable non-programmer musician
tried to use your system, I suspect (s)he would expect 10 rather
than 9 to be drums.

cr88192 said:
"C1,1T8N1,69,127T64M1,64"

or such...

Niceties are nice; but I do agree that a barebones version
would be much more flexible for the expert.

cr88192 · Jun 10, 2009

luserXtrog said:
Perhaps two phases? Like an assembler and compiler. The niceties
would work at a higher level (perhaps little more than macro
expansion), and the assembler listing would use the thinnest
set of mnemonics necessary not to require instrument numbers and
"note on" to be looked up in order to read.

yeah, maybe...

yesterday, I got around to implementing part of the process (binary MIDI ->
ASCII).
on average, there seems to be about a 2.9x inflation...

luserXtrog said:
Not very hard with an enum, a char *[], and an X-macro.

#define instruments \
X(synbass) \
X(guitar) \
X(organ4)

#define X(a) a,
enum { instruments } einst;
#undef X

#define X(a) #a,
char *sinst[] = { instruments };
#undef X

A for loop with strcmp to turn a string into an enum,
and a simple indexing to turn the enum back to a string.
Quick. Painless.

lookups are not too difficult...
the issue though is that it would require more effort to parse the token.

with raw numbers I can get by with a while loop:
i = 0; while ((*s >= '0') && (*s <= '9')) i = i * 10 + ((*s++) - '0');

basically, I was designing a character-level syntax, rather than a
tokenizing syntax...
handling multiple token-types in a character-level syntax is a pain...

luserXtrog said:
Totally.

yep...

I had noticed though that, since my single-letter mnemonics (A,B,C,D,E) are
mapped 1:1 with the hex values, in many cases the prefix of many commands
ends up resembling the hex version...

luserXtrog said:
All spaces are optional? That's a little wacko jacko.

the spaces are optional, since to parse them I insert periodic
whitespace-eating loops.
as noted, this is a char-level syntax rather than a tokenizing one.

similarly, for many tasks we don't really need the spaces anyways...
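
A sketch of what such a character-level parser could look like, with the whitespace-eating loops folded into the number reader. This is my guess at the shape, not the poster's code: txt_event and the function names are hypothetical, T/C/N/M follow the examples in this thread, the status bytes 0xC0/0x90/0x80 are the standard MIDI program-change/note-on/note-off values, and the optional velocity on M is not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint32_t delta; uint8_t bytes[3]; uint8_t len; } txt_event;

static const char *skip_ws(const char *s)
{
    while (*s == ' ' || *s == '\t' || *s == '\n' || *s == '\r') s++;
    return s;
}

/* Read a decimal number, eating whitespace on both sides.
   Returns NULL if no digit is found. */
static const char *read_num(const char *s, unsigned *out)
{
    unsigned v = 0;
    s = skip_ws(s);
    if (*s < '0' || *s > '9') return NULL;
    while (*s >= '0' && *s <= '9') v = v * 10 + (unsigned)(*s++ - '0');
    *out = v;
    return skip_ws(s);
}

/* Parse T<ticks>, C<ch>,<prog>, N<ch>,<note>,<vel>, M<ch>,<note>.
   Returns the number of events stored, or -1 on a syntax error. */
static int parse_text_midi(const char *s, txt_event *ev, int max)
{
    int n = 0;
    uint32_t pending = 0;               /* delay accumulated by T commands */
    assert(s != NULL && ev != NULL);
    for (s = skip_ws(s); *s; ) {
        char op = *s++;                 /* the command letter drives the logic */
        unsigned a[3] = {0, 0, 127};    /* note-off velocity defaults to 127 */
        int want = (op == 'T') ? 1 : (op == 'C' || op == 'M') ? 2
                 : (op == 'N') ? 3 : 0;
        if (want == 0) return -1;
        for (int i = 0; i < want; i++) {
            if (i > 0) { if (*s != ',') return -1; s++; }
            s = read_num(s, &a[i]);
            if (s == NULL) return -1;
        }
        if (op == 'T') { pending += a[0]; continue; }
        if (n >= max) return -1;
        ev[n].delta = pending;
        pending = 0;
        ev[n].bytes[0] = (uint8_t)((op == 'C' ? 0xC0 : op == 'N' ? 0x90 : 0x80)
                                   | (a[0] & 0x0F));
        ev[n].bytes[1] = (uint8_t)(a[1] & 0x7F);
        ev[n].bytes[2] = (uint8_t)(a[2] & 0x7F);
        ev[n].len = (op == 'C') ? 2 : 3;
        n++;
    }
    return n;
}
```

Because every number read swallows the whitespace around it, the packed, spaced, and fully spaced spellings quoted in this thread all produce the same three events.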

luserXtrog said:
It's nice when the interface uses the same terminology as the
documentation. If a midi-knowledgeable non-programmer musician
tried to use your system, I suspect (s)he would expect 10 rather
than 9 to be drums.

maybe, I hadn't really considered human-produced data to be all
that important...

most data then would likely come either from prior-transcribed binary midi,
or from procedural generation (such as "drum machine" loops, ...).

of course, for procedural generation, support for absolute timecodes (and
the ability to have events out of order) would be convenient, but supporting
this would require a much more complex process to transcribe into binary
midi.

it is much simpler to produce multiple flat streams and then to merge them
later (I already have the logic for this for binary midi, for text midi
likely I would just convert to binary and merge this way, and then probably
convert back to ASCII...).
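
Merging two flat, delta-timed streams can be sketched as below: track absolute time per input, always take the earlier event, and write deltas back out. The names (delta_event, merge_streams) and the single int payload are placeholders of mine, not the poster's data structures.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { unsigned long delta; int data; } delta_event;

/* Merge streams a and b into out, which needs room for na + nb events.
   On a tie, the event from a goes first. Returns the event count. */
static size_t merge_streams(const delta_event *a, size_t na,
                            const delta_event *b, size_t nb,
                            delta_event *out)
{
    size_t i = 0, j = 0, n = 0;
    unsigned long ta = 0, tb = 0, now = 0;   /* absolute ticks */
    assert(out != NULL);
    if (na) ta = a[0].delta;
    if (nb) tb = b[0].delta;
    while (i < na || j < nb) {
        int take_a = (j >= nb) || (i < na && ta <= tb);
        unsigned long t = take_a ? ta : tb;
        out[n].delta = t - now;              /* back to a delta time */
        out[n].data = take_a ? a[i].data : b[j].data;
        now = t;
        n++;
        if (take_a) { if (++i < na) ta += a[i].delta; }
        else        { if (++j < nb) tb += b[j].delta; }
    }
    return n;
}
```

Since each input is already in time order, this is a single linear pass with no sorting, which is presumably why the flat-stream approach is the simpler one.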

another possible approach could be to support such an ASCII form as well as
the binary form for driving the synth, but for now this is not needed (I can
convert to binary easily enough before sending it into the synth...).

luserXtrog said:
Niceties are nice; but I do agree that a barebones version
would be much more flexible for the expert.

yep, as well as being fast...

I am actually designing the notation more for machines than for humans, the
textual form is mostly to make inspecting data easier, and also making it a
little less effort to craft the data (vs the binary form where one has to
worry about a lot of little detail things...).

but, the "textual" aspect is more a sideband aspect...

FWIW, I previously designed a bytecode/interpreter which worked this
way (for GP tasks), where the bytecode was represented in ASCII, and so
could be displayed via printf. internally though, it was structured
much like any other simplistic interpreter.

it is also similar to how I often manage "signature strings", where the
ASCII form is canonical (there is no binary form), but most of the code
which handles them is based around loops and switches, and performs
similarly to if it were a binary representation. (note that sig strings do
not allow any whitespace, every character is meaningful and itself drives
the logic...).

so, ASCII need not be token based and slow...
