Using threads for audio computing?

lgabiot

Hello,

I'd like to be able to analyze incoming audio from a sound card using
Python, and I'm trying to establish a correct architecture for this.

Getting the audio is OK (using PyAudio), as are the calculations
needed, so I won't discuss those. My question is about the general idea of
doing two things at (roughly) the same time: getting audio and performing
calculations on it, without losing any incoming audio.
I also assume that my calculations on the audio will finish faster than
the time it takes to acquire that audio, so that the application would be
almost real time.


So far my idea (which works according to the small tests I did) consists
of using a Queue object as a buffer for the incoming audio and two
threads, one to feed the queue, the other to consume it.


The queue could store the audio as a collection of NumPy arrays of x samples each.
The first thread's job would be to put() new chunks of audio into the queue
as they are received from the sound card, while the second would
get() chunks from the queue and perform the necessary calculations on them.

Am I heading in the right direction, or is there a better general idea?

Thanks!
 
Roy Smith

If you are going to use threads, the architecture you describe seems
perfectly reasonable. It's a classic producer-consumer pattern.
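
For reference, the producer-consumer arrangement described above could be sketched roughly as follows. This is only an illustration using the standard library: the lists of integers stand in for audio chunks, the peak value stands in for the real calculation, and the names `producer`, `consumer`, `run_pipeline` and `SENTINEL` are made up for the example.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker


def producer(q, chunks):
    """Stand-in for the audio thread: push each chunk as it 'arrives'."""
    for chunk in chunks:
        q.put(chunk)
    q.put(SENTINEL)


def consumer(q, results):
    """Pull chunks and run a placeholder calculation (here: the peak value)."""
    while True:
        chunk = q.get()          # blocks until a chunk is available
        if chunk is SENTINEL:
            break
        results.append(max(abs(s) for s in chunk))


def run_pipeline(chunks):
    q = queue.Queue()
    results = []
    threads = [
        threading.Thread(target=producer, args=(q, chunks)),
        threading.Thread(target=consumer, args=(q, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


print(run_pipeline([[1, -3, 2], [0, 5, -4]]))  # [3, 5]
```

With a single consumer the results come out in the order the chunks were queued, since Queue is first-in, first-out.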

But, I wonder if you even need anything this complicated. Using a queue
to buffer work between threads makes sense if the workload presented is
uneven. Sometimes you'll get a burst of work all at once and don't have
the capacity to process it in real-time, so you want to buffer it up.

I would think sampling audio would be a steady stream. Every x ms, you
get another chunk of samples, like clockwork. Is this not the case?
 
lgabiot

On 11/05/14 16:40, Roy Smith wrote:
If you are going to use threads, the architecture you describe seems
perfectly reasonable. It's a classic producer-consumer pattern.

But, I wonder if you even need anything this complicated. Using a queue
to buffer work between threads makes sense if the workload presented is
uneven. Sometimes you'll get a burst of work all at once and don't have
the capacity to process it in real-time, so you want to buffer it up.

I would think sampling audio would be a steady stream. Every x ms, you
get another chunk of samples, like clockwork. Is this not the case?

Thanks for your answer,

Yes, I guess I can consider audio as a steady stream. PyAudio gives me
the audio samples in small chunks (2048 samples at a time, for instance,
while the sound card delivers 48,000 samples per second). I accumulate the
samples into a NumPy array, and once the array has reached the
needed size (for instance 5 seconds of audio), I put it in
the queue. So I think you are right that every 5 seconds I
get a new chunk of audio to work on. Then I perform a calculation on
these 5 seconds of audio (which must finish in less than 5 seconds,
so that I am ready to process the next 5-second chunk), but
meanwhile I still need to keep collecting the next 5-second
chunk from PyAudio. Hence my system.
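
The accumulation step described above might look something like this sketch. It works on raw bytes rather than NumPy arrays to stay within the standard library, and it assumes 16-bit mono audio, which the post does not state. `read_chunk` is a hypothetical stand-in for whatever call returns one 2048-sample buffer.

```python
RATE = 48000          # samples per second, from the post
CHUNK = 2048          # samples per read, from the post
BLOCK_SECONDS = 5     # block length, from the post
BYTES_PER_SAMPLE = 2  # assumption: 16-bit mono

BLOCK_BYTES = RATE * BLOCK_SECONDS * BYTES_PER_SAMPLE  # 480000


def accumulate(read_chunk, n_blocks):
    """Call read_chunk() repeatedly and yield n_blocks fixed-size blocks."""
    pending = bytearray()
    produced = 0
    while produced < n_blocks:
        pending += read_chunk()
        while len(pending) >= BLOCK_BYTES and produced < n_blocks:
            yield bytes(pending[:BLOCK_BYTES])
            del pending[:BLOCK_BYTES]   # keep the leftover for the next block
            produced += 1


# Simulated source: each read returns one buffer of silence.
fake_read = lambda: b"\x00" * (CHUNK * BYTES_PER_SAMPLE)
blocks = list(accumulate(fake_read, 2))
print(len(blocks), len(blocks[0]))  # 2 480000
```

Note that 240,000 samples is not a whole number of 2048-sample buffers, so the leftover from one block has to be carried into the next rather than discarded.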

I guess if my calculation had to be performed on a small number of
samples (i.e. no more than the PyAudio buffer size, 2048 samples
for instance), and the calculation took less time than it
takes to get the next 2048 samples from PyAudio, I wouldn't need the
Queue and Thread system.
But in my case, where I need a large buffer, it might not work?
Unless I ask PyAudio to feed me 5-second chunks directly (instead
of the usual buffer sizes: 1024, 2048, etc.), which I didn't try
because I hadn't thought of it.
 
lgabiot

On 11/05/14 17:40, lgabiot wrote:
I guess if my calculation had to be performed on a small number of
samples (i.e. no more than the PyAudio buffer size, 2048 samples
for instance), and the calculation took less time than it
takes to get the next 2048 samples from PyAudio, I wouldn't need the
Queue and Thread system.
But in my case, where I need a large buffer, it might not work?
Unless I ask PyAudio to feed me 5-second chunks directly (instead
of the usual buffer sizes: 1024, 2048, etc.), which I didn't try
because I hadn't thought of it.

I guess this solution would probably not work, since it would mean that
the calculation would have to be quick enough not to last longer than 1
sample (1/48000 s for instance), since no audio would be ingested while
the calculation runs (unless PyAudio has some kind of internal
concurrency system).
This leads me to think that a buffer (queue) and separate threads
(producer and consumer) are necessary for this task.

But AFAIK the Python GIL (or, on smaller or older computers, having
only one core) does not permit true parallel execution of two threads. I
believe it is quite like the way multiple processes are handled by an OS
on a single-CPU computer: process A gets x CPU cycles, then process B gets
y CPU cycles, etc.
So in my case, I must have a way to make sure that
thread 1 (which gets audio from PyAudio and put()s it in the Queue) is
not interrupted long enough to miss a sample.
If I suppose a worst-case scenario for the computer, like a
Raspberry Pi, the CPU speed is 700 MHz, which gives approx. 14,000 CPU
cycles between consecutive audio samples (at a 48 kHz sampling rate). I
don't know if 14,000 CPU cycles is a lot or not for the task at hand.
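
A quick check of the arithmetic above, together with the same figure per 2048-sample buffer. The per-buffer numbers rest on an assumption not made in the post: that audio reaches the program one buffer at a time rather than one sample at a time.

```python
CPU_HZ = 700_000_000   # Raspberry Pi clock speed quoted in the post
RATE = 48_000          # samples per second
CHUNK = 2048           # samples per buffer

cycles_per_sample = CPU_HZ / RATE          # budget if every sample were a deadline
chunk_ms = 1000 * CHUNK / RATE             # duration of one buffer in milliseconds
cycles_per_chunk = CPU_HZ * CHUNK / RATE   # budget if one buffer is the deadline

print(round(cycles_per_sample), round(chunk_ms, 1), round(cycles_per_chunk))
# 14583 42.7 29866667
```

So the "approx. 14,000 cycles" figure holds per sample, while a whole 2048-sample buffer corresponds to roughly 43 ms, or about 30 million cycles.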

Well, at least, that is what I understand, but since I'm really both a
beginner and a hobbyist, I might be totally wrong...
 
Chris Angelico

The GIL is almost completely insignificant here. One of your threads
will be blocked practically the whole time (waiting for more samples;
collecting them into a numpy array doesn't take long), and the other
is, if I understand correctly, spending most of its time inside numpy,
which releases the GIL. You should be able to thread just fine.
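
A small way to see the first half of that point for yourself: a thread that is blocked waiting does not hold the GIL. In this sketch `time.sleep` stands in for "waiting for more samples" (an assumption made for illustration; it does not exercise PyAudio or NumPy), and the function names are made up.

```python
import threading
import time


def blocking_wait(seconds):
    time.sleep(seconds)  # blocking call; the GIL is released while waiting


def timed_parallel_waits(seconds=0.2, n_threads=2):
    """Run n_threads blocking waits at once and return the wall-clock time."""
    threads = [
        threading.Thread(target=blocking_wait, args=(seconds,))
        for _ in range(n_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


elapsed = timed_parallel_waits()
print(f"two 0.2 s waits took {elapsed:.2f} s")
```

If the waits were serialized, the total would be about 0.4 s; in practice it comes out close to 0.2 s because both threads wait at the same time.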

ChrisA
 
lgabiot

On 12/05/14 07:41, Chris Angelico wrote:
The GIL is almost completely insignificant here. One of your threads
will be blocked practically the whole time (waiting for more samples;
collecting them into a numpy array doesn't take long), and the other
is, if I understand correctly, spending most of its time inside numpy,
which releases the GIL. You should be able to thread just fine.

ChrisA
Thanks Chris for your answer.

So back to my original question: do a Queue and two threads
(producer/consumer) seem a good answer to my problem, or is there a
better way to solve it?
(Again, I'm really a beginner, so I made up this solution, but I really
wonder whether I'm missing a well-known, obvious, much better idea.)
 
Chris Angelico

Well, the first thing I'd try is simply asking for more data when
you're ready for it - can you get five seconds of data all at once?
Obviously this won't work if your upstream buffers only a small
amount, in which case your thread is there to do that buffering; also,
if you can't absolutely *guarantee* that you can process the data
quickly enough, every time, then you need to use the queue to buffer
that.

But otherwise, it sounds like a quite reasonable way to do things.

ChrisA
 
lgabiot

On 12/05/14 07:58, Chris Angelico wrote:
Well, the first thing I'd try is simply asking for more data when
you're ready for it - can you get five seconds of data all at once?
Obviously this won't work if your upstream buffers only a small
amount, in which case your thread is there to do that buffering; also,
if you can't absolutely *guarantee* that you can process the data
quickly enough, every time, then you need to use the queue to buffer
that.

But otherwise, it sounds like a quite reasonable way to do things.

ChrisA

Ok, thanks a lot!
 
Stefan Behnel

lgabiot, 12.05.2014 07:33:
On 11/05/14 17:40, lgabiot wrote:

I guess this solution would probably not work, since it would mean that the
calculation would have to be quick enough not to last longer than 1 sample
(1/48000 s for instance), since no audio would be ingested while the
calculation runs (unless PyAudio has some kind of internal concurrency system).
This leads me to think that a buffer (queue) and separate threads
(producer and consumer) are necessary for this task.

This sounds like a use case for double buffering. Use two buffers, start
filling one. When it's full, switch buffers, start filling the second and
process the first. When the second is full, switch again.

Note that you have to make sure that the processing always terminates
within the time it takes to fill the other buffer. If you can't assure
that, however, you have a problem anyway and should see if there's a way to
improve your algorithm.

If the "fill my buffer" call in PyAudio is blocking (i.e. if it returns
only after filling the buffer), then you definitely need two threads for this.
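
As a rough sketch of the double-buffering idea with two threads: exactly two buffers circulate between a "free" queue and a "full" queue, so one is being filled while the other is processed. This is one possible arrangement, not necessarily what was meant above; `BUFFER_SIZE`, the function names, and the use of `sum` as the calculation are all made up for illustration, and a plain iterable stands in for the audio source.

```python
import queue
import threading

BUFFER_SIZE = 4  # samples per buffer; tiny, for illustration only


def fill(source, free, full):
    """Fill whichever buffer is free, then hand it over for processing."""
    samples = iter(source)
    while True:
        buf = free.get()       # with live audio, waiting here would mean lost samples
        n = 0
        for sample in samples:
            buf[n] = sample
            n += 1
            if n == BUFFER_SIZE:
                break
        if n < BUFFER_SIZE:    # source exhausted: drop the partial buffer and stop
            full.put(None)
            return
        full.put(buf)


def process(free, full, results):
    """Process each full buffer, then return it to be refilled."""
    while True:
        buf = full.get()
        if buf is None:
            return
        results.append(sum(buf))   # placeholder calculation
        free.put(buf)


def run_double_buffered(source):
    free, full = queue.Queue(), queue.Queue()
    for _ in range(2):             # exactly two buffers
        free.put([0] * BUFFER_SIZE)
    results = []
    workers = [
        threading.Thread(target=fill, args=(source, free, full)),
        threading.Thread(target=process, args=(free, full, results)),
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results


print(run_double_buffered(range(8)))  # [6, 22]
```

The comment in `fill` marks the constraint stated above: processing one buffer has to finish before the other one is full, otherwise the filler has nowhere to write.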

But AFAIK the Python GIL (or, on smaller or older computers, having only
one core) does not permit true parallel execution of two threads.

Not for code that runs in the *interpreter*, but it certainly allows I/O
and low-level NumPy array processing to happen in parallel, as they do not
need the interpreter.

Stefan
 
lgabiot

On 12/05/14 08:13, Stefan Behnel wrote:
This sounds like a use case for double buffering. Use two buffers, start
filling one. When it's full, switch buffers, start filling the second and
process the first. When the second is full, switch again.

Note that you have to make sure that the processing always terminates
within the time it takes to fill the other buffer. If you can't assure
that, however, you have a problem anyway and should see if there's a way to
improve your algorithm.

If the "fill my buffer" call in PyAudio is blocking (i.e. if it returns
only after filling the buffer), then you definitely need two threads for this.



Not for code that runs in the *interpreter*, but it certainly allows I/O
and low-level NumPy array processing to happen in parallel, as they do not
need the interpreter.

Stefan
Thanks for your answer.

If I follow your explanations, I guess I have to review my understanding
of the Python execution model (I have to admit it is quite crude anyway).

In my understanding, without threads, I would have two functions:
- get_audio() would get the 5 seconds of audio from PyAudio
- process_audio() would process the 5 seconds of audio

The main code would roughly execute this:

    while True:
        get_audio()
        process_audio()

Since the audio is a live feed (unlike, say, an audio file analyser
program), the get_audio() part must take 5
seconds to execute (though most probably the processor stays idle most of
the time during get_audio()).
Then, once get_audio() is done, process_audio() begins.
process_audio() will take some time. If that time is greater than the
time it takes for the next audio samples to arrive, I have a problem
(which maybe you already explained differently with:
If the "fill my buffer" call in PyAudio is blocking (i.e. if it returns
only after filling the buffer), then you definitely need two threads
for this.
)

So if I follow you, if the PyAudio part is "non-blocking" there would be
a way to make it work without the two-threads arrangement. I'm going back to the
PyAudio docs to try to get my head around the callback method, which
might be a good lead.
 
lgabiot

On 12/05/14 10:14, lgabiot wrote:
So if I follow you, if the PyAudio part is "non-blocking" there would be
a way to make it work without the two-threads arrangement. I'm going back to the
PyAudio docs to try to get my head around the callback method, which
might be a good lead.

So far, if I understand PyAudio correctly, the callback method is a way
to do some sort of computing on a PyAudio stream by declaring a
function (the "callback") when the stream is opened; the callback
function is then executed in a separate thread (as per the PyAudio
documentation)...
Still investigating.
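
For what it's worth, a callback-based version might be sketched like this. The callback only hands each buffer to a queue, so it returns quickly. The stream parameters (16-bit, mono, 48 kHz, 2048 frames) are assumptions for the example, `open_input_stream` is a made-up helper that is defined but not called here because it needs the third-party pyaudio package and a sound card, and `PA_CONTINUE` is assumed to equal `pyaudio.paContinue`.

```python
import queue

PA_CONTINUE = 0  # assumption: same value as pyaudio.paContinue

audio_queue = queue.Queue()


def callback(in_data, frame_count, time_info, status):
    """Called by PyAudio on its own thread each time an input buffer is ready."""
    audio_queue.put(in_data)       # hand the raw bytes over to the consumer side
    return (None, PA_CONTINUE)     # no output data; keep the stream running


def open_input_stream():
    """Open a callback-mode input stream (requires pyaudio and audio hardware)."""
    import pyaudio  # third-party; imported lazily so the rest runs without it
    pa = pyaudio.PyAudio()
    stream = pa.open(
        format=pyaudio.paInt16,    # assumption: 16-bit samples
        channels=1,                # assumption: mono
        rate=48000,
        input=True,
        frames_per_buffer=2048,
        stream_callback=callback,
    )
    return pa, stream


# Simulate one callback invocation without any audio hardware.
print(callback(b"\x00\x01", 1, None, 0))  # (None, 0)
```

The main thread (or a second thread) would then call `audio_queue.get()` in a loop and do the heavy calculation there, which keeps the queue-based design from earlier in the thread.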
 
Sturla Molden

lgabiot said:
But AFAIK the Python GIL (or, on smaller or older computers, having
only one core) does not permit true parallel execution of two threads. I
believe it is quite like the way multiple processes are handled by an OS
on a single-CPU computer: process A gets x CPU cycles, then process B gets
y CPU cycles, etc.

Python threads are native OS threads. The GIL serializes access to the
Python interpreter.

If your thread is waiting for i/o or running computations in C or
Fortran (e.g. with NumPy), it does not need the Python interpreter.

Scientists and engineers use Python threads for "true parallel
processing" all the time. The FUD you will find about the GIL is written
by people who don't fully understand the issue.

lgabiot said:
So in my case, I must have a way to make sure that
thread 1 (which gets audio from PyAudio and put()s it in the Queue) is
not interrupted long enough to miss a sample.

Here you are mistaken. The DMA controller takes care of the audio i/o.
Your audio acquisition thread is asleep while its buffer fills up. You
don't miss a sample because your thread is interrupted.

You do, however, have to make sure your thread doesn't block on the write
to the Queue (use block=False in the call to Queue.put), but that is not a
"GIL issue".
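
A minimal illustration of the non-blocking put mentioned above. The bounded queue size and the helper name `offer` are made up for the example; what to do with a chunk that does not fit (drop it, count it, log it) is left open.

```python
import queue


def offer(q, item):
    """Try to enqueue without blocking; return False if the queue is full."""
    try:
        q.put(item, block=False)
        return True
    except queue.Full:
        return False


q = queue.Queue(maxsize=2)                   # bounded, so it can actually fill up
accepted = [offer(q, n) for n in range(3)]
print(accepted)  # [True, True, False]
```

A Queue created without maxsize is unbounded, so put never has to wait for space; block=False only makes a difference once a maximum size is set.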

In your case you basically have one thread waiting for the DMA controller
to fill up a buffer and another doing computations in NumPy. Neither
needs the GIL for most of its work.

If you are worried about the GIL you can always use processes
(multiprocessing, subprocess, or os.fork) instead of threads.

Sturla
 
