Handling large amounts of data


Wayne Marsh

Hi all.

I am working on an audio application which needs reasonably fast access to
large amounts of data. For example, the program may load a 120 second
stereo sound sample stored at 4 bytes per sample, which would mean over
40MB of data at a 44100 Hz sampling rate.

Now, what would be a good way to handle all of this data? Ideally, for the
sake of my own sanity and the algorithms within directly functional
portions of the code, I'd like to interface with the data via normal array
syntax. Are arrays of this size really suitable, or would there be a
better way? Writing the data to disk and then memory mapping the files
seemed like an option, although I suspect that would be analogous to the
operating system's virtual memory system.

Any ideas?
 

jacob navia

Wayne said:
Hi all.

I am working on an audio application which needs reasonably fast access
to large amounts of data. For example, the program may load a 120 second
stereo sound sample stored at 4 bytes per sample, which would mean over
40MB of data at a 44100 Hz sampling rate.

Now, what would be a good way to handle all of this data? Ideally, for
the sake of my own sanity and the algorithms within directly functional
portions of the code, I'd like to interface with the data via normal
array syntax. Are arrays of this size really suitable, or would there be
a better way? Writing the data to disk and then memory mapping the files
seemed like an option, although I suspect that would be analogous to the
operating system's virtual memory system.

Any ideas?

It depends on how much memory the computer has. If you work
in a PC environment, machines now routinely come equipped with 1GB
of RAM, and 40MB is nothing. Just load it all into RAM and use it as
an array. The VM system will do the paging for you if your OS is
Unix or Windows.

It would be surprising if you wanted to process all this data on an
embedded system with only a few KB of RAM anyway.

jacob
 

Mac

Wayne Marsh said:
Hi all.

I am working on an audio application which needs reasonably fast access to
large amounts of data. For example, the program may load a 120 second
stereo sound sample stored at 4 bytes per sample, which would mean over
40MB of data at a 44100 Hz sampling rate.

Now, what would be a good way to handle all of this data? Ideally, for the
sake of my own sanity and the algorithms within directly functional
portions of the code, I'd like to interface with the data via normal array
syntax. Are arrays of this size really suitable, or would there be a
better way? Writing the data to disk and then memory mapping the files
seemed like an option, although I suspect that would be analogous to the
operating system's virtual memory system.

Any ideas?

On a PC or similar, 40 MB isn't really that much data nowadays. I would
start by just reading the whole file into an array and accessing the data
via the array.

If that approach proves problematic, you can go back and try memory
mapping (which isn't really on topic here) or what have you. This should
have little effect on the rest of your program, so you lose nothing
if you later have to switch to the memory-mapped approach.

If you are not envisioning a PC-type environment, then that is another
story.

--Mac
 

Malcolm

Wayne Marsh said:
I am working on an audio application which needs reasonably fast access to
large amounts of data. For example, the program may load a 120 second
stereo sound sample stored at 4 bytes per sample, which would mean over
40MB of data at a 44100 Hz sampling rate.

Now, what would be a good way to handle all of this data?

There isn't really an answer. It depends on the exact platform and the
nature of the calculations performed.

Some computers will happily chomp 40MB of data and represent it as a flat
array, whilst others will struggle. Often there is no point trying to
implement a virtual memory system of your own if the OS will do it for you.
In other cases there is a point: for instance, if you need to access data in
5K chunks one megabyte apart, then a clever memory allocation scheme will
beat the standard swap-space algorithm hands down. However, your data is
audio, so it is unlikely you want to do this.

The question is, do you really need random access over the whole 40MB array,
or can you treat the data as streamed? If you can treat it as streamed, then
it is probably best not to waste all that memory, unless you know that the
computer you are running on has the capacity to handle it and the memory
would otherwise simply go unused.
 
