Need speed increase while reading large chunks of data

Darsant

I'm currently reading 1-n binary files, each with 3 different arrays
of floats containing about 10,000 values apiece, for a total of about
30,000 values per file.

I'm looking for a way to load them all into memory. I've tried using
vector push_back with reserve, but it was horribly slow. The current
method I am using is, upon opening the file and reading the number of
values, to resize the vectors (I have 3, one for each data channel),
then insert the values with the [] operator:

vecLeftChannel[i] = DataBuffer[i];

This still takes a while to loop through the files, and I was wondering
if anyone knew a quicker way to load dynamic data like this without
using float arrays and reallocating manually.

Thanks!
Josh McFarlane
 
Alan Johnson

Darsant said:
I'm currently reading 1-n binary files, each with 3 different arrays
of floats containing about 10,000 values apiece, for a total of about
30,000 values per file.

I'm looking for a way to load them all into memory. I've tried using
vector push_back with reserve, but it was horribly slow. The current
method I am using is, upon opening the file and reading the number of
values, to resize the vectors (I have 3, one for each data channel),
then insert the values with the [] operator:

vecLeftChannel[i] = DataBuffer[i];

This still takes a while to loop through the files, and I was wondering
if anyone knew a quicker way to load dynamic data like this without
using float arrays and reallocating manually.

Thanks!
Josh McFarlane


std::vector's push_back is guaranteed to run in amortized constant
time (O(n) total for n insertions) if you are only adding to the end
of the vector. You say you tried that "with reserving". Do you mean
that you called reserve once prior to adding to the array (good
usage), or periodically as you read more values (bad usage, which can
result in non-linear running time)?
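
The good usage, for reference, looks something like this (a sketch;
src and total stand in for your read buffer and the precomputed sum):

#include <cstddef>
#include <vector>

// Good usage: one reserve, up front, sized for everything to come.
void fill(std::vector<float>& v, const float* src, std::size_t total)
{
    v.reserve(total);                // once, before the loop
    for (std::size_t i = 0; i < total; ++i)
        v.push_back(src[i]);         // no reallocation inside the loop
}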

If you don't mind temporarily using more memory, you could read
everything into a std::list, and then construct a vector from it with
something like:
std::vector<float> v(mylist.begin(), mylist.end());
mylist.clear();

This might be faster than adding directly to the vector (though by at
most a constant factor).

Probably the fastest possible way to access your data is to use
OS-specific mechanisms for creating a memory-mapped file (see open and
mmap on Unix variants, or CreateFile, CreateFileMapping, and
MapViewOfFile on Win32).
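
For the Unix case, a minimal sketch (assuming the file holds nothing
but raw floats in native byte order; error handling trimmed):

#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cstddef>

// Map the file into memory; the floats can then be used in place,
// with no read loop and no copy into a vector at all.
const float* map_floats(const char* path, std::size_t& count)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return 0;
    struct stat st;
    fstat(fd, &st);
    count = st.st_size / sizeof(float);
    void* p = mmap(0, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  // the mapping remains valid after close
    return p == MAP_FAILED ? 0 : static_cast<const float*>(p);
}

Release the mapping later with munmap.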

-Alan
 
wkaras

Darsant said:
I'm currently reading 1-n binary files, each with 3 different arrays
of floats containing about 10,000 values apiece, for a total of about
30,000 values per file.

I'm looking for a way to load them all into memory. I've tried using
vector push_back with reserve, but it was horribly slow. The current
method I am using is, upon opening the file and reading the number of
values, to resize the vectors (I have 3, one for each data channel),
then insert the values with the [] operator:

vecLeftChannel[i] = DataBuffer[i];

This still takes a while to loop through the files, and I was wondering
if anyone knew a quicker way to load dynamic data like this without
using float arrays and reallocating manually.


I think there's a function like "seek" or "fseek" or something
to seek to the end of file. Either the seek function returns
a byte offset, or you call a "position" or "pos" function to
get the byte offset, which is effectively the size of the file
in bytes. You then resize the vector to
sum_of_file_sizes / sizeof(float). You should then get reasonable
performance reading in the floats one at a time using buffered
I/O.
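
In standard C the names are fseek/ftell; a sketch of the whole idea
(with one bulk fread in place of per-float reads, and assuming the
file is nothing but raw floats):

#include <cstdio>
#include <vector>

// Find the size with fseek/ftell, size the vector once, then read.
std::vector<float> load_file(const char* path)
{
    std::vector<float> v;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return v;
    std::fseek(f, 0, SEEK_END);
    long bytes = std::ftell(f);               // file size in bytes
    std::fseek(f, 0, SEEK_SET);
    v.resize(bytes / sizeof(float));
    std::fread(&v[0], sizeof(float), v.size(), f);
    std::fclose(f);
    return v;
}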
 
Darsant

wkaras said:
I think there's a function like "seek" or "fseek" or something
to seek to the end of file. Either the seek function returns
a byte offset, or you call a "position" or "pos" function to
get the byte offset, which is effectively the size of the file
in bytes. You then resize the vector to
sum_of_file_sizes / sizeof(float). You should then get reasonable
performance reading in the floats one at a time using buffered
I/O.

That's what I'm doing right now.

For each file, I read the header information into a temporary header
structure. The header tells me the amount of data in each channel. I
then reserve size() + newsamples for each vector, then push_back the
new readings from the temporary read buffer.
 
Rapscallion

Darsant said:
I'm currently reading 1-n binary files, each with 3 different arrays
of floats containing about 10,000 values apiece, for a total of about
30,000 values per file.

First you should measure (e.g. with clock()) where the time is spent.
Then consider alternatives.
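
For example, a minimal sketch:

#include <cstdio>
#include <ctime>

int main()
{
    std::clock_t start = std::clock();
    // ... code under test: open the files, fill the vectors, etc. ...
    std::clock_t stop = std::clock();
    std::printf("elapsed: %.0f ms\n",
                1000.0 * (stop - start) / CLOCKS_PER_SEC);
    return 0;
}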
 
wkaras

Darsant said:
That's what I'm doing right now.

For each file, I read the header information into a temporary header
structure. The header tells me the amount of data in each channel. I
then reserve size() + newsamples for each vector, then push_back the
new readings from the temporary read buffer.

In a typical implementation of the vector template, I think an
instance of vector<float> holds a private pointer to float. This
private pointer points to an array in dynamically allocated storage
whose number of (float) elements is greater than or equal to the value
returned by vector<float>::size(). What happens when you call
push_back and the number of floats in the dynamic array is exactly
equal to size()? push_back will:

1) dynamically allocate another array of floats whose dimension
is size() + Delta (Delta being a positive integral constant chosen
by the STL implementation).

2) copy the old array of floats into the newly allocated array of
floats.

3) copy the float being pushed_back to the entry at offset size()
in the newly allocated array, and set size() to size()+1.

4) free the old array of floats, and set the private pointer to the
new array of floats.

The resize() member function does something similar, except that the
caller controls the new size of the private dynamically allocated
array. That is why it helps a lot to get the sizes of all the files,
add them up, resize the vector to the total size, and then copy the
floats in one at a time. You avoid a lot of repeated, unnecessary heap
allocation (which can be slow) and copying of the private array.

To put it in high level terms, I think that the Standard says that
the worst-case time complexity of both resize and push_back is
O(n). So it makes sense to do one resize rather than n push_backs.
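
In outline, the suggested pattern (read_file_into is a hypothetical
helper that reads one file's floats to a given position):

#include <cstddef>
#include <vector>

// One allocation up front instead of n push_backs: sum the sample
// counts from the headers, resize once, then fill by offset.
std::vector<float> load_all(const std::vector<std::size_t>& counts)
{
    std::size_t total = 0;
    for (std::size_t i = 0; i < counts.size(); ++i)
        total += counts[i];

    std::vector<float> v(total);              // single allocation
    std::size_t offset = 0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        // read_file_into(i, &v[offset], counts[i]);
        offset += counts[i];
    }
    return v;
}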
 
Alan Johnson

wkaras said:
In a typical implementation of the vector template, I think an
instance of vector<float> holds a private pointer to float. This
private pointer points to an array in dynamically allocated storage
whose number of (float) elements is greater than or equal to the value
returned by vector<float>::size(). What happens when you call
push_back and the number of floats in the dynamic array is exactly
equal to size()? push_back will:

1) dynamically allocate another array of floats whose dimension
is size() + Delta (Delta being a positive integral constant chosen
by the STL implementation).

2) copy the old array of floats into the newly allocated array of
floats.

3) copy the float being pushed_back to the entry at offset size()
in the newly allocated array, and set size() to size()+1.

4) free the old array of floats, and set the private pointer to the
new array of floats.

The resize() member function does something similar, except that the
caller controls the new size of the private dynamically allocated
array. That is why it helps a lot to get the sizes of all the files,
add them up, resize the vector to the total size, and then copy the
floats in one at a time. You avoid a lot of repeated, unnecessary heap
allocation (which can be slow) and copying of the private array.

To put it in high level terms, I think that the Standard says that
the worst-case time complexity of both resize and push_back is
O(n). So it makes sense to do one resize rather than n push_backs.

23.2.4.1: A vector is a kind of sequence that supports random access
iterators. In addition, it supports (amortized) *constant time* insert
and erase operations at the end;

(emphasis mine)


If what you call "Delta" is a constant, then your performance for
inserting n items at the end of the vector will be bounded by O(n^2)
in total, i.e. O(n) amortized for each insert.

A better solution (which I imagine most STL implementations use) is to
double the capacity of the vector every time you run out of space (and
halve the capacity when the size drops to 1/4). This is provably
bounded by O(n) for n insertions at the end, i.e. amortized O(1) per
insertion.
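
To see the bound: with doubling, the last reallocation copies at most
n/2 elements, the one before it n/4, and so on, so all the copying
sums to n/2 + n/4 + ... < n. Add the n writes of the elements
themselves and you get fewer than 2n element operations for n
insertions, i.e. O(1) amortized per insertion. With a constant Delta,
every Delta-th insertion copies the whole array, which sums to roughly
Delta + 2*Delta + ... + n, about n^2/(2*Delta) copies.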

-Alan
 
Darsant

Thanks for everyone's help.

I've got the loading procedure in my class working fairly efficiently
now (4375 ms per 342,000 floats).

Question: If I am reading into a buffer, can I read that buffer of
floats directly into my vector, or is the memory management different,
so that I have to manually push_back the buffer's contents into the
vector? It's more of a curiosity question that might save me a few
hundred ms.
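
Something like this is what I have in mind: resize first, then read
straight into the vector's contiguous storage, skipping the temporary
buffer (a sketch):

#include <cstddef>
#include <cstdio>
#include <vector>

// Grow the vector, then fread directly into its storage.
void append_floats(std::FILE* f, std::vector<float>& v, std::size_t n)
{
    std::size_t old_size = v.size();
    v.resize(old_size + n);
    std::fread(&v[old_size], sizeof(float), n, f);   // in-place read
}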

Unfortunately, due to the processing algorithm the data is passed off
to (not under my control), I have to create a temporary array and pass
that array to the algorithm from the vector. Allocating the array
takes roughly 1000 ms and copying takes another 688 ms. Due to a
glitch in Visual C++, I can't take a reference to the constant vector
returned by my class (it converts pointers to a _w64 vector type
instead of just a pointer to a vector), so I also have to copy the
vector.

Anyway, thanks again for everyone's help!
 
wkaras

Alan said:

23.2.4.1: A vector is a kind of sequence that supports random access
iterators. In addition, it supports (amortized) *constant time* insert
and erase operations at the end;

(emphasis mine)

If what you call "Delta" is a constant, then your performance for
inserting n items at the end of the vector will be bounded by O(n^2)
in total, i.e. O(n) amortized for each insert.
A better solution (which I imagine most STL implementations use) is to
double the capacity of the vector every time you run out of space (and
halve the capacity when the size drops to 1/4). This is provably
bounded by O(n) for n insertions at the end, i.e. amortized O(1) per
insertion.

Well, now, they don't really clarify *what* to amortize over, do they?
But I think you're right: a constant delta is probably not a good
idea. I think your analysis applies to increasing the array size by
multiplying by any constant K > 1. I'd guess a good implementation
would use a value of K closer to 1 than 2.

Maybe another way to implement a vector<T> is as an array of pointers
to arrays of T whose dimension is some power of 2. Internal accessing
of a vector element would then look like
a[i >> K][i & ((1 << K) - 1)]. No copying of elements would be needed
on size changes.
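
A bare-bones, grow-only sketch of that idea (block size 2^K elements;
copying, erase, and error handling omitted):

#include <cstddef>
#include <vector>

// Two-level layout: a top array of pointers to fixed blocks of 2^K
// elements. Growing appends a block; existing elements never move.
template <class T, unsigned K>
class chunked_vector
{
    std::vector<T*> blocks_;   // pointers to the fixed-size blocks
    std::size_t size_;
public:
    chunked_vector() : size_(0) {}
    ~chunked_vector()
    {
        for (std::size_t b = 0; b < blocks_.size(); ++b)
            delete [] blocks_[b];
    }
    void push_back(const T& x)
    {
        if ((size_ & ((std::size_t(1) << K) - 1)) == 0)  // block full?
            blocks_.push_back(new T[std::size_t(1) << K]);
        blocks_[size_ >> K][size_ & ((std::size_t(1) << K) - 1)] = x;
        ++size_;
    }
    T& operator[](std::size_t i)
    {
        return blocks_[i >> K][i & ((std::size_t(1) << K) - 1)];
    }
    std::size_t size() const { return size_; }
};

This is essentially the layout std::deque implementations commonly
use: growth allocates one new block rather than relocating existing
elements, at the cost of a second indirection per access.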
 
