Using reserved space in a vector defined?

Fred Zwarts

I have an application in which data from one place needs to be copied to another place.
These places can be network sockets, files on disk, etc.
In all cases the data consists of unformatted unsigned 32-bit integers.
Each record starts with a 32-bit integer with the record size.
Then the record with the data follows.
Then another count follows, or the end of the data.
The records usually have up to a few thousand integers,
but occasionally they can be very much larger.
For efficiency, the records are read as a block.
Reading the data elements one by one would significantly reduce the I/O performance.

The application uses a vector<uint32_t> as a buffer for reading and writing the data.
First it reads the count.
Then it can do two things, either resize the vector if it is too small,
or increase the capacity of the vector with the reserve function.
Then it reads the record into the vector.
Then it writes the count and writes the data.

The question is about resizing the buffer.
Using a resize has the disadvantage that all new elements of the vector are initialized,
which is an unnecessary operation, because these values will be overwritten immediately
by the subsequent read operation.
Alternatively, the vector size is kept at 1, and only the capacity of the vector is increased.
The address of the first element of the vector and the size of the record is supplied
to the read function and after that to the write function.
This means that elements beyond the vector's size, in its reserved capacity, are used as the buffer.
However, the standard says that all data elements are contiguous in memory,
so the reserved space must be contiguous with them as well.
So, logically, there should be no problem in using the reserved space this way.

Is there a reason to think that there are environments where this does not work?
Is there another way to resize the buffer, without initializing all new elements?
 
Gert-Jan de Vos

Is this initialization a bottleneck in the application?

If you don't want initialization, then you probably shouldn't use
vector. From the problem description, it should be pretty
straightforward to write a class that does what you need, without the
heavyweight features that vector provides.

I had cases where vector's initialization was a bottleneck. I changed
the buffer to
boost::scoped_array<T> buffer(new T[size]);
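
In the same spirit, here is a minimal sketch using std::unique_ptr<T[]> (C++11) instead of boost::scoped_array. The class name RawBuffer and its member ensure are hypothetical, not something from this thread. The point is that new T[n] default-initializes, so for uint32_t no values are written before the read fills the buffer.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical buffer type: owns an uninitialized array and regrows on demand.
class RawBuffer {
public:
    // Ensures room for at least `count` elements and returns the storage.
    // Existing contents are NOT preserved when the buffer grows.
    std::uint32_t* ensure(std::size_t count) {
        if (count > capacity_) {
            // new T[n] default-initializes; for uint32_t nothing is written.
            data_.reset(new std::uint32_t[count]);
            capacity_ = count;
        }
        return data_.get();
    }

    std::size_t capacity() const { return capacity_; }

private:
    std::unique_ptr<std::uint32_t[]> data_;
    std::size_t capacity_ = 0;
};
```

The pointer returned by ensure(count) can be handed to the read and write functions together with the record size. The elements must be written before they are read, since their initial values are indeterminate.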
 
AnonMail2005

Fred Zwarts wrote:
Is there a reason to think that there are environments where this does not work?
Is there another way to resize the buffer, without initializing all new elements?

I think you're fine - especially since you already have a size of
one. Any compliant C++ standard library should work.

But if you're using vector *just* for the memory management, you may
want to read Gert-Jan de Vos' reply above.

HTH
 
James Kanze

Fred Zwarts wrote:
It will work fine in all compliant environments.

It's undefined behavior. Although in practice, it's difficult
to imagine an implementation where it wouldn't work, there's
certainly no guarantee that it will work, and there is a
guarantee that anyone reading the code will be thoroughly
confused.
 
James Kanze

Read through the archive (if Google's is working well enough
to do so). The problem is technical (only access via port 80)
not monetary.

But that will soon change:). I'm in the midst of a total
reorganization of my environment, and I will have a correct news
access in the end. (But getting a comfortable development
environment on my new machine has precedence. And since it is
purely Windows, that's not an easy issue to solve.)
 
Alf P. Steinbach

* James Kanze:
But that will soon change:). I'm in the midst of a total
reorganization of my environment, and I will have a correct news
access in the end. (But getting a comfortable development
environment on my new machine has precedence. And since it is
purely Windows, that's not an easy issue to solve.)

Hiya James. I know a thing or two about Windows. But perhaps not about the tools
you're using -- I've been out of the loop for some years. Anyway, for
Windows-specific issues I'd be glad to help (if I can).


Cheers,

- Alf
 
James Kanze

* James Kanze:

[...]
Hiya James. I know a thing or two about Windows. But perhaps
not about the tools you're using -- I've been out of the loop
for some years. Anyway, for Windows-specific issues I'd be
glad to help (if I can).

I already owe you a lot for your help some time back, but where
I'm now working, everyone is more or less up to speed on
Windows; I've not yet reached the point where I'm asking
questions they can't answer. And, of course, I've installed
Cygwin on my machine, which means that there are a lot of little
things that I can do faster than they can:).
 
James Kanze

"Undefined behavior" means that the compiler can add boundary
checks to the vector indexing, which some compilers do in
debug mode. Thus the program will fail if you try to index
out-of-bounds, even if it would be on reserved space. Thus you
cannot trust that the trick will work with all compilers in
all configurations.

If I understood correctly, however, he's using the [] operator
on the address returned by &v[0]. I'm not too sure of the
standard here; I sort of think that such checks would have to
involve the underlying allocated memory, and not what vector
knows about it. (On the other hand, vector is free to do
whatever it wants with that underlying memory, e.g. overwrite it
with nonsense patterns in every function. I just can't imagine
an implementation which does, however.)
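
To illustrate the size/capacity distinction with the one bounds check the standard itself requires, here is a small sketch around vector::at(). The helper name at_rejects is hypothetical. Checks on operator[] in debug builds are implementation-specific and are not shown here.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper: reports whether v.at(index) throws std::out_of_range.
bool at_rejects(const std::vector<std::uint32_t>& v, std::size_t index) {
    try {
        (void)v.at(index);  // checked access: compares index against size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

With a vector of size 1 on which reserve(100) has been called, at_rejects(v, 0) is false, while at_rejects(v, 1) and at_rejects(v, 50) are true. As far as the vector's interface is concerned, the reserved space holds no elements, whatever the underlying memory looks like.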
 
Daniel Pitts

James said:
Read through the archive (if Google's is working well enough to do
so). The problem is technical (only access via port 80) not monetary.

Ah, that is frustrating indeed. Though, newsrazor has a server which
listens on port 80, so it may still be a possibility:
<http://www.newsrazor.net/faq.php#ServerSettings>

I use the standard settings, so I'm not sure how well it works.
 
Ian Collins

James said:
I already owe you a lot for your help some time back, but where
I'm now working, everyone is more or less up to speed on
Windows; I've not yet reached the point where I'm asking
questions they can't answer. And, of course, I've installed
Cygwin on my machine, which means that there are a lot of little
things that I can do faster than they can:).

Install VirtualBox, load a copy of your OS of choice, configure some
shared folders and you will soon be back to normal!
 
