Refactoring question

Brian · Dec 15, 2009

The following functions come from this file --
http://webEbenezer.net/Buffer.hh.

void
Receive(const void* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}

memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

void
Put(unsigned char byte)
{
if (index_ >= bufsize_) {
SendStoredData();
}

buf_[index_] = byte;
++index_;
}

void
Receive(uint8_t value)
{
Put(value);
}

void
Receive(uint16_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
} else {
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}
}

void
Receive(uint32_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 24) & 0xFF );
} else {
Put( (value >> 24) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}
}

void
Receive(int8_t value)
{
uint8_t tmp = value;
Receive(tmp);
}

void
Receive(int16_t value)
{
uint16_t tmp = value;
Receive(tmp);
}

void
Receive(int32_t value)
{
uint32_t tmp = value;
Receive(tmp);
}

The link that I mentioned also has Receive functions for
uint64_t and int64_t. I was happy with those functions
until I remembered that I aim to support an optimization
when basic (numeric) data is stored contiguously in a
std::vector, Boost Array or regular array. To support
that I added the following function:

template <typename T>
void
ReceiveBlock(T const* data, unsigned int elements)
{
if (thisFormat_ == otherFormat_) {
Receive(data, elements * sizeof(T));
} else {
for (unsigned int ii = 0; ii < elements; ++ii) {
Receive(*(data + ii));
}
}
}

If for example, a vector<uint16_t> is going to be marshalled,
some code like this will be produced, where abt1 is a
reference to a const vector<uint16_t>:

// headCount is set to the number of elements in the
// vector and is marshalled first.
// Then the elements are marshalled with:

buf->ReceiveBlock(&(*abt1.begin()), headCount);

The member variables thisFormat_ and otherFormat_ used
in ReceiveBlock might be better named thisByteOrder_ and
otherByteOrder_. If ReceiveBlock determines that the
byte orders are different, it is going to call a Receive
function which is once again going to check the two
byte orders. I'd like to rework this code so that check
isn't performed unnecessarily. One idea I had was to
overload the Put function for uint16_t, uint32_t and
uint64_t. For some reason I'm not really comfortable
with that, (I'm not sure if some compilers may do
things a little differently with the shifting and &'ing and
match a function I'm not expecting.) although I'm fine
with Receive being overloaded like it is. I'd like to get
some suggestions on this and any of the above code.
Thanks in advance.

Brian Wood
http://webEbenezer.net

James Kanze · Dec 15, 2009

The following functions come from this file
--http://webEbenezer.net/Buffer.hh.

void
Receive(const void* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}

memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

void
Put(unsigned char byte)
{
if (index_ >= bufsize_) {
SendStoredData();
}

buf_[index_] = byte;
++index_;
}

void
Receive(uint8_t value)
{
Put(value);
}

I'm a little bit confused by your names. Are you receiving
(getting) or sending (putting)?

void
Receive(uint16_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
} else {
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}
}

I'd expose a little bit more of the buffer details here,
perhaps with a "reserve" function, so that you don't have to
test if there's place for each byte. Note, in fact, that your
bufferization is different if the formats are the same. Which
do you want: to ensure that both bytes are in the same buffer,
or that the buffer is as full as possible?

Having decided the buffering strategy, there's no point in the
first if. It just adds complexity, for no gain.

And I'd definitly call this function Send, and not Receive.

void
Receive(uint32_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 24) & 0xFF );
} else {
Put( (value >> 24) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}
}

Same comments as above.

void
Receive(int8_t value)
{
uint8_t tmp = value;
Receive(tmp);
}

void
Receive(int16_t value)
{
uint16_t tmp = value;
Receive(tmp);
}

void
Receive(int32_t value)
{
uint32_t tmp = value;
Receive(tmp);
}

The link that I mentioned also has Receive functions for
uint64_t and int64_t. I was happy with those functions until
I remembered that I aim to support an optimization when basic
(numeric) data is stored contiguously in a std::vector, Boost
Array or regular array. To support that I added the following
function:

template <typename T>
void
ReceiveBlock(T const* data, unsigned int elements)
{
if (thisFormat_ == otherFormat_) {
Receive(data, elements * sizeof(T));
} else {
for (unsigned int ii = 0; ii < elements; ++ii) {
Receive(*(data + ii));
}
}
}

Again, the two branches of the if implement different buffering
strategies. Before going any further, I think you have to
define this.

If for example, a vector<uint16_t> is going to be marshalled,
some code like this will be produced, where abt1 is a
reference to a const vector<uint16_t>:

// headCount is set to the number of elements in the
// vector and is marshalled first.
// Then the elements are marshalled with:

buf->ReceiveBlock(&(*abt1.begin()), headCount);

The member variables thisFormat_ and otherFormat_ used in
ReceiveBlock might be better named thisByteOrder_ and
otherByteOrder_. If ReceiveBlock determines that the byte
orders are different, it is going to call a Receive function
which is once again going to check the two byte orders. I'd
like to rework this code so that check isn't performed
unnecessarily.

Any decisions concerning data format should be negotiated in the
connection protocol. They shouldn't be evaluated on the fly,
nor change during a connection.

One idea I had was to overload the Put function for uint16_t,
uint32_t and uint64_t. For some reason I'm not really
comfortable with that, (I'm not sure if some compilers may do
things a little differently with the shifting and &'ing and
match a function I'm not expecting.)

If you're worried about which function is chosen by overload
resolution, you can always add a cast to ensure an exact match.
But if you're worried about which function is chosen by overload
resolution, the functions probably shouldn't have the same name
to begin with.

although I'm fine with Receive being overloaded like it is.
I'd like to get some suggestions on this and any of the above
code.

I think you have to define a higher level protocol to begin
with. (And although it's possible, and I've seen at least some
formats which do so, I'm not convinced that there's any
advantage in supported different representations.)

Brian · Dec 15, 2009

I'm a little bit confused by your names. Are you receiving
(getting) or sending (putting)?

I'd expose a little bit more of the buffer details here,
perhaps with a "reserve" function, so that you don't have to
test if there's place for each byte. Note, in fact, that your
bufferization is different if the formats are the same.

That's true, but I'm not sure it matters.

Which
do you want: to ensure that both bytes are in the same buffer,
or that the buffer is as full as possible?

I guess adding a Reserve function would be one way to
address this. I'm not sure the buffering has to be uniform,
but perhaps a Reserve function would be useful in avoiding
the check of overflow with each byte.

Having decided the buffering strategy, there's no point in the
first if. It just adds complexity, for no gain.

I think that if statement helps performance-wise.
Here's some text from my site:

"With -O3 and 500,000 elements, the Boost Serialization version
is between 2.3 and 2.8 times slower than the Ebenezer version...
When no formatting is needed because the reader has the same
format as the writer, the numbers are near the high end, (2.8),
of the range. When formatting is needed because the two
machines have different byte orders, the numbers are near the
low end of the range." That is found on this page --
http://webEbenezer.net/comparison.html.

The test I'm describing is writing data to the hard drive.
I pretend in the one case that the formats are different and
measure the time both ways.

And I'd definitly call this function Send, and not Receive.

I picked up the term "put" from some your previous posts I
think. Well, the buffer is receiving data in order to send
it. That's why I call it Receive. I kind of lazily added
the Put function without thinking about how it fits with
the bigger picture. One previous co-worker would probably
ask if I'm thinking about things object orientedly or
stream orientedly. I aim to be object oriented in this.
At a higher level, messages are sent and received. And
when a message is sent that is implemented by calling the
buffer's Receive functions.

Same comments as above.

Again, the two branches of the if implement different buffering
strategies. Before going any further, I think you have to
define this.

I know, but so what? The second form is needed
for correctness and both forms put the same amount
of information onto the stream and the ints themselves
are in the same sequence (but with different byte order)
in either case. In "Effective TCP/IP Programming" it
says, "There is no such thing as a 'packet' for a TCP
application. An application with a design that depends
in any way on how TCP packetizes the data needs to be
rethought." The stream has to be correct and in both
cases I think it is. If I didn't perceive some utility
to a Reserve function, my view would be that that's just
the way it is. In this case though, a Reserve function
might try to reserve more bytes than are available in the
buffer, so it is kind of a headache. The purpose of the
buffer is defeated by being overly concerned with making
things identical in every situation. My view is to just
use the reserve idea at the lower level when 2, 4, or 8
bytes are being handled and let this be potentially
different. Someone could create a buffer with 5 bytes
in it, but that also defeats the purpose of having a
buffer. I could check in the constructor that the size
is at least 8.

Any decisions concerning data format should be negotiated in the
connection protocol. They shouldn't be evaluated on the fly,
nor change during a connection.

The decisions are determined in the connection protocol.
I've thought about making this a template parameter, but
didn't like what I was coming up with when I looked at
that. The thisFormat_ isn't changable after the buffer
is constructed. The otherFormat_ is but that is to permit
the same buffer to be used to handle requests from both
little and big endian users. The way things are set up
now, it is possible that a programming error could lead
to incorrectly setting the otherFormat_ in the middle of
a connection, but that doesn't seem like a likely problem
to me.

If you're worried about which function is chosen by overload
resolution, you can always add a cast to ensure an exact match.
But if you're worried about which function is chosen by overload
resolution, the functions probably shouldn't have the same name
to begin with.

I think you have to define a higher level protocol to begin
with. (And although it's possible, and I've seen at least some
formats which do so, I'm not convinced that there's any
advantage in supported different representations.)

I've no idea what you mean by that last sentence.

Brian Wood
http://webEbenezer.net

Maxim Yegorushkin · Dec 16, 2009

The following functions come from this file --
http://webEbenezer.net/Buffer.hh.

void
Receive(const void* data, unsigned int dlen)
{
if (dlen> bufsize_ - index_) {
SendStoredData();

if (dlen> bufsize_) {
PersistentWrite(data, dlen);
return;
}
}

memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

void
Put(unsigned char byte)
{
if (index_>= bufsize_) {
SendStoredData();
}

buf_[index_] = byte;
++index_;
}

void
Receive(uint8_t value)
{
Put(value);
}

void
Receive(uint16_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value )& 0xFF );
Put( (value>> 8)& 0xFF );
} else {
Put( (value>> 8)& 0xFF );
Put( (value )& 0xFF );
}
}
}

void
Receive(uint32_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value )& 0xFF );
Put( (value>> 8)& 0xFF );
Put( (value>> 16)& 0xFF );
Put( (value>> 24)& 0xFF );
} else {
Put( (value>> 24)& 0xFF );
Put( (value>> 16)& 0xFF );
Put( (value>> 8)& 0xFF );
Put( (value )& 0xFF );
}
}
}

void
Receive(int8_t value)
{
uint8_t tmp = value;
Receive(tmp);
}

void
Receive(int16_t value)
{
uint16_t tmp = value;
Receive(tmp);
}

void
Receive(int32_t value)
{
uint32_t tmp = value;
Receive(tmp);
}

All integer byte order issues can be handled by one function. Say, if
your binary byte order is little-endian, so that you only reverse
integers on big-endian architectures:

#include <endian.h>

// generic binary write function
void PutBuffer(void const* buf, size_t buf_size);

template<class Integer>
void PutInteger(Integer value)
{
char* p = reinterpret_cast<char*>(&value);
#if __BYTE_ORDER != __LITTLE_ENDIAN
std::reverse(p, p + sizeof value);
#endif
PutBuffer(p, sizeof value);
}

Just make sure you don't call PutInteger on non-integers.

Brian · Dec 16, 2009

That's true, but I'm not sure it matters.

I guess adding a Reserve function would be one way to
address this. I'm not sure the buffering has to be uniform,
but perhaps a Reserve function would be useful in avoiding
the check of overflow with each byte.

I think that if statement helps performance-wise.
Here's some text from my site:

"With -O3 and 500,000 elements, the Boost Serialization version
is between 2.3 and 2.8 times slower than the Ebenezer version...
When no formatting is needed because the reader has the same
format as the writer, the numbers are near the high end, (2.8),
of the range. When formatting is needed because the two
machines have different byte orders, the numbers are near the
low end of the range." That is found on this page --http://webEbenezer..net/comparison.html.

The test I'm describing is writing data to the hard drive.
I pretend in the one case that the formats are different and
measure the time both ways.

I picked up the term "put" from some your previous posts I
think. Well, the buffer is receiving data in order to send
it. That's why I call it Receive. I kind of lazily added
the Put function without thinking about how it fits with
the bigger picture. One previous co-worker would probably
ask if I'm thinking about things object orientedly or
stream orientedly. I aim to be object oriented in this.
At a higher level, messages are sent and received. And
when a message is sent that is implemented by calling the
buffer's Receive functions.

I know, but so what? The second form is needed
for correctness and both forms put the same amount
of information onto the stream and the ints themselves
are in the same sequence (but with different byte order)
in either case. In "Effective TCP/IP Programming" it
says, "There is no such thing as a 'packet' for a TCP
application. An application with a design that depends
in any way on how TCP packetizes the data needs to be
rethought." The stream has to be correct and in both
cases I think it is. If I didn't perceive some utility
to a Reserve function, my view would be that that's just
the way it is. In this case though, a Reserve function
might try to reserve more bytes than are available in the
buffer, so it is kind of a headache. The purpose of the
buffer is defeated by being overly concerned with making
things identical in every situation. My view is to just
use the reserve idea at the lower level when 2, 4, or 8
bytes are being handled and let this be potentially
different. Someone could create a buffer with 5 bytes
in it, but that also defeats the purpose of having a
buffer. I could check in the constructor that the size
is at least 8.

void
Put(unsigned char byte)
{
buf_[index_] = byte;
++index_;
}

void
Reserve(unsigned int size)
{
if (size > bufsize_ - index_) {
SendStoredData();
}
}

void
Receive(uint32_t value)
{
Reserve(sizeof value);
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 24) & 0xFF );
} else {
Put( (value >> 24) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}

In the ctor I make sure the bufsize is at least 8.
This approach is 15 to 20% slower in a test that writes
to a hard drive than what I posted originally when
thisFormat_ == otherFormat_. Let me know if it is
not what you meant. For now though, I am doubtful
that reserving is going to help.

Brian Wood
http://webEbenezer.net

James Kanze · Dec 16, 2009

That's true, but I'm not sure it matters.

It depends on what you're doing with the data later. If you're
writing
it to disk, it doesn't matter. If you're sending it in packets on the
line, it depends on the protocol involved, but it could matter a lot.

I guess adding a Reserve function would be one way to address this.
I'm not sure the buffering has to be uniform, but perhaps a Reserve
function would be useful in avoiding the check of overflow with each
byte.

Or modify the "Receive" function to check after each byte (or to copy
all that fits, then a second copy for what's left). Or specify
clearly
that whether buffers are full or not isn't specified.

I think that if statement helps performance-wise.
Here's some text from my site:

"With -O3 and 500,000 elements, the Boost Serialization version is
between 2.3 and 2.8 times slower than the Ebenezer version... When no
formatting is needed because the reader has the same format as the
writer, the numbers are near the high end, (2.8), of the range. When
formatting is needed because the two machines have different byte
orders, the numbers are near the low end of the range." That is found
on this page --http://webEbenezer.net/comparison.html.

I'm not familiar with the Boost serialization, so I couldn't say. I
would hope that it achieves total portability, including support for
non-two's complement. Which does add to the cost.

The test I'm describing is writing data to the hard drive.
I pretend in the one case that the formats are different and
measure the time both ways.

I picked up the term "put" from some your previous posts I
think. Well, the buffer is receiving data in order to send
it. That's why I call it Receive.

The function is pushing data out, which is why I would call it Send.
(The client code calls it to send data, not to receive it.)

[...]

I know, but so what?

If you think it doesn't matter, then define it as unspecified. I'm
not
sure that I like the idea that an integer might span two buffers, but
whether it matters really depends on what happens with the buffers
later.

The second form is needed for correctness and both forms put the same
amount of information onto the stream and the ints themselves are in
the same sequence (but with different byte order) in either case. In
"Effective TCP/IP Programming" it says, "There is no such thing as a
'packet' for a TCP application. An application with a design that
depends in any way on how TCP packetizes the data needs to be
rethought."

That's TCP. Applications don't talk to one another in TCP; they use
some higher level protocol. (And of course, they may also use other
protocols, like UDP, for the lower level.)

[...]

The decisions are determined in the connection protocol. I've thought
about making this a template parameter, but didn't like what I was
coming up with when I looked at that. The thisFormat_ isn't changable
after the buffer is constructed. The otherFormat_ is but that is to
permit the same buffer to be used to handle requests from both little
and big endian users. The way things are set up now, it is possible
that a programming error could lead to incorrectly setting the
otherFormat_ in the middle of a connection, but that doesn't seem like
a likely problem to me.

Have you considered the possibility of using the strategy pattern.
It's
possible that a virtual function call could be cheaper than all your
if's, and the resulting code would certainly be more readable.

[...]

I've no idea what you mean by that last sentence.

The impression I have here (but I don't see the entire context---only
what you've posted) is that you're putting the cart before the horse.
Before writing a single line of code, you should specify the protocol,
exactly. From what you seem to be doing, you've got in mind a
protocol
which supports two different representations for integers, and at
least
two for floats. I've seen some protocols which do this, but I'm not
convinced there's any advantage in doing so. (I suspect that most
protocols which support several different representations do so for
historical reasons. They were initially disk based, and read and
wrote
a binary image. Only later, when they attempted to read an image read
on a different machine was the problem realized and addressed. In a
way
that didn't break any existing files.)

Brian · Dec 16, 2009

It depends on what you're doing with the data later. If you're
writing
it to disk, it doesn't matter. If you're sending it in packets on the
line, it depends on the protocol involved, but it could matter a lot.

Or modify the "Receive" function to check after each byte (or to copy
all that fits, then a second copy for what's left). Or specify
clearly
that whether buffers are full or not isn't specified.

I just tried a version of Receive that copies all that fits
and then does a second copy for the balance. The lines
marked with a plus sign are new and are the only thing
that changed.

void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
memcpy(buf_ + index_, data, bufsize_ - index_); // +
data += bufsize_ - index_; // +
dlen -= bufsize_ - index_; // +
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}

memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

The resulting executable is just 200 bytes more, but the time
is over 30% slower than without the change. I'm using a
buffer of size 4096 and the only thing going into the buffer
are 4 byte integers. I also tried it with this:
if (bufsize_ - index_ > 0) {

}

around the three added lines, but that didn't help.
I find this result disappointing as philosophically I could
persuade myself that always filling up buffers makes sense.

Perhaps I'll have one configuration for files and TCP and
another for UDP. Asking files and TCP to pay for making
UDP happy is unreasonable.

That's TCP. Applications don't talk to one another in TCP; they use
some higher level protocol. (And of course, they may also use other
protocols, like UDP, for the lower level.)

I want to support UDP, but it shouldn't be allowed to have such a
big role that it hinders the performance of other protocols.

[...]

The decisions are determined in the connection protocol. I've thought
about making this a template parameter, but didn't like what I was
coming up with when I looked at that. The thisFormat_ isn't changable
after the buffer is constructed. The otherFormat_ is but that is to
permit the same buffer to be used to handle requests from both little
and big endian users. The way things are set up now, it is possible
that a programming error could lead to incorrectly setting the
otherFormat_ in the middle of a connection, but that doesn't seem like
a likely problem to me.

Click to expand...

Have you considered the possibility of using the strategy pattern.
It's
possible that a virtual function call could be cheaper than all your
if's, and the resulting code would certainly be more readable.

Yes, and I agree it would be helpful. Maybe that is the
appropriate next step. I was thinking about it from reading
previous posts where you suggested it.

[...]

I've no idea what you mean by that last sentence.

Click to expand...

The impression I have here (but I don't see the entire context---only
what you've posted) is that you're putting the cart before the horse.
Before writing a single line of code, you should specify the protocol,
exactly.

I'm not that rigid (or omniscient). I like to have some fun
once in a while. Anyway, I'm happy with my approach to things.
It doesn't make perfect sense I guess, but eventually I get to
a good place.

Brian Wood
http://webEbenezer.net

Brian · Dec 17, 2009

The following functions come from this file
--http://webEbenezer.net/Buffer.hh.
void
Receive(const void* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
SendStoredData();
if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}
memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}
void
Put(unsigned char byte)
{
if (index_ >= bufsize_) {
SendStoredData();
}

Click to expand...

buf_[index_] = byte;
++index_;
}
void
Receive(uint8_t value)
{
Put(value);
}

Click to expand...

I'm a little bit confused by your names. Are you receiving
(getting) or sending (putting)?

void
Receive(uint16_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
} else {
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}
}

Click to expand...

I'd expose a little bit more of the buffer details here,
perhaps with a "reserve" function, so that you don't have to
test if there's place for each byte. Note, in fact, that your
bufferization is different if the formats are the same. Which
do you want: to ensure that both bytes are in the same buffer,
or that the buffer is as full as possible?

Having decided the buffering strategy, there's no point in the
first if. It just adds complexity, for no gain.

And I'd definitly call this function Send, and not Receive.

void
Receive(uint32_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
if (least_significant_first == otherFormat_) {
Put( (value ) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 24) & 0xFF );
} else {
Put( (value >> 24) & 0xFF );
Put( (value >> 16) & 0xFF );
Put( (value >> 8) & 0xFF );
Put( (value ) & 0xFF );
}
}
}

Click to expand...

Same comments as above.

void
Receive(int8_t value)
{
uint8_t tmp = value;
Receive(tmp);
}
void
Receive(int16_t value)
{
uint16_t tmp = value;
Receive(tmp);
}
void
Receive(int32_t value)
{
uint32_t tmp = value;
Receive(tmp);
}
The link that I mentioned also has Receive functions for
uint64_t and int64_t. I was happy with those functions until
I remembered that I aim to support an optimization when basic
(numeric) data is stored contiguously in a std::vector, Boost
Array or regular array. To support that I added the following
function:
template <typename T>
void
ReceiveBlock(T const* data, unsigned int elements)
{
if (thisFormat_ == otherFormat_) {
Receive(data, elements * sizeof(T));
} else {
for (unsigned int ii = 0; ii < elements; ++ii) {
Receive(*(data + ii));
}
}
}

Click to expand...

Again, the two branches of the if implement different buffering
strategies. Before going any further, I think you have to
define this.

If for example, a vector<uint16_t> is going to be marshalled,
some code like this will be produced, where abt1 is a
reference to a const vector<uint16_t>:
// headCount is set to the number of elements in the
// vector and is marshalled first.
// Then the elements are marshalled with:
buf->ReceiveBlock(&(*abt1.begin()), headCount);
The member variables thisFormat_ and otherFormat_ used in
ReceiveBlock might be better named thisByteOrder_ and
otherByteOrder_. If ReceiveBlock determines that the byte
orders are different, it is going to call a Receive function
which is once again going to check the two byte orders. I'd
like to rework this code so that check isn't performed
unnecessarily.

Click to expand...

Any decisions concerning data format should be negotiated in the
connection protocol. They shouldn't be evaluated on the fly,
nor change during a connection.

One idea I had was to overload the Put function for uint16_t,
uint32_t and uint64_t. For some reason I'm not really
comfortable with that, (I'm not sure if some compilers may do
things a little differently with the shifting and &'ing and
match a function I'm not expecting.)

Click to expand...

If you're worried about which function is chosen by overload
resolution, you can always add a cast to ensure an exact match.
But if you're worried about which function is chosen by overload
resolution, the functions probably shouldn't have the same name
to begin with.

Here's what I've got now

void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
#ifdef UDP
memcpy(buf_ + index_, data, bufsize_ - index_);
data += bufsize_ - index_;
dlen -= bufsize_ - index_;
#endif
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}

memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

void
PutOne(unsigned char byte)
{
if (index_ >= bufsize_) {
SendStoredData();
}

buf_[index_] = byte;
++index_;
}

void
Put(uint8_t value)
{
PutOne(value);
}

void
Put(uint16_t value)
{
if (least_significant_first == otherFormat_) {
PutOne( (value ) & 0xFF );
PutOne( (value >> 8) & 0xFF );
} else {
PutOne( (value >> 8) & 0xFF );
PutOne( (value ) & 0xFF );
}
}

void
Put(uint32_t value)
{
if (least_significant_first == otherFormat_) {
PutOne( (value ) & 0xFF );
PutOne( (value >> 8) & 0xFF );
PutOne( (value >> 16) & 0xFF );
PutOne( (value >> 24) & 0xFF );
} else {
PutOne( (value >> 24) & 0xFF );
PutOne( (value >> 16) & 0xFF );
PutOne( (value >> 8) & 0xFF );
PutOne( (value ) & 0xFF );
}
}

void
Put(uint64_t value)
{
if (least_significant_first == otherFormat_) {
PutOne( (value ) & 0xFF );
PutOne( (value >> 8) & 0xFF );
PutOne( (value >> 16) & 0xFF );
PutOne( (value >> 24) & 0xFF );
PutOne( (value >> 32) & 0xFF );
PutOne( (value >> 40) & 0xFF );
PutOne( (value >> 48) & 0xFF );
PutOne( (value >> 56) & 0xFF );
} else {
PutOne( (value >> 56) & 0xFF );
PutOne( (value >> 48) & 0xFF );
PutOne( (value >> 40) & 0xFF );
PutOne( (value >> 32) & 0xFF );
PutOne( (value >> 24) & 0xFF );
PutOne( (value >> 16) & 0xFF );
PutOne( (value >> 8) & 0xFF );
PutOne( (value ) & 0xFF );
}
}

void
Put(int8_t value)
{
uint8_t tmp = value;
PutOne(tmp);
}

void
Put(int16_t value)
{
uint16_t tmp = value;
Put(tmp);
}

void
Put(int32_t value)
{
uint32_t tmp = value;
Put(tmp);
}

void
Put(int64_t value)
{
uint64_t tmp = value;
Put(tmp);
}

void
Receive(uint8_t value)
{
PutOne(value);
}

void
Receive(uint16_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
Put(value);
}
}

void
Receive(uint32_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
Put(value);
}
}

void
Receive(uint64_t value)
{
if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
Put(value);
}
}

void
Receive(int8_t value)
{
uint8_t tmp = value;
Receive(tmp);
}

void
Receive(int16_t value)
{
uint16_t tmp = value;
Receive(tmp);
}

void
Receive(int32_t value)
{
uint32_t tmp = value;
Receive(tmp);
}

void
Receive(int64_t value)
{
uint64_t tmp = value;
Receive(tmp);
}

template <typename T>
void
ReceiveBlock(T const* data, unsigned int elements)
{
if (thisFormat_ == otherFormat_) {
Receive(data, elements * sizeof(T));
} else {
for (unsigned int ii = 0; ii < elements; ++ii) {
Put(*(data + ii));
}
}
}

The Put's that take an int*_t are only called by RecieveBlock.
The Put's that take an uint*_t are called by either Receive
functions or ReceiveBlock. The code generator doesn't
directly call any of the Put functions.

The following:

if (thisFormat_ == otherFormat_) {
Receive(&value, sizeof(value));
} else {
Put(value);
}

is the whole of the function body for Receive
when the arg is uint16_t, uint32_t or uint64_t.
Would you make that a separate function or
possibly use a local macro?

Brian Wood
http://webEbenezer.net

James Kanze · Dec 17, 2009

[...]

I just tried a version of Receive that copies all that fits and then
does a second copy for the balance. The lines marked with a plus sign
are new and are the only thing that changed.

void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
memcpy(buf_ + index_, data, bufsize_ - index_); // +
data += bufsize_ - index_; // +
dlen -= bufsize_ - index_; // +
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}
memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

The resulting executable is just 200 bytes more, but the time
is over 30% slower than without the change.

That doesn't sound right. The difference is far too big.

But the real difference would be downstream: by filling every buffer
to
the maximum, you need less buffers. Which means that downstream,
there
are less buffers to handle.

I'm using a buffer of size 4096 and the only thing going into the
buffer are 4 byte integers. I also tried it with this:
if (bufsize_ - index_ > 0) {

around the three added lines, but that didn't help. I find this
result disappointing as philosophically I could persuade myself that
always filling up buffers makes sense.

Perhaps I'll have one configuration for files and TCP and another for
UDP. Asking files and TCP to pay for making UDP happy is
unreasonable.

I'm fairly sure that you're worrying about the wrong things, and that
the difference won't be significant in a real application.

I want to support UDP, but it shouldn't be allowed to have such a
big role that it hinders the performance of other protocols.

At the application level, it's neither TCP nor UDP, but an application
level protocol.

[...]

I think you have to define a higher level protocol to begin
with. (And although it's possible, and I've seen at least some
formats which do so, I'm not convinced that there's any
advantage in supported different representations.)
I've no idea what you mean by that last sentence.

Click to expand...

The impression I have here (but I don't see the entire
context---only what you've posted) is that you're putting the cart
before the horse. Before writing a single line of code, you should
specify the protocol, exactly.

Click to expand...

I'm not that rigid (or omniscient). I like to have some fun once in a
while. Anyway, I'm happy with my approach to things. It doesn't make
perfect sense I guess, but eventually I get to a good place.

If it's for fun, or even more or less a learning experience, fine.

Brian · Dec 17, 2009

[...]

I just tried a version of Receive that copies all that fits and then
does a second copy for the balance. The lines marked with a plus sign
are new and are the only thing that changed.
void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
memcpy(buf_ + index_, data, bufsize_ - index_); // +
data += bufsize_ - index_; // +
dlen -= bufsize_ - index_; // +
SendStoredData();

Click to expand...

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}
memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}
The resulting executable is just 200 bytes more, but the time
is over 30% slower than without the change.

Click to expand...

That doesn't sound right. The difference is far too big.

I have retested now on Linux and tested for the first time
on Windows. On Linux I still get this large difference.
I'll post links to the source if you like. The original
version takes around 16000 microseconds; the full-buffers
version around 24000 microseconds. The full-buffers
version gets a warning:
"warning: pointer of type ‘void *’ used in arithmetic"
that the original version doesn't get. I'm using gcc
4.4.2 with -O3 and -std=c++0x to build the tests.
The test is marshalling a list<int32_t> with 500,000
elements to the disk. The output file in both cases
is 2,000,004 bytes.

On Windows I didn't notice a performance difference
between the two versions. The two executables are
exactly the same size though. I'm not aware of "strip"
or "cmp" commands on Windows so am not 100% sure that
the two files are different.

But the real difference would be downstream: by filling every buffer
to
the maximum, you need less buffers. Which means that downstream,
there
are less buffers to handle.

Before adopting this approach I'd like to be sure that the
performance on Linux is not perturbed. I find the results
to be surprising, but not shocking.

I'm fairly sure that you're worrying about the wrong things, and that
the difference won't be significant in a real application.

I don't think so. As Aby Warburg put it, "G-d dwells in minutiae."
I want to know why the Linux version performs poorly.

http://webEbenezer.net

Brian · Dec 17, 2009

[...]

Which do you want: to ensure that both bytes are in the same
buffer, or that the buffer is as full as possible?
I guess adding a Reserve function would be one way to address
this. I'm not sure the buffering has to be uniform, but perhaps a
Reserve function would be useful in avoiding the check of overflow
with each byte.
Or modify the "Receive" function to check after each byte (or to
copy all that fits, then a second copy for what's left). Or specify
clearly that whether buffers are full or not isn't specified.
I just tried a version of Receive that copies all that fits and then
does a second copy for the balance. The lines marked with a plus sign
are new and are the only thing that changed.
void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
memcpy(buf_ + index_, data, bufsize_ - index_); // +
data += bufsize_ - index_; // +
dlen -= bufsize_ - index_; // +
SendStoredData();
if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}
memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}
The resulting executable is just 200 bytes more, but the time
is over 30% slower than without the change.

Click to expand...

Click to expand...

That doesn't sound right. The difference is far too big.

Click to expand...

I have retested now on Linux and tested for the first time
on Windows. On Linux I still get this large difference.
I'll post links to the source if you like. The original
version takes around 16000 microseconds; the full-buffers
version around 24000 microseconds. The full-buffers
version gets a warning:
"warning: pointer of type ‘void *’ used in arithmetic"
that the original version doesn't get.

I added a variable to eliminate the warning:

unsigned char const* d2 = static_cast<unsigned char const*>
(data);
if (dlen > bufsize_ - index_) {
memcpy(buf_ + index_, d2, bufsize_ - index_);
d2 += bufsize_ - index_;
dlen -= bufsize_ - index_;
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(d2, dlen);
return;
}
}

memcpy(buf_ + index_, d2, dlen);
index_ += dlen;

-----------

But that doesn't do anything for the performance
of that version; it's still slooooow.

http://webEbenezer.net

James Kanze · Dec 18, 2009

On 15 Dec, 23:21, Brian <[email protected]> wrote:

Click to expand...

[...]

Which do you want: to ensure that both bytes are in the same
buffer, or that the buffer is as full as possible?
I guess adding a Reserve function would be one way to address
this. I'm not sure the buffering has to be uniform, but
perhaps a Reserve function would be useful in avoiding the
check of overflow with each byte.
Or modify the "Receive" function to check after each byte (or to
copy all that fits, then a second copy for what's left). Or
specify clearly that whether buffers are full or not isn't
specified.
I just tried a version of Receive that copies all that fits and
then does a second copy for the balance. The lines marked with a
plus sign are new and are the only thing that changed.
void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
memcpy(buf_ + index_, data, bufsize_ - index_); // +
data += bufsize_ - index_; // +
dlen -= bufsize_ - index_; // +
SendStoredData();
if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}
memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}
The resulting executable is just 200 bytes more, but the time
is over 30% slower than without the change.

Click to expand...

That doesn't sound right. The difference is far too big.

Click to expand...

I have retested now on Linux and tested for the first time on Windows.
On Linux I still get this large difference. I'll post links to the
source if you like. The original version takes around 16000
microseconds; the full-buffers version around 24000 microseconds. The
full-buffers version gets a warning:
"warning: pointer of type ‘void *’ used in arithmetic"
that the original version doesn't get.

That should be an error. Arithmetic on void* is illegal.

I'm using gcc 4.4.2 with -O3 and -std=c++0x to build the tests. The
test is marshalling a list<int32_t> with 500,000 elements to the disk.
The output file in both cases is 2,000,004 bytes.

On Windows I didn't notice a performance difference between the two
versions. The two executables are exactly the same size though. I'm
not aware of "strip" or "cmp" commands on Windows so am not 100% sure
that the two files are different.

For starters, you might be measuring differences in the way the system
caches file data. I suspect that for any larger amount of data
(enough
to significantly overflow all system caches), file access (or serial
line access) will overwhelm all timing measures. (The last time I
measured, on a Sun Sparc, a hard write to disk cost about 10 ms. And
after a couple of meg of output, all output ended up having to wait
for
a hard write. From what other people have told me, PC's are
significantly slower in this regard, and Windows buffers a lot less in
the system than Linux. But both of those statements are just hearsay,
at least coming from me.)

You have raised an interesting point, however, and if I can find the
time to get my benchmark harnesses up and running on the machines I
now
have access to (none of which I had access to three months ago), I'll
give it a try. For basic algorithms, each copying a buffer of 1024
int
(just to have something comparable): memcpy, std::copy, a simple hand
written loop, and a loop using shifting to portably ensure byte order
(and why not, std::transform with a unary function doing the
shifting).
This should give a good idea as to how well the compilers handle such
cases (because it really is an issue of compiler optimization).
Whether
any differences are important in real life, of course, depends on the
magnitude of the differences, and what else you're doing with the
data:
as I said, if you're writing really larger amounts to disk (or the
application requires synchronous writes), even the worst case of the
above probably represents less than 1% of the time.

Before adopting this approach I'd like to be sure that the
performance on Linux is not perturbed. I find the results
to be surprising, but not shocking.

I don't think so. As Aby Warburg put it, "G-d dwells in minutiae."
I want to know why the Linux version performs poorly.

That quote has been attributed to many different people

. In this
case, it's really a question of which details are important. You've
not
explained what measures you're taking to ensure similar cache
behavior,
for example, and similar buffering in output.

But the fact that one implementation shows significantly different
performance is interesting, and probably worth pursuing.

Brian · Dec 18, 2009

The decisions are determined in the connection protocol.
I've thought about making this a template parameter, but
didn't like what I was coming up with when I looked at
that. The thisFormat_ isn't changable after the buffer
is constructed. The otherFormat_ is but that is to permit
the same buffer to be used to handle requests from both
little and big endian users. The way things are set up
now, it is possible that a programming error could lead
to incorrectly setting the otherFormat_ in the middle of
a connection, but that doesn't seem like a likely problem
to me.

I've changed Buffer to be a class template that takes
a formatting type. There are three format types --
SameFormat, LeastSignificantFirst and
MostSignificantFirst. I've copied almost the whole
file here.

#ifndef Buffer_hh
#define Buffer_hh

#include <climits>
#if CHAR_BIT != 8
#error Only 8 bit char supported
#endif

#if defined(_MSC_VER) || defined(WIN32) || defined(_WIN32) || defined
(__WIN32__) || defined(__CYGWIN__)
#include <windows.h>
#else
#include <errno.h>
#include <sys/ioctl.h>
#include <unistd.h>
#endif

#include <string.h>

#include <algorithm>
#include <cstdint>
#include <iostream>
#include <ErrorWordsShepherd.hh>

uint8_t const least_significant_first = 0;
uint8_t const most_significant_first = 1;

inline
uint8_t
MachineByteOrder()
{
int i = 1;
char *p = (char *) &i;
if (1 == p[0]) { // Looks like little endian
return least_significant_first;
} else {
return most_significant_first; // probably big endian
}
}

template <typename W>
class Buffer;

class SameFormat
{
public:
template <typename U>
void
Write(Buffer<SameFormat>& buf, U data);

template <typename U>
void
WriteBlock(Buffer<SameFormat>& buf, U const* data, unsigned int
elements);
};

class LeastSignificantFirst
{
public:
inline void Write(Buffer<LeastSignificantFirst>& buf, uint16_t);
inline void Write(Buffer<LeastSignificantFirst>& buf, uint32_t);
inline void Write(Buffer<LeastSignificantFirst>& buf, uint64_t);

template <typename U>
inline void
WriteBlock(Buffer<LeastSignificantFirst>& buf,
U const* data, unsigned int elements);

inline void
WriteBlock(Buffer<LeastSignificantFirst>& buf,
uint8_t const* data, unsigned int elements);
inline void
WriteBlock(Buffer<LeastSignificantFirst>& buf,
int8_t const* data, unsigned int elements);
};

class MostSignificantFirst
{
public:
inline void Write(Buffer<MostSignificantFirst>& buf, uint16_t);
inline void Write(Buffer<MostSignificantFirst>& buf, uint32_t);
inline void Write(Buffer<MostSignificantFirst>& buf, uint64_t);

template <typename U>
inline void
WriteBlock(Buffer<MostSignificantFirst>& buf,
U const* data, unsigned int elements);

inline void
WriteBlock(Buffer<MostSignificantFirst>& buf,
uint8_t const* data, unsigned int elements);
inline void
WriteBlock(Buffer<MostSignificantFirst>& buf,
int8_t const* data, unsigned int elements);
};

template <typename W>
class Buffer
{
unsigned int bufsize_;
unsigned int vsize_;
unsigned int index_;
unsigned int recIndex_;
#if defined(_MSC_VER) || defined(WIN32) || defined(_WIN32) || defined
(__WIN32__) || defined(__CYGWIN__)
DWORD minBytesAvailable_;
#else
unsigned int minBytesAvailable_;
#endif
unsigned int bytesRemainingInMsg_;
unsigned char* buf_;
W writer_;

public:
#if defined(_MSC_VER) || defined(WIN32) || defined(_WIN32) || defined
(__WIN32__) || defined(__CYGWIN__)
#ifdef EE_FILEIO
HANDLE sock_;
#else
SOCKET sock_;
#endif
#else
int sock_;
#endif

explicit Buffer(unsigned int bufsize) :
bufsize_(bufsize), vsize_(0), index_(0), recIndex_(0),
minBytesAvailable_(0), bytesRemainingInMsg_(0)
{
if (bufsize_ < 8) {
throw failure("buffer size too small");
}
buf_ = new unsigned char[bufsize_];
}

~Buffer()
{
delete [] buf_;
}

#if defined(_MSC_VER) || defined(WIN32) || defined(_WIN32) || defined
(__WIN32__) || defined(__CYGWIN__)
void
PersistentWrite(void const* data, int len)
{
char* d2 = (char*) data;
#ifdef EE_FILEIO
DWORD bytesWritten = 0;
BOOL res = WriteFile(sock_, d2, len, &bytesWritten, NULL);
if (!res) {
throw failure("PersistentWrite -- WriteFile");
}
#else
int rc;
while ((rc = send(sock_, d2, len, 0)) != len) {
if (rc < 0) {
throw failure("PersistentWrite -- send");
}

len -= rc;
d2 += rc;
}
#endif
}

void
PersistentRead(void* data, int len)
{
char* d2 = (char*) data;
#ifdef EE_FILEIO
DWORD bytesRead = 0;
BOOL res = ReadFile(sock_, d2, len, &bytesRead, NULL);
if (!res) {
throw failure("PersistentRead -- ReadFile");
}
#else
int rc;
while ((rc = recv(sock_, d2, len, 0)) != len) {
if (rc < 0) {

} else {
if (rc == 0) {
}
len -= rc;
d2 += rc;
}
}
#endif
}

#else

void
PersistentWrite(void const* data, int len)
{
int rc;
unsigned char const* d2 = static_cast<unsigned char const*>
(data);
while ((rc = write(sock_, d2, len)) != len) {
if (rc == -1) {
throw failure("Buffer:

ersistentWrite -- write");
}

len -= rc;
d2 += rc;
}
}

void
PersistentRead(void* data, int len)
{
int rc;
unsigned char* d2 = static_cast<unsigned char*> (data);
while ((rc = read(sock_, d2, len)) != len) {
if (rc < 0) {
throw failure("Buffer:

ersistentRead-- read -- rc == -1");
}
else {
if (rc == 0) {
throw failure("Buffer:

ersistentRead-- read -- rc == 0");
}

len -= rc;
d2 += rc;
}
}
}
#endif

void
SendStoredData()
{
if (index_ > 0) {
PersistentWrite(buf_, index_);
index_ = 0;
}
}

void
Resize(unsigned int newsize)
{
if (newsize < index_) {
SendStoredData();
}

unsigned char* tmp = new unsigned char[newsize];
memcpy(tmp, buf_, index_);
delete [] buf_;
buf_ = tmp;
bufsize_ = newsize;
}

void
Receive(void const* data, unsigned int dlen)
{
if (dlen > bufsize_ - index_) {
#ifdef UDP
memcpy(buf_ + index_, data, bufsize_ - index_);
data += bufsize_ - index_;
dlen -= bufsize_ - index_;
#endif
SendStoredData();

if (dlen > bufsize_) {
PersistentWrite(data, dlen);
return;
}
}

memcpy(buf_ + index_, data, dlen);
index_ += dlen;
}

void
Put(unsigned char byte)
{
if (index_ >= bufsize_) {
SendStoredData();
}

buf_[index_] = byte;
++index_;
}

void
Receive(uint8_t value)
{
Put(value);
}

void
Receive(uint16_t value)
{
writer_.Write(*this, value);
}

void
Receive(uint32_t value)
{
writer_.Write(*this, value);
}

void
Receive(uint64_t value)
{
writer_.Write(*this, value);
}

void
Receive(int8_t value)
{
uint8_t tmp = value;
Put(tmp);
}

void
Receive(int16_t value)
{
uint16_t tmp = value;
writer_.Write(*this, tmp);
}

void
Receive(int32_t value)
{
uint32_t tmp = value;
writer_.Write(*this, tmp);
}

void
Receive(int64_t value)
{
uint64_t tmp = value;
writer_.Write(*this, tmp);
}

void
Receive(float value)
{
uint32_t tmp;
memcpy(&tmp, &value, sizeof tmp);
writer_.Write(*this, tmp);
}

void
Receive(double value)
{
uint64_t tmp;
memcpy(&tmp, &value, sizeof tmp);
writer_.Write(*this, tmp);
}

template <typename T>
void
ReceiveBlock(T const* data, unsigned int elements)
{
writer_.WriteBlock(*this, data, elements);
}

#if defined(_MSC_VER) || defined(WIN32) || defined(_WIN32) || defined
(__WIN32__) || defined(__CYGWIN__)
void
Give(void* address, unsigned int len)
{
using std::min;
if (len > bytesRemainingInMsg_) {
throw failure("Buffer::Give -- len > bytesRemainingInMsg_");
}

char * addr = static_cast<char*> (address);
if (vsize_ > recIndex_) {
if (len <= (vsize_ - recIndex_)) {
memcpy(addr, buf_ + recIndex_, len);
recIndex_ += len;
}
else {
memcpy(addr, buf_ + recIndex_, vsize_ - recIndex_);
addr += (vsize_ - recIndex_);
len -= (vsize_ - recIndex_);
bytesRemainingInMsg_ -= (vsize_ - recIndex_);
recIndex_ = vsize_;
goto pull;
}
}
else {
pull:
int rc = 0;
if (minBytesAvailable_ < len) {
#ifdef EE_FILEIO
minBytesAvailable_ = bufsize_;
#else
if ((rc = ioctlsocket(sock_, FIONREAD, &minBytesAvailable_))
==
SOCKET_ERROR) {
throw failure("Buffer::Give -- ioctl");
}
#endif
}

if (len > bufsize_ || len > minBytesAvailable_) {
PersistentRead(addr, len);
minBytesAvailable_ = 0;
}
else {
vsize_ =
min(min(bufsize_, bytesRemainingInMsg_),
minBytesAvailable_);
minBytesAvailable_ -= vsize_;
#ifdef EE_FILEIO
PersistentRead(buf_, vsize_);
#else
if ((rc = recv(sock_, reinterpret_cast<char*> (buf_), vsize_,
0))
!= vsize_) {
throw failure("Buffer::Give -- recv");
}
#endif
memcpy(addr, buf_, len);
recIndex_ = len;
}
}
bytesRemainingInMsg_ -= len;
}

#else
void
Give(void* address, unsigned int len)
{
using std::min;
if (len > bytesRemainingInMsg_) {
throw failure("Buffer::Give -- len > bytesRemainingInMsg_");
}

char * addr = static_cast<char *> (address);
if (vsize_ > recIndex_) {
if (len <= (vsize_ - recIndex_)) {
memcpy(addr, buf_ + recIndex_, len);
recIndex_ += len;
}
else {
memcpy(addr, buf_ + recIndex_, vsize_ - recIndex_);
addr += (vsize_ - recIndex_);
len -= (vsize_ - recIndex_);
bytesRemainingInMsg_ -= (vsize_ - recIndex_);
recIndex_ = vsize_;
goto pull;
}
}
else {
pull:
int rc = 0;
if (minBytesAvailable_ < len) {
if ((rc = ioctl(sock_, FIONREAD, &minBytesAvailable_)) == -1)
{
throw failure("Buffer::Give -- ioctl");
}
}

if (len > bufsize_ || len > minBytesAvailable_) {
PersistentRead(addr, len);
minBytesAvailable_ = 0;
}
else {
vsize_ =
min(min(bufsize_, bytesRemainingInMsg_),
minBytesAvailable_);
minBytesAvailable_ -= vsize_;
if ((rc = read(sock_, buf_, vsize_)) != vsize_) {
throw failure("Buffer::Give -- read");
}
memcpy(addr, buf_, len);
recIndex_ = len;
}
}
bytesRemainingInMsg_ -= len;
}
#endif

void
Reset()
{
index_ = 0;
recIndex_ = 0;
vsize_ = 0;
minBytesAvailable_ = 0;
bytesRemainingInMsg_ = 0;
}

void
SetMsgLength(unsigned int newMsgLength)
{
bytesRemainingInMsg_ = newMsgLength;
}

private:
Buffer(Buffer const&);
Buffer& operator=(Buffer const&);

};

template <typename U>
void
SameFormat::Write(Buffer<SameFormat>& buf, U data)
{
buf.Receive(&data, sizeof data);
};

template <typename U>
void
SameFormat::WriteBlock(Buffer<SameFormat>& buf,
U const* data, unsigned int elements)
{
buf.Receive(data, elements * sizeof(U));
}

inline void
LeastSignificantFirst::Write(Buffer<LeastSignificantFirst>& buf,
uint16_t value)
{
buf.Put( (value ) & 0xFF );
buf.Put( (value >> 8) & 0xFF );
}

inline void
LeastSignificantFirst::Write(Buffer<LeastSignificantFirst>& buf,
uint32_t value)
{
buf.Put( (value ) & 0xFF );
buf.Put( (value >> 8) & 0xFF );
buf.Put( (value >> 16) & 0xFF );
buf.Put( (value >> 24) & 0xFF );
}

inline void
LeastSignificantFirst::Write(Buffer<LeastSignificantFirst>& buf,
uint64_t value)
{
buf.Put( (value ) & 0xFF );
buf.Put( (value >> 8) & 0xFF );
buf.Put( (value >> 16) & 0xFF );
buf.Put( (value >> 24) & 0xFF );
buf.Put( (value >> 32) & 0xFF );
buf.Put( (value >> 40) & 0xFF );
buf.Put( (value >> 48) & 0xFF );
buf.Put( (value >> 56) & 0xFF );
}

template <typename U>
inline void
LeastSignificantFirst::WriteBlock(Buffer<LeastSignificantFirst>& buf,
U const* data, unsigned int
elements)
{
for (unsigned int ii = 0; ii < elements; ++ii) {
buf.Receive(*(data + ii));
}
}

// Two overloads for when U is uint8_t or int8_t
inline void
LeastSignificantFirst::WriteBlock(Buffer<LeastSignificantFirst>& buf,
uint8_t const* data, unsigned int
elements)
{
buf.Receive(data, elements);
}

inline void
LeastSignificantFirst::WriteBlock(Buffer<LeastSignificantFirst>& buf,
int8_t const* data, unsigned int
elements)
{
buf.Receive(data, elements);
}

inline void
MostSignificantFirst::Write(Buffer<MostSignificantFirst>& buf,
uint16_t value)
{
buf.Put( (value >> 8) & 0xFF );
buf.Put( (value ) & 0xFF );
}

inline void
MostSignificantFirst::Write(Buffer<MostSignificantFirst>& buf,
uint32_t value)
{
buf.Put( (value >> 24) & 0xFF );
buf.Put( (value >> 16) & 0xFF );
buf.Put( (value >> 8) & 0xFF );
buf.Put( (value ) & 0xFF );
}

inline void
MostSignificantFirst::Write(Buffer<MostSignificantFirst>& buf,
uint64_t value)
{
buf.Put( (value >> 56) & 0xFF );
buf.Put( (value >> 48) & 0xFF );
buf.Put( (value >> 40) & 0xFF );
buf.Put( (value >> 32) & 0xFF );
buf.Put( (value >> 24) & 0xFF );
buf.Put( (value >> 16) & 0xFF );
buf.Put( (value >> 8) & 0xFF );
buf.Put( (value ) & 0xFF );
}

template <typename U>
inline void
MostSignificantFirst::WriteBlock(Buffer<MostSignificantFirst>& buf,
U const* data, unsigned int
elements)
{
for (unsigned int ii = 0; ii < elements; ++ii) {
buf.Receive(*(data + ii));
}
}

inline void
MostSignificantFirst::WriteBlock(Buffer<MostSignificantFirst>& buf,
uint8_t const* data, unsigned int
elements)
{
buf.Receive(data, elements);
}

inline void
MostSignificantFirst::WriteBlock(Buffer<MostSignificantFirst>& buf,
int8_t const* data, unsigned int
elements)
{
buf.Receive(data, elements);
}

#endif

The names of some of the functions are likely to change.
I've just taken the fast route in putting this together.
The performance of this version is better than the non-
template version, but the improvement looks to be minor.
An executable built with the new version is significantly
smaller than it formerly was.

This new version isn't available on line yet. In the
past we added an option that allows users to specify
whether they want the code in header-only (integrated)
form or as two separate files. This was in part
because of some comments made by Kanze about build
times I think. Any thoughts on how to reconcile this
template use with that support? I'm not sure if it's
possible to keep supporting the separate header and
implementation approach with these changes.

Brian Wood
http://webEbenezer.net

Brian Wood · Dec 20, 2009

I've changed Buffer to be a class template that takes
a formatting type. There are three format types --
SameFormat, LeastSignificantFirst and
MostSignificantFirst. I've copied almost the whole
file here.

That last post was long and I had a question at the
end. Here's some of what was at the end of that
post.

The names of some of the classes/functions are likely
to change. I've just taken the fast route in putting
this together. The performance of this version is
better than the non-template version, but the
improvement looks to be minor. An executable built
with the new version is significantly smaller than it
formerly was.

In the past we added an option that allows users to
specify whether they want the code in header-only
(integrated) form or as two separate files. This was
in part because of some comments made by Kanze about
build times I think. Any thoughts on how to reconcile
this template use with that support? I'm not sure if
it's possible to keep supporting the separate header
and implementation approach with these changes.

Brian Wood
http://webEbenezer.net

Brian · Jan 6, 2010

That last post was long and I had a question at the
end. Here's some of what was at the end of that
post.

The names of some of the classes/functions are likely
to change. I've just taken the fast route in putting
this together. The performance of this version is
better than the non-template version, but the
improvement looks to be minor. An executable built
with the new version is significantly smaller than it
formerly was.

In the past we added an option that allows users to
specify whether they want the code in header-only
(integrated) form or as two separate files. This was
in part because of some comments made by Kanze about
build times I think. Any thoughts on how to reconcile
this template use with that support? I'm not sure if
it's possible to keep supporting the separate header
and implementation approach with these changes.

Any thoughts on this matter?

Brian Wood
http://webEbenezer.net

Emplace question	2	May 17, 2010
Qsort() messing with my entire Code	0	Apr 25, 2022
Qsort() is messing with my entire Code!!!	0	Apr 25, 2022
Marshalling auto_ptr/unique_ptr objects	1	Mar 6, 2010
Scoped enum question	21	Jun 10, 2013
Release 1.13 of the C++ Middleware Writer now on line	2	May 8, 2010
Voting a type off the island	0	Apr 21, 2013
Portable marshalling of floating point data	3	Dec 10, 2009

Refactoring question

Brian

James Kanze

Brian

Maxim Yegorushkin

Brian

James Kanze

Brian

Brian

James Kanze

Brian

Brian

James Kanze

Brian

Brian Wood

Brian

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads