convert std::string to (byte*, DWORD)

Khuong Dinh Pham

I have the contents of an image in a std::string. How can I construct a
CxImage object from it?

The CxImage constructor takes these parameters:

CxImage(byte* data, DWORD size)

Thx in advance
 
Gianni Mariani

Khuong said:
I have the contents of an image of type std::string. How can I make a
CxImage object with this type.

The parameters to CxImage is:

CxImage(byte* data, DWORD size)

std::string stuff;

CxImage( convert_to_byte( stuff.data() ), stuff.size() )

convert_to_byte could be as simple as:

reinterpret_cast<byte *>( const_cast<char *>( stuff.data() ) )

good luck
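
A minimal sketch of the cast approach, for readers who want to try it without CxImage installed. The typedefs for byte and DWORD and the function sum_bytes are stand-ins invented for this example (they are not part of CxImage); sum_bytes only has the same parameter shape as the constructor quoted in the question. Note that a const_cast alone cannot turn a const char* into a byte*, so two casts are needed, and the result is only safe to use if the callee never writes through the pointer.

```cpp
#include <cstdint>
#include <string>

// Stand-ins for the Windows/CxImage typedefs; assumptions for this sketch.
using byte = unsigned char;
using DWORD = std::uint32_t;

// Hypothetical consumer with the same parameter shape as the constructor
// quoted in the question. It only reads the buffer.
DWORD sum_bytes(byte* data, DWORD size) {
    DWORD total = 0;
    for (DWORD i = 0; i < size; ++i) {
        total += data[i];
    }
    return total;
}

// const_cast removes the const, reinterpret_cast changes char* to
// unsigned char*. The string still owns the memory, so the pointer is
// only valid while the string is alive and unmodified.
byte* as_bytes(const std::string& s) {
    return reinterpret_cast<byte*>(const_cast<char*>(s.data()));
}
```

A call would then look like sum_bytes(as_bytes(stuff), static_cast<DWORD>(stuff.size())).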
 
Larry I Smith

Khuong said:
I have the contents of an image of type std::string. How can I make a
CxImage object with this type.

The parameters to CxImage is:

CxImage(byte* data, DWORD size)

Thx in advance


std::string.c_str() returns a 'const char *' (a pointer
to a string that can not be modified).

std::string.length() returns the data length
of the content.

Since CxImage() takes a 'byte *' rather than a
'const byte *', I would expect that it modifies
the memory pointed to by its first arg ('data').

If CxImage() does NOT modify the memory pointed to by its
first arg ('data'), and your image is in a std::string named
'myStr', then you might try something like this:

CxImage(myStr.c_str(), myStr.length());

You may have to add some casts to convert the first arg
to 'byte *' and the 2nd arg to 'DWORD'.

If CxImage() DOES modify the memory pointed to by its first
arg ('data'), then you could copy the string to a char buf:

try {
    char * buf = new char[myStr.length()];
    memcpy(buf, myStr.c_str(), myStr.length());
    CxImage(buf, myStr.length());
    delete[] buf;
}
catch (std::bad_alloc) {
    std::cerr << "memory allocation failed\n";
}

Larry
 
red floyd

Larry said:
If CxImage() DOES modify the memory pointed to by its first
arg ('data'), then you could copy the string to a char buf:

try {
char * buf = new char[myStr.length()];
memcpy(buf, myStr.c_str(), myStr.length());
CxImage(buf, myStr.length());
delete[] buf;
}
catch (std::bad_alloc) {
std::cerr << "memory allocation failed\n";
}

Larry

I'd prefer:

std::vector<char> buf(myStr.begin(), myStr.end());
CxImage(&buf[0],buf.size());

No try/catch block necessary.
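
A slightly fuller sketch of the vector approach, assuming the callee may modify the buffer. The typedefs and invert_in_place are hypothetical stand-ins for the CxImage side, made up for this example. The vector owns the copy, so it is released on every exit path, including when an exception is thrown. One detail worth guarding: &buf[0] is not valid on an empty vector.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Stand-ins for the Windows/CxImage typedefs; assumptions for this sketch.
using byte = unsigned char;
using DWORD = std::uint32_t;

// Hypothetical callee that modifies its input in place.
void invert_in_place(byte* data, DWORD size) {
    for (DWORD i = 0; i < size; ++i) {
        data[i] = static_cast<byte>(~data[i]);
    }
}

// Copies the string into a vector of bytes and hands the copy to the
// callee. The original string is never touched.
std::vector<byte> copy_and_process(const std::string& image) {
    std::vector<byte> buf(image.begin(), image.end());
    if (!buf.empty()) {  // &buf[0] must not be evaluated for an empty vector
        invert_in_place(&buf[0], static_cast<DWORD>(buf.size()));
    }
    return buf;
}
```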
 
Larry I Smith

red said:
I'd prefer:

std::vector<char> buf(myStr.begin(), myStr.end());
CxImage(&buf[0],buf.size());

No try/catch block necessary.

I believe that the new/memcpy approach would be much faster
(a single alloc and copy versus possible multiple reallocs
and copies as the vector expands itself), and provides a means
to handle errors (try/catch) when the image is large (as many are).
I proposed that approach on the assumption that CxImage() is a 'C'
function rather than a C++ function (based on its use of 'byte'
and DWORD).

Both approaches should work.

Larry
 
Alf P. Steinbach

First, to the OP: be sure to check whether CxImage _copies_ the specified
data or not, and if not, whether it takes over deallocation responsibility,
in which case you'll have to allocate the data correspondingly.
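
To illustrate why that check matters, here is a sketch for the case where the image object does not copy the data and does not take ownership. BorrowingImage is a hypothetical type invented for this example; whether CxImage behaves this way has to be checked in its documentation. The wrapper keeps the buffer and the object that points into it together, so the buffer cannot be destroyed first.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical image type that does NOT copy: it only remembers the pointer.
struct BorrowingImage {
    unsigned char* data;
    std::size_t size;
    BorrowingImage(unsigned char* d, std::size_t n) : data(d), size(n) {}
};

// Owns the byte buffer and the image that refers to it.
class OwnedImage {
public:
    explicit OwnedImage(const std::string& bytes)
        : buffer_(bytes.begin(), bytes.end()),
          image_(buffer_.empty() ? nullptr : &buffer_[0], buffer_.size()) {}

    // Copying would leave the copy's image pointing into the original's
    // buffer, so copying is disabled in this sketch.
    OwnedImage(const OwnedImage&) = delete;
    OwnedImage& operator=(const OwnedImage&) = delete;

    const BorrowingImage& image() const { return image_; }

private:
    // Declared first, so it is constructed before image_ and destroyed after it.
    std::vector<unsigned char> buffer_;
    BorrowingImage image_;
};
```

If the callee instead takes over deallocation, the buffer would have to be allocated in whatever way the callee later frees it, which this sketch does not cover.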

* Larry I Smith:
* red floyd:
* Larry I Smith:
If CxImage() DOES modify the memory pointed to by its first
arg ('data'), then you could copy the string to a char buf:

try {
char * buf = new char[myStr.length()];
memcpy(buf, myStr.c_str(), myStr.length());
CxImage(buf, myStr.length());
delete[] buf;
}
catch (std::bad_alloc) {
std::cerr << "memory allocation failed\n";
}

I'd prefer:

std::vector<char> buf(myStr.begin(), myStr.end());
CxImage(&buf[0],buf.size());

No try/catch block necessary.

I believe that the new/memcpy approach would be much faster

Nope.

Premature "optimization" is the root of all evil.

In this case the "optimization" makes for more complex and brittle code,
with no speedwise advantage (and no other advantage whatsoever).

(a single alloc and copy versus possible multiple reallocs
and copies as the vector expands itself),

It doesn't, in any acceptable quality implementation.

And if that should turn out to be a problem, simply declare the vector with
the required initial capacity.

and provides a means
to handle errors (try/catch) when the image is large (as many are).

Note that your code doesn't do that correctly: it doesn't deallocate, it
doesn't propagate the exception or abort, and it has side-effects (it should
also catch that exception by reference, but that's not very important).

All that can be fixed, but is the usual way things go when you do premature
optimization and choose low-level abstractions instead of higher level ones.

Don't.

I proposed that approach on the assumption that CxImage() is a 'C'
function rather than a C++ function (based on its use of 'byte'
and DWORD).

First, that wouldn't matter. Second, the OP said otherwise. Third, your
own code says otherwise. ;-)
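
For readers who still need a raw buffer, the points above can be sketched as follows, assuming C++11 or later. checksum_and_clear is a hypothetical callee standing in for the CxImage call. The buffer is held by a std::unique_ptr so it is released on every path, the exception is caught by const reference, and it is rethrown after logging so the caller still learns about the failure.

```cpp
#include <cstddef>
#include <cstring>
#include <iostream>
#include <memory>
#include <new>
#include <string>

// Hypothetical callee that reads and then overwrites its input.
unsigned checksum_and_clear(unsigned char* data, std::size_t size) {
    unsigned sum = 0;
    for (std::size_t i = 0; i < size; ++i) {
        sum += data[i];
        data[i] = 0;
    }
    return sum;
}

// Copies the string into a heap buffer and passes the copy to the callee.
unsigned checksum_of_copy(const std::string& src) {
    try {
        // unique_ptr<T[]> calls delete[] automatically, even if the
        // callee throws.
        std::unique_ptr<unsigned char[]> buf(new unsigned char[src.size()]);
        std::memcpy(buf.get(), src.data(), src.size());
        return checksum_and_clear(buf.get(), src.size());
    } catch (const std::bad_alloc&) {
        std::cerr << "memory allocation failed\n";
        throw;  // propagate instead of silently continuing
    }
}
```

At this point the code does by hand what std::vector already does, which is the argument made above.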
 
Larry I Smith

Alf said:
First, to the OP: be sure to check whether CxImage _copies_ the specified
data or not, and if not, whether it takes over deallocation responsibility,
in which case you'll have to allocate the data correspondingly.

* Larry I Smith:
* red floyd:
* Larry I Smith:
If CxImage() DOES modify the memory pointed to by its first
arg ('data'), then you could copy the string to a char buf:

try {
char * buf = new char[myStr.length()];
memcpy(buf, myStr.c_str(), myStr.length());
CxImage(buf, myStr.length());
delete[] buf;
}
catch (std::bad_alloc) {
std::cerr << "memory allocation failed\n";
}

I'd prefer:

std::vector<char> buf(myStr.begin(), myStr.end());
CxImage(&buf[0],buf.size());

No try/catch block necessary.

I believe that the new/memcpy approach would be much faster

Nope.

Premature "optimization" is the root of all evil.

Perhaps, but this particular one (new/memcpy vs vector) has proven
to be much faster in our corporate apps that have to compile/run
on many different platform/OS/compiler combinations, so I'm used to
using it automatically. In future responses here I'll try to
stick to generics.

In this case the "optimization" makes for more complex and brittle code,
with no speedwise advantage (and no other advantage whatsoever).



It doesn't, in any acceptable quality implementation.

And if that should turn out to be a problem, simply declare the vector with
the required initial capacity.



Note that your code doesn't do that correctly: it doesn't deallocate, it
doesn't propagate the exception or abort, and it has side-effects (it should
also catch that exception by reference, but that's not very important).

Sorry, it wasn't meant to be complete, merely an example snip.
It is based on the example code for 'f()' in section '6.2.6.2 Memory
Exhaustion' (page 129) from Stroustrup's "C++ Programming Language
Third Edition". I'll try to post more complete snips in the future.

All that can be fixed, but is the usual way things go when you do premature
optimization and choose low-level abstractions instead of higher level ones.

Don't.



First, that wouldn't matter. Second, the OP said otherwise. Third, your
own code says otherwise. ;-)

I did not interpret the OP's use of the phrase "CxImage object" to
imply that CxImage was an object in the C++ sense; the
CxImage(byte *, DWORD) signature looked to me like an MS Windows
function call (a 3rd party lib API perhaps) - but then again,
I'm not that familiar with MS Windows... :)

Regards,
Larry
 
Alf P. Steinbach

* Larry I Smith:
Perhaps, but this particular one (new/memcpy vs vector) has proven
to be much faster in our corporate apps that have to compile/run
on many different platform/OS/compiler combinations, so I'm used to
using it automatically.

Uhm ... extraordinary claims require extraordinary proofs... ;-)

Do you have some (preferably small) example code the readers of this
thread could discuss & time?
 
Larry I Smith

Alf said:
* Larry I Smith:

Uhm ... extraordinary claims require extraordinary proofs... ;-)

Do you have some (preferably small) example code the readers of this
thread could discuss & time?

By force of habit I used an approach specified by my company's
Design Standards doc. Now you want me to prove that our
Corporate Engineering Council is correct in their design decisions.
As they say, "I just work here". The design standards to which we
must conform were written by folks far smarter than I to ensure
platform portability (win98/2k/xp; various versions of HP/UX, SunOS,
Solaris, Linux, etc, etc, etc). Those design standards forbid,
or restrict, our use of many common C++ features because they may
have problems (portability or performance) on one or more of the
supported platforms. Those who work here have no control over
these design standards, so further discussion on it is pointless.

As I stated in my earlier post, I'll make an effort in the
future to not impose those design limitations on code
snips I post here.

As far as the OP's question (how to get the data from
a std::string into a separate byte array), I put together a test
program (see below) which compares several approaches. Results
of running the program on WinXP and Linux are in the comments
at the top of the program. I know the code could be greatly
improved (better error handling, etc), but I've already spent
way too much time on this issue.

On Windows all but one of the vector approaches are 5 to 15
times slower than the new/memcpy approach. (I begin to see
why the Design Doc mandates the new/memcpy approach...)

On Linux the vector approaches are comparable to the new/memcpy
approach. :) :)

Excluding the new/memcpy approach, it seems that the following
approach using std::string.c_str() is the most portable
(i.e. matches the new/memcpy approach); although it does depend
on strict left-to-right argument evaluation - which I'm not
allowed to use, but others may be.

// 'str' is a std::string containing the input data.
const char * data;
vector<char> vec(data = str.c_str(), data + str.length());

Regards,
Larry
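
The order in which function (and constructor) arguments are evaluated is unspecified in C++, so the snippet above cannot rely on data being assigned before data + str.length() is computed. A minimal sketch that avoids the question entirely by doing the assignment in its own statement; the function name copy_via_c_str is made up for this example:

```cpp
#include <string>
#include <vector>

// Copies the string's bytes into a vector using plain pointers.
std::vector<char> copy_via_c_str(const std::string& str) {
    // The pointer is fully initialized before the next statement runs,
    // so nothing depends on argument evaluation order.
    const char* const data = str.c_str();
    return std::vector<char>(data, data + str.length());
}
```

Because the length comes from str.length() rather than from scanning for a terminator, embedded zero bytes (common in image data) are copied too.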

// vtest.cpp - test vector creation vs new/memcpy.
// 1) tests with std::string iterators as the input data
// 2) tests with std::string.c_str() as the input data
// 3) tests with a char buffer as the input data source.
// to compile:
// Windows: cl /EHsc vtest.cpp
// Linux: g++ -o vtest vtest.cpp

/*
* Sample output using MSVC v7 on WinXP,
* on a P4 2GHZ with 512MB RAM:
* Input test data size bytes is: 10485760
* Test creating a vector from std::string iterators:
* 0.046 seconds - (1x) new,memcpy()
* 0.655 seconds - (14x) vector(first,last)
* 0.686 seconds - (15x) vector(sz),copy()
* 0.608 seconds - (13x) vector(),reserve(sz),copy()
* Test creating a vector from std::string.c_str():
* 0.047 seconds - (1x) new,memcpy()
* 0.046 seconds - (1x) vector(first,last)
* 0.343 seconds - (7x) vector(sz),copy()
* 0.249 seconds - (5x) vector(),reserve(sz),copy()
* Test creating a vector from a char buffer:
* 0.047 seconds - (1x) new,memcpy()
* 0.047 seconds - (1x) vector(first,last)
* 0.343 seconds - (7x) vector(sz),copy()
* 0.249 seconds - (5x) vector(),reserve(sz),copy()
*
* Sample output using g++ v3.3.5 on SuSE Linux 9.3,
* on a P2 450MHZ with 384MB RAM:
* Input test data size bytes is: 10485760
* Test creating a vector from std::string iterators:
* 0.15 seconds - (1x) new,memcpy()
* 0.17 seconds - (1x) vector(first,last)
* 0.17 seconds - (1x) vector(sz),copy()
* 0.15 seconds - (1x) vector(),reserve(sz),copy()
* Test creating a vector from std::string.c_str():
* 0.15 seconds - (1x) new,memcpy()
* 0.15 seconds - (1x) vector(first,last)
* 0.17 seconds - (1x) vector(sz),copy()
* 0.16 seconds - (1x) vector(),reserve(sz),copy()
* Test creating a vector from a char buffer:
* 0.1 seconds - (1x) new,memcpy()
* 0.15 seconds - (1x) vector(first,last)
* 0.21 seconds - (2x) vector(sz),copy()
* 0.15 seconds - (1x) vector(),reserve(sz),copy()
*/

#include <iostream>
#include <cstdlib>
#include <cstring>
#include <ctime>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

// print the elapsed time in seconds
double elapsed(const string& txt, clock_t start, clock_t end,
double orig)
{
double et = static_cast<double>(end - start) / CLOCKS_PER_SEC;

cout << " "
<< et
<< " seconds - ";

if (orig > 0.0)
{
cout << " ("
<< static_cast<int>(et/orig + 0.5) << "x)";
}
else
{
cout << " (1x)";
}

cout << " " << txt << endl;

return et;
}

template <class InIt>
void testvec(string::size_type sz, const char * data,
InIt first, InIt last)
{
clock_t start, end;
double orig = 0.0;

// ------------------------------------
// time using new/memcpy to create a char buffer
// copy of the test data
{
char *buf = 0;

start = clock();

// allocate the memory for buf[].
// for this simple program, we just terminate if
// any exception is thrown.
try
{
buf = new char[sz];
}
catch(...)
{
cout << "buf = new[] failed" << endl;

return;
}

memcpy(buf, data, sz);

end = clock();

orig = elapsed("new,memcpy()", start, end, orig);

delete[] buf;
}

// ------------------------------------
// time creating a vector from the test data using
// the vector's constructor
{
start = clock();

vector<char> vec1(first, last);

end = clock();

elapsed("vector(first,last)" ,start, end, orig);
}

// ------------------------------------
// time creating a vector from the test data using a
// pre-sized vector and copy()
{
start = clock();

vector<char> vec2(sz);

copy(first, last, vec2.begin());

end = clock();

elapsed("vector(sz),copy()", start, end, orig);
}

// ------------------------------------
// time creating a vector from the test data using
// vector.reserve() and copy()
{
start = clock();

vector<char> vec3;

vec3.reserve(sz);

copy(first, last, vec3.begin());

end = clock();

elapsed("vector(),reserve(sz),copy()", start, end, orig);
}

return;
}

int main()
{
// we'll use 10MB as the test data size
string::size_type sz = (1024 * 1024 * 10);

cout << "Input test data size bytes is: " << sz << endl;

// test vector using string iterators to access the input data
{
cout << "Test creating a vector from"
<< " std::string iterators:" <<endl;

// make a test string holding 'sz' blanks
string str(sz, ' ');

testvec(sz, str.c_str(), str.begin(), str.end());
}

// test vector using string.c_str() as input data
{
cout << "Test creating a vector from"
<< " std::string.c_str():" <<endl;

const char * data;

// make a test string holding 'sz' blanks
string str(sz, ' ');

// WARNING: this assumes that left-to-right
// argument evaluation is guaranteed
// by the compiler.
testvec(sz, data = str.c_str(), data, data + sz);
}

// test vector using a char buffer as input data
{
cout << "Test creating a vector from a"
<< " char buffer:" <<endl;

try
{
// make a char buffer holding 'sz' blanks
char * data = new char[sz];

memset(data, ' ', sz);

testvec(sz, data, data, data + sz);

delete[] data;
}
catch(...)
{
cout << "data = new[] failed" << endl;
}
}

return 0;
}
 
Alf P. Steinbach

* Larry I Smith:
As far as the OP's question (how to get the data from
a std::string into a separate byte array), I put together a test
program (see below) which compares several approaches. Results
of running the program on WinXP and Linux are in the comments
at the top of the program. I know the code could be greatly
improved (better error handling, etc), but I've already spent
way too much time on this issue.

On Windows all but one of the vector approaches are 5 to 15
times slower than the new/memcpy approach. (I begin to see
why the Design Doc mandates the new/memcpy approach...)

That's not what I get, except with a debug build, which should not be used
for timing.

/*
* Sample output using MSVC v7 on WinXP,
* on a P4 2GHZ with 512MB RAM:
* Input test data size bytes is: 10485760
* Test creating a vector from std::string iterators:
* 0.046 seconds - (1x) new,memcpy()
* 0.655 seconds - (14x) vector(first,last)
* 0.686 seconds - (15x) vector(sz),copy()
* 0.608 seconds - (13x) vector(),reserve(sz),copy()
* Test creating a vector from std::string.c_str():
* 0.047 seconds - (1x) new,memcpy()
* 0.046 seconds - (1x) vector(first,last)
* 0.343 seconds - (7x) vector(sz),copy()
* 0.249 seconds - (5x) vector(),reserve(sz),copy()
* Test creating a vector from a char buffer:
* 0.047 seconds - (1x) new,memcpy()
* 0.047 seconds - (1x) vector(first,last)
* 0.343 seconds - (7x) vector(sz),copy()
* 0.249 seconds - (5x) vector(),reserve(sz),copy()

For a slightly modified program (more test cases), MSVC 7.1 on a 1.8 GHz PC
with 256 MiB -- "should" be slower than yours but is way faster (did you
obtain the numbers above with a debug build?):

Input test data size bytes is: 10485760
Test creating a vector from std::string iterators:
0.031 seconds - (1x) new,memcpy() // This line doesn't use iterators.
0.141 seconds - (5x) vector(first,last) <--
0.11 seconds - (4x) vector(sz),copy()
0.235 seconds - (8x) vector(sz),assign()
0.047 seconds - (2x) vector(sz),memcopy()
0.11 seconds - (4x) vector(),reserve(sz),copy()
Test creating a vector from std::string.c_str():
0.031 seconds - (1x) new,memcpy()
0.032 seconds - (1x) vector(first,last) <--
0.126 seconds - (4x) vector(sz),copy()
0.141 seconds - (5x) vector(sz),assign()
0.062 seconds - (2x) vector(sz),memcopy()
0.094 seconds - (3x) vector(),reserve(sz),copy()
Test creating a vector from a char buffer:
0.031 seconds - (1x) new,memcpy()
0.047 seconds - (2x) vector(first,last) <--
0.11 seconds - (4x) vector(sz),copy()
0.125 seconds - (4x) vector(sz),assign()
0.063 seconds - (2x) vector(sz),memcopy()
0.094 seconds - (3x) vector(),reserve(sz),copy()

Of course the times vary somewhat. In a few cases the vector is faster:

Input test data size bytes is: 10485760
Test creating a vector from std::string iterators:
0.047 seconds - (1x) new,memcpy() // This line doesn't use iterators.
0.157 seconds - (3x) vector(first,last) <--
0.11 seconds - (2x) vector(sz),copy()
0.25 seconds - (5x) vector(sz),assign()
0.047 seconds - (1x) vector(sz),memcopy()
0.094 seconds - (2x) vector(),reserve(sz),copy()
Test creating a vector from std::string.c_str():
0.047 seconds - (1x) new,memcpy()
0.031 seconds - (1x) vector(first,last) <--
0.126 seconds - (3x) vector(sz),copy()
0.141 seconds - (3x) vector(sz),assign()
0.062 seconds - (1x) vector(sz),memcopy()
0.094 seconds - (2x) vector(),reserve(sz),copy()
Test creating a vector from a char buffer:
0.047 seconds - (1x) new,memcpy()
0.032 seconds - (1x) vector(first,last) <--
0.125 seconds - (3x) vector(sz),copy()
0.157 seconds - (3x) vector(sz),assign()
0.047 seconds - (1x) vector(sz),memcopy()
0.094 seconds - (2x) vector(),reserve(sz),copy()

With g++ 3.4.2 on the same machine I get essentially the same results,
except one run where new+memcpy fell down to 0.016 secs (not reproducible):

Input test data size bytes is: 10485760
Test creating a vector from std::string iterators:
0.031 seconds - (1x) new,memcpy() // This line doesn't use iterators.
0.032 seconds - (1x) vector(first,last) <--
0.062 seconds - (2x) vector(sz),copy()
0.063 seconds - (2x) vector(sz),assign()
0.047 seconds - (2x) vector(sz),memcopy()
0.031 seconds - (1x) vector(),reserve(sz),copy()
Test creating a vector from std::string.c_str():
0.031 seconds - (1x) new,memcpy()
0.032 seconds - (1x) vector(first,last) <--
0.047 seconds - (2x) vector(sz),copy()
0.047 seconds - (2x) vector(sz),assign()
0.031 seconds - (1x) vector(sz),memcopy()
0.031 seconds - (1x) vector(),reserve(sz),copy()
Test creating a vector from a char buffer:
0.032 seconds - (1x) new,memcpy()
0.047 seconds - (1x) vector(first,last) <--
0.047 seconds - (1x) vector(sz),copy()
0.062 seconds - (2x) vector(sz),assign()
0.047 seconds - (1x) vector(sz),memcopy()
0.047 seconds - (1x) vector(),reserve(sz),copy()

It seems that for those two compilers, MSVC 7.1 and g++ 3.4.2,

void f( std::string const& s )
{
char const* const data = s.c_str();
std::vector<char> v( data, data + s.length() );
//...
}

is nearly always as fast as new+memcpy, sometimes faster, and _much_ safer.
 
Richard Herring

Larry I Smith said:
By force of habit I used an approach specified by my company's
Design Standards doc. Now you want me to prove that our
Corporate Engineering Council is correct in their design decisions.

[big snip]
vector<char> vec3;

vec3.reserve(sz);

copy(first, last, vec3.begin());

I hope they didn't mandate *that*! :-(

Incidentally, I don't think this program proves anything at all. By
running the tests consecutively, you're making the later ones dependent
on how the earlier ones mangled free store, so they're not independent.
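
One way to reduce (not remove) that dependence is to time each candidate on its own, several times, and keep the fastest run. The sketch below assumes C++11 for std::chrono and lambdas; best_of, sink and the two candidate functions are names invented for this example. Fully independent measurements would still need separate processes, since all runs here share one heap.

```cpp
#include <chrono>
#include <cstring>
#include <memory>
#include <string>
#include <vector>

// Written to by the timed code so the copies have an observable effect.
volatile char sink = 0;

// Runs 'work' the given number of times and returns the fastest run in seconds.
template <class Work>
double best_of(int runs, Work work) {
    double best = -1.0;
    for (int i = 0; i < runs; ++i) {
        const std::chrono::steady_clock::time_point start =
            std::chrono::steady_clock::now();
        work();
        const std::chrono::duration<double> elapsed =
            std::chrono::steady_clock::now() - start;
        if (best < 0.0 || elapsed.count() < best) {
            best = elapsed.count();
        }
    }
    return best;
}

// Candidate 1: new[] plus memcpy, returning the last byte of the copy.
char copy_with_new_memcpy(const std::string& s) {
    std::unique_ptr<char[]> buf(new char[s.size()]);
    std::memcpy(buf.get(), s.data(), s.size());
    return s.empty() ? '\0' : buf[s.size() - 1];
}

// Candidate 2: vector range constructor from pointers.
char copy_with_vector(const std::string& s) {
    const char* const data = s.c_str();
    std::vector<char> v(data, data + s.size());
    return v.empty() ? '\0' : v.back();
}
```

A measurement would then look like best_of(5, [&] { sink = copy_with_vector(input); }), compiled with optimization enabled.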
 
Larry I Smith

Alf said:
* Larry I Smith:

That's not what I get, except with a debug build, which should not be used
for timing.

As stated in the code comments, here are the
compile commands I used on Windows and Linux:

// to compile:
// Windows: cl /EHsc vtest.cpp
// Linux: g++ -o vtest vtest.cpp

No "debug build" was involved.



For a slightly modified program (more test cases), MSVC 7.1 on a 1.8 GHz PC
with 256 MiB -- "should" be slower than yours but is way faster (did you
obtain the numbers above with a debug build?):


No "debug build" was used. See my comment above.
The Windows PC is a small Dell desktop; the model escapes
me at the moment (it's at work and I'm at home). Its specs,
mentioned in my earlier post, were copied from the "Properties"
pop-up of its "My Computer" desktop icon.
 
Larry I Smith

Richard said:
Larry I Smith said:
By force of habit I used an approach specified by my company's
Design Standards doc. Now you want me to prove that our
Corporate Engineering Council is correct in their design decisions.

[big snip]
vector<char> vec3;

vec3.reserve(sz);

copy(first, last, vec3.begin());

I hope they didn't mandate *that*! :-(


No, they mandate the new/memcpy approach (always).

Incidentally, I don't think this program proves anything at all. By
running the tests consecutively, you're making the later ones dependent
on how the earlier ones mangled free store, so they're not independent.

Ok. When I have time to write ten separate programs, I'll look
into it.

Regards,
Larry
 
Richard Herring

Larry I Smith said:
Richard said:
Larry I Smith said:
Alf P. Steinbach wrote:
* Larry I Smith:
Perhaps, but this particular one (new/memcpy vs vector) has proven
to be much faster in our corporate apps that have to compile/run
on many different platform/OS/compiler combinations, so I'm used to
using it automatically.

Uhm ... extraordinary claims require extraordinary proofs... ;-)

Do you have some (preferably small) example code the readers of this
thread could discuss & time?


By force of habit I used an approach specified by my company's
Design Standards doc. Now you want me to prove that our
Corporate Engineering Council is correct in their design decisions.

[big snip]
vector<char> vec3;

vec3.reserve(sz);

copy(first, last, vec3.begin());

I hope they didn't mandate *that*! :-(


No, they mandate the new/memcpy approach

I fear you may have missed the point. In case anyone else did, the above
is UB. reserve(sz) or not, vec3.size() is 0.
(always).

And what do they mandate for dynamic arrays of rule-of-3 objects?
 
Larry I Smith

Richard said:
Larry I Smith said:
Richard said:
Larry I Smith said:
Alf P. Steinbach wrote:
* Larry I Smith:
Perhaps, but this particular one (new/memcpy vs vector) has proven
to be much faster in our corporate apps that have to compile/run
on many different platform/OS/compiler combinations, so I'm used to
using it automatically.

Uhm ... extraordinary claims require extraordinary proofs... ;-)

Do you have some (preferably small) example code the readers of this
thread could discuss & time?


By force of habit I used an approach specified by my company's
Design Standards doc. Now you want me to prove that our
Corporate Engineering Council is correct in their design decisions.

[big snip]


vector<char> vec3;

vec3.reserve(sz);

copy(first, last, vec3.begin());

I hope they didn't mandate *that*! :-(


No, they mandate the new/memcpy approach

I fear you may have missed the point. In case anyone else did, the above
is UB. reserve(sz) or not, vec3.size() is 0.


Ahhh yes, that's what happens when one codes after midnight...

You are correct, the above code should be this:

vector<char> vec3;

vec3.assign(first, last);
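
The difference between capacity and size behind that fix can be sketched as follows. reserve() only sets aside storage; the vector still has size() == 0, so writing through begin() writes to elements that do not exist. Both variants below are valid; the function names are made up for this example.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

// reserve() changes capacity only; the vector remains empty.
bool reserve_leaves_size_zero(std::size_t n) {
    std::vector<char> v;
    v.reserve(n);
    return v.size() == 0 && v.capacity() >= n;
}

// Valid variant 1: assign() sets the size itself.
std::vector<char> via_assign(const std::string& s) {
    std::vector<char> v;
    v.reserve(s.size());
    v.assign(s.begin(), s.end());
    return v;
}

// Valid variant 2: copy() through back_inserter, which calls push_back
// for each element and so grows size() as it goes.
std::vector<char> via_back_inserter(const std::string& s) {
    std::vector<char> v;
    v.reserve(s.size());
    std::copy(s.begin(), s.end(), std::back_inserter(v));
    return v;
}
```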

And what do they mandate for dynamic arrays of rule-of-3 objects?

The new/memcpy rule applies only to native types (char, int, double,
etc).

Regards,
Larry
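
A rule like that can be enforced by the compiler rather than by convention. The sketch below assumes C++11 for <type_traits> and std::unique_ptr; clone_array is a hypothetical helper, not something from the design document mentioned above. It refuses to compile for types with their own copy constructor or destructor (the "rule-of-3 objects" asked about), for which memcpy would be undefined behavior.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>

// Copies 'count' elements with memcpy, but only for types where a
// byte-wise copy is a valid copy.
template <class T>
std::unique_ptr<T[]> clone_array(const T* src, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only valid for trivially copyable types");
    std::unique_ptr<T[]> dst(new T[count]);
    if (count != 0) {
        std::memcpy(dst.get(), src, count * sizeof(T));
    }
    return dst;
}
```

Calling clone_array with an array of std::string, for instance, fails at compile time instead of corrupting memory at run time.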
 
Larry I Smith

Richard Herring wrote:

[snip]
Incidentally, I don't think this program proves anything at all. By
running the tests consecutively, you're making the later ones dependent
on how the earlier ones mangled free store, so they're not independent.

I split it into multiple programs and tested on
Linux and Windows.

The times remained within 1% to 2% of those reported
by the original all-in-one program. A few times
went up (from 0.5% to 1.5%), a few went down (from
0.5% to 2%), and a few remained the same.

So, the original all-in-one program is just fine
for testing purposes.

Regards,
Larry
 
Larry I Smith

Larry said:
By force of habit I used an approach specified by my company's
Design Standards doc. Now you want me to prove that our
Corporate Engineering Council is correct in their design decisions.
As they say, "I just work here". The design standards to which we
must conform were written by folks far smarter than I to ensure
platform portability (win98/2k/xp; various versions of HP/UX, SunOS,
Solaris, Linux, etc, etc, etc). Those design standards forbid,
or restrict, our use of many common C++ features because they may
have problems (portability or performance) on one or more of the
supported platforms. Those who work here have no control over
these design standards, so further discussion on it is pointless.

As I stated in my earlier post, I'll make an effort in the
future to not impose those design limitations on code
snips I post here.

As far as the OP's question (how to get the data from
a std::string into a separate byte array), I put together a test
program (see below) which compares several approaches. Results
of running the program on WinXP and Linux are in the comments
at the top of the program. I know the code could be greatly
improved (better error handling, etc), but I've already spent
way too much time on this issue.

On Windows all but one of the vector approaches are 5 to 15
times slower than the new/memcpy approach. (I begin to see
why the Design Doc mandates the new/memcpy approach...)

On Linux the vector approaches are comparable to the new/memcpy
approach. :) :)

Excluding the new/memcpy approach, it seems that the following
approach using std::string.c_str() is the most portable
(i.e. matches the new/memcpy approach); although it does depend
on strict left-to-right argument evaluation - which I'm not
allowed to use, but others may be.

// 'str' is a std::string containing the input data.
const char * data;
vector<char> vec(data = str.c_str(), data + str.length());

Regards,
Larry

// vtest.cpp - test vector creation vs new/memcpy.
// 1) tests with std::string iterators as the input data
// 2) tests with std::string.c_str() as the input data
// 3) tests with a char buffer as the input data source.
// to compile:
// Windows: cl /EHsc vtest.cpp
// Linux: g++ -o vtest vtest.cpp

/*
* Sample output using MSVC v7 on WinXP,
* on a P4 2GHZ with 512MB RAM:


Correction to the above two lines:

Sample output using MS Visual Studio .NET 2003 on Win-XP (SP2),
on a Dell Optiplex GX240 (P4 2GHZ) with 512MB of RAM:

* Input test data size bytes is: 10485760
* Test creating a vector from std::string iterators:
* 0.046 seconds - (1x) new,memcpy()
* 0.655 seconds - (14x) vector(first,last)
* 0.686 seconds - (15x) vector(sz),copy()
* 0.608 seconds - (13x) vector(),reserve(sz),copy()
* Test creating a vector from std::string.c_str():
* 0.047 seconds - (1x) new,memcpy()
* 0.046 seconds - (1x) vector(first,last)
* 0.343 seconds - (7x) vector(sz),copy()
* 0.249 seconds - (5x) vector(),reserve(sz),copy()
* Test creating a vector from a char buffer:
* 0.047 seconds - (1x) new,memcpy()
* 0.047 seconds - (1x) vector(first,last)
* 0.343 seconds - (7x) vector(sz),copy()
* 0.249 seconds - (5x) vector(),reserve(sz),copy()
*
* Sample output using g++ v3.3.5 on SuSE Linux 9.3,
* on a P2 450MHZ with 384MB RAM:


Correction to the above line:

on a Gateway E-4200 (P2 450MHZ) with 384MB of RAM:

* Input test data size bytes is: 10485760
* Test creating a vector from std::string iterators:
* 0.15 seconds - (1x) new,memcpy()
* 0.17 seconds - (1x) vector(first,last)
* 0.17 seconds - (1x) vector(sz),copy()
* 0.15 seconds - (1x) vector(),reserve(sz),copy()
* Test creating a vector from std::string.c_str():
* 0.15 seconds - (1x) new,memcpy()
* 0.15 seconds - (1x) vector(first,last)
* 0.17 seconds - (1x) vector(sz),copy()
* 0.16 seconds - (1x) vector(),reserve(sz),copy()
* Test creating a vector from a char buffer:
* 0.1 seconds - (1x) new,memcpy()
* 0.15 seconds - (1x) vector(first,last)
* 0.21 seconds - (2x) vector(sz),copy()
* 0.15 seconds - (1x) vector(),reserve(sz),copy()
*/

#include <iostream>
#include <cstdlib>
#include <cstring>
#include <ctime>
#include <string>
#include <vector>
#include <algorithm>

using namespace std;

// print the elapsed time in seconds
double elapsed(const string& txt, clock_t start, clock_t end,
double orig)
{
double et = static_cast<double>(end - start) / CLOCKS_PER_SEC;

cout << " "
<< et
<< " seconds - ";

if (orig > 0.0)
{
cout << " ("
<< static_cast<int>(et/orig + 0.5) << "x)";
}
else
{
cout << " (1x)";
}

cout << " " << txt << endl;

return et;
}

template <class InIt>
void testvec(string::size_type sz, const char * data,
InIt first, InIt last)
{
clock_t start, end;
double orig = 0.0;

// ------------------------------------
// time using new/memcpy to create a char buffer
// copy of the test data
{
char *buf = 0;

start = clock();

// allocate the memory for buf[].
// for this simple program, we just terminate if
// any exception is thrown.
try
{
buf = new char[sz];
}
catch(...)
{
cout << "buf = new[] failed" << endl;

return;
}

memcpy(buf, data, sz);

end = clock();

orig = elapsed("new,memcpy()", start, end, orig);

delete[] buf;
}

// ------------------------------------
// time creating a vector from the test data using
// the vector's constructor
{
start = clock();

vector<char> vec1(first, last);

end = clock();

elapsed("vector(first,last)" ,start, end, orig);
}

// ------------------------------------
// time creating a vector from the test data using a
// pre-sized vector and copy()
{
start = clock();

vector<char> vec2(sz);

copy(first, last, vec2.begin());

end = clock();

elapsed("vector(sz),copy()", start, end, orig);
}

// ------------------------------------
// time creating a vector from the test data using
// vector.reserve() and copy()
{
start = clock();

vector<char> vec3;

vec3.reserve(sz);

// As Richard Herring pointed out, the original
// copy(first, last, vec3.begin()) was incorrect here.
// It should be:
vec3.assign(first, last);

end = clock();

elapsed("vector(),reserve(sz),assign()", start, end, orig);
}

return;
}

int main()
{
// we'll use 10MB as the test data size
string::size_type sz = (1024 * 1024 * 10);

cout << "Input test data size bytes is: " << sz << endl;

// test vector using string iterators to access the input data
{
cout << "Test creating a vector from"
<< " std::string iterators:" <<endl;

// make a test string holding 'sz' blanks
string str(sz, ' ');

testvec(sz, str.c_str(), str.begin(), str.end());
}

// test vector using string.c_str() as input data
{
cout << "Test creating a vector from"
<< " std::string.c_str():" <<endl;

const char * data;

// make a test string holding 'sz' blanks
string str(sz, ' ');

// WARNING: this assumes that left-to-right
// argument evaluation is guaranteed
// by the compiler.
testvec(sz, data = str.c_str(), data, data + sz);
}

// test vector using a char buffer as input data
{
cout << "Test creating a vector from a"
<< " char buffer:" <<endl;

try
{
// make a char buffer holding 'sz' blanks
char * data = new char[sz];

memset(data, ' ', sz);

testvec(sz, data, data, data + sz);

delete[] data;
}
catch(...)
{
cout << "data = new[] failed" << endl;
}
}

return 0;
}
 
