direct access to the buffer of std::string

__PPS__

Hello,
I have a function that does some decoding (the same as urldecode in PHP)
and returns a std::string:
std::string urldecode(const char* in, std::size_t length);
This function parses the input and constructs the output char by char. I
tried all sorts of direct access to the buffer of the std::string to
be returned, but it's always slow. It appeared that if I dynamically
allocate memory, fill it with the output, create a std::string from this
buffer of chars, delete the buffer and return the string, it is at
least 2 times faster than using a std::string with a reserved and resized
buffer and then accessing the string through operator[]. (Simple
appending using += <char> slows things down even 10 times!)

Is there any way I can make it work in a simple, fast way without
recopying memory?

Thanks
 
Pete Becker

__PPS__ said:
Hello,
I have a function that does some decoding (the same as urldecode in PHP)
and returns a std::string:
std::string urldecode(const char* in, std::size_t length);
This function parses the input and constructs the output char by char. I
tried all sorts of direct access to the buffer of the std::string to
be returned, but it's always slow. It appeared that if I dynamically
allocate memory, fill it with the output, create a std::string from this
buffer of chars, delete the buffer and return the string, it is at
least 2 times faster than using a std::string with a reserved and resized
buffer and then accessing the string through operator[]. (Simple
appending using += <char> slows things down even 10 times!)

There's no reason for operator[] to be slow. operator+=, on the other
hand, has to check whether there's enough storage allocated.
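
To make the difference concrete, here is a minimal sketch of the appending variant, assuming PHP-style rules ('+' becomes a space, %XX becomes one byte). The names urldecode_append and hex_value are made up for illustration; the original poster's code was not shown. Even with reserve() called up front, every push_back still has to compare size() against capacity():

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Appending variant: reserve once, then push_back each decoded char.
// No reallocation can happen, but each append still checks the capacity.
std::string urldecode_append(const char* in, std::size_t length) {
    assert(in != nullptr || length == 0);
    std::string out;
    out.reserve(length);  // decoded output is never longer than the input
    for (std::size_t i = 0; i < length; ++i) {
        const char c = in[i];
        if (c == '+') {
            out.push_back(' ');
        } else if (c == '%' && i + 2 < length
                   && hex_value(in[i + 1]) >= 0 && hex_value(in[i + 2]) >= 0) {
            // Two valid hex digits follow: combine them into one byte.
            out.push_back(static_cast<char>(hex_value(in[i + 1]) * 16
                                            + hex_value(in[i + 2])));
            i += 2;
        } else {
            out.push_back(c);  // ordinary char, or a malformed escape kept as-is
        }
    }
    return out;
}
```

Keeping malformed escapes unchanged is a choice made for this sketch, not something stated in the thread.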
 
__PPS__

If anyone is interested, here are the results of my observations:
I used gcc 3.2.2 and VC 7.1, and in both cases it was (of course) fastest to
create a dynamic array of chars, put the data into it and return it.
The second fastest way is not through operator[]! In fact, using
std::string::iterator I got ~10% faster execution. The results were the
same if I cast c_str() to char* and then used this pointer as
though I had a raw array of chars (for g++ the iterator version was slower
than the c_str() version). Basically, it's faster to do ++ on a pointer
and access the value it points to than to calculate an offset
from some address every time (operator[offset]).
All the overhead with std::string came from the fact that before using
iterators or op[] I had to resize the string so that it could hold the
maximum output length, which means that the newly allocated buffer for the
string has to be filled with zeros. Then, after the function is finished, I
need to resize again if the real size is less than what was initially
allocated. And some overhead comes from returning by value.
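
The resize, fill, resize sequence described here can be sketched as follows. This is only an illustration under assumed PHP-style rules ('+' becomes a space, %XX becomes one byte); urldecode_presized and hex_value are hypothetical names, not the poster's code. The string is sized for the worst case (the output is never longer than the input), written through an iterator, and then shrunk:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: value of one hex digit, or -1 if c is not a hex digit.
inline int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Pre-sized variant: one allocation plus a zero-fill up front, writes
// through an iterator, and one shrinking resize at the end.
std::string urldecode_presized(const char* in, std::size_t length) {
    assert(in != nullptr || length == 0);
    std::string out(length, '\0');            // size() == length, all '\0'
    std::string::iterator dst = out.begin();
    const char* const end = in + length;
    for (const char* src = in; src != end; ++src) {
        if (*src == '+') {
            *dst++ = ' ';
        } else if (*src == '%' && end - src > 2
                   && hex_value(src[1]) >= 0 && hex_value(src[2]) >= 0) {
            // Two valid hex digits follow: combine them into one byte.
            *dst++ = static_cast<char>(hex_value(src[1]) * 16 + hex_value(src[2]));
            src += 2;
        } else {
            *dst++ = *src;                    // ordinary char or malformed escape
        }
    }
    // Drop the unused tail; the characters already written are preserved.
    out.resize(static_cast<std::size_t>(dst - out.begin()));
    return out;
}
```

The zero-fill in the constructor is the extra pass over the buffer that the post attributes the overhead to; the final resize only lowers the length and does not touch the decoded characters.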
 
Larry I Smith

__PPS__ wrote:
All the overhead with std::string came from the fact that before using
iterators or op[] I had to resize the string so that it could hold the
maximum output length, which means that the newly allocated buffer for the
string has to be filled with zeros. Then, after the function is finished, I
need to resize again if the real size is less than what was initially
allocated. And some overhead comes from returning by value.

You don't have to fill a std::string with zeros - std::string does
not use char '\0' to denote the end-of-string.
Why do you need to resize to a smaller size before returning
the std::string? The 'length' of the string does not
depend on the size of the raw buffer. It seems that you
are confusing the requirements of 'C' nul-terminated strings
with the C++ std::string.

Regards,
Larry
 
__PPS__

You don't understand what I meant.
std::string::iterator (if you read my post) is the fastest way to fill
a string with characters. operator[] is also a good way, but with both
of them (not sure about the iterator, but certainly with operator[]) you can
easily run past the allocated buffer (access violation). If, for
example, I know that my string will be some size at the end, it's better
to .reserve(newsize) some memory so that later, as the string grows,
it will not be reallocated multiple times to make room for more characters.
reserve only allocates memory; it doesn't change the string's length, and
it's a good way to preallocate storage when you use the
append, += or insert methods, which change the string's size. Since
I use an iterator (or operator[]), I have to .resize(newsize), not reserve,
which fills the extra allocated chars with zeros. Then at the end I need to
resize once again to make my string the proper size and cut off the extra
characters. If I used reserve instead of the first resize, then the
second resize would overwrite my string with zeros...
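
The reserve/resize distinction in this post can be checked with a small sketch. The struct Observed and the function observe_reserve_vs_resize are hypothetical names used only for illustration. It records that reserve() leaves size() at 0 (so writing s[i] there would be out of bounds), that resize() makes the positions valid and zero-filled, and that a final shrinking resize() keeps the characters already written:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical record of what was observed at each step.
struct Observed {
    std::size_t size_after_reserve;
    std::size_t size_after_resize;
    bool zero_filled;
    std::string after_shrink;
};

Observed observe_reserve_vs_resize() {
    Observed r;
    std::string s;

    s.reserve(16);                        // capacity grows, length does not
    assert(s.capacity() >= 16);
    r.size_after_reserve = s.size();      // still empty: s[0] = 'x' would be
                                          // a write outside the string

    s.resize(16);                         // length becomes 16, new chars are '\0'
    r.size_after_resize = s.size();
    r.zero_filled = (s == std::string(16, '\0'));

    s[0] = 'a';                           // now valid: indices 0..15 exist
    s[1] = 'b';
    s[2] = 'c';

    s.resize(3);                          // shrink: only the tail is dropped
    r.after_shrink = s;
    return r;
}
```

Had the three characters been written after reserve() alone, the length would still be 0, and a later resize(3) would count as growing the string and fill those positions with '\0', which matches the overwrite described in the post.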
 
