Efficient, safe handling of "char *"

Vaca Louca

Hello,

I am writing an ISAPI authentication module which uses Berkeley DB, and I
want it to be as efficient as possible.

Both ISAPI and Berkeley DB use arrays of chars (char *)
to pass and receive information.

I know that C++ strings are supposed to be the "right" way to
handle strings but I suspect that converting "char *" to string
every time I deal with it is costly (allocate memory, copy the
content), especially when many times I'll have to convert the
string back to "char *" (using c_str()) to pass it onward to the
other side or return a result.

Are there known class libraries which can help me somehow:

1. create an object which accepts a "char *" (either
null-terminated or with a length argument)
2. represent this array of chars as an STL container (provide
iterators, mostly, but also allow modification of the array
through the iterators)
3. provide a handle to the original "char *" argument

without copying over the array of bytes?

I googled around and Boost seems like a likely candidate, but its
documentation is a bit sporadic and I'm struggling to figure out
what and how to use it.

Thanks,

--V
 
Ioannis Vranos

Vaca said:
> Hello,
>
> I am writing an ISAPI authentication module which uses Berkeley DB, and I
> want it to be as efficient as possible.
>
> Both ISAPI and Berkeley DB use arrays of chars (char *)
> to pass and receive information.
>
> I know that C++ strings are supposed to be the "right" way to
> handle strings but I suspect that converting "char *" to string
> every time I deal with it is costly (allocate memory, copy the
> content), especially when many times I'll have to convert the
> string back to "char *" (using c_str()) to pass it onward to the
> other side or return a result.

You shouldn't pass the returned pointer of c_str() anywhere directly.

> Are there known class libraries which can help me somehow:
>
> 1. create an object which accepts a "char *" (either
> null-terminated or with a length argument)
> 2. represent this array of chars as an STL container (provide
> iterators, mostly, but also allow modification of the array
> through the iterators)
> 3. provide a handle to the original "char *" argument

std::string. :) Just place a string object at namespace (or global) scope, or make it
static in a local function scope, or make it a static member of a class.

I would begin with namespace scope. In that case you could pass the pointer
returned by c_str() around as a const char *.
 
Ioannis Vranos

Ioannis said:
> I would begin with the namespace scope. And in this case you could pass
> the returned pointer of c_str() as a const char * around.

... while the string remains unmodified.
 
Gianni Mariani

Vaca said:
> Hello,
>
> I am writing an ISAPI authentication module which uses Berkeley DB, and I
> want it to be as efficient as possible.
>
> Both ISAPI and Berkeley DB use arrays of chars (char *)
> to pass and receive information.
>
> I know that C++ strings are supposed to be the "right" way to
> handle strings but I suspect that converting "char *" to string
> every time I deal with it is costly (allocate memory, copy the
> content), especially when many times I'll have to convert the
> string back to "char *" (using c_str()) to pass it onward to the
> other side or return a result.

#1. You may be right - but have you profiled the code, and do you "know"
that this will be an issue?

#2. Have you considered a "lazy string" that converts to std::string
only when you have to? I.e. the constructor takes a const char *, and
when any operation is done that needs more than simply using a char *, it is
converted to std::string.
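
A minimal sketch of what such a "lazy string" might look like. The class name LazyString and its members are hypothetical, chosen only for illustration; this is not code from any poster or library. It assumes the borrowed buffer is null-terminated and stays alive until the first conversion.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical class: borrows a const char* and only builds a std::string
// the first time a caller asks for one.
class LazyString {
public:
    explicit LazyString(const char* s) : raw_(s), converted_(false) {
        assert(s != nullptr);
    }

    // Cheap path: hand the original pointer straight to a C API, no copy.
    const char* c_str() const { return converted_ ? owned_.c_str() : raw_; }

    // Length without forcing a conversion.
    std::size_t length() const {
        return converted_ ? owned_.size() : std::strlen(raw_);
    }

    // Expensive path: the first call copies the characters into a std::string.
    std::string& str() {
        if (!converted_) {
            owned_.assign(raw_);
            converted_ = true;
        }
        return owned_;
    }

    bool converted() const { return converted_; }

private:
    const char* raw_;    // borrowed; must outlive this object until str() is called
    std::string owned_;  // stays empty until str() is first called
    bool converted_;
};
```

As long as the value is only passed through to another "char *" interface via c_str(), no allocation happens; the copy is paid only when str() is called.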

> Are there known class libraries which can help me somehow:
>
> 1. create an object which accepts a "char *" (either
> null-terminated or with a length argument)
> 2. represent this array of chars as an STL container (provide
> iterators, mostly, but also allow modification of the array
> through the iterators)
> 3. provide a handle to the original "char *" argument
>
> without copying over the array of bytes?

I've written many of these.
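
For illustration, here is a rough sketch of a class meeting the three requirements quoted above. CharRange and its members are made-up names, and the design (plain pointers as iterators, an at() that throws) is an assumption rather than anything posted in this thread. It never copies or frees the buffer it is given.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>

// Hypothetical non-owning view over a caller-supplied character buffer.
class CharRange {
public:
    typedef char* iterator;
    typedef const char* const_iterator;

    // From a null-terminated buffer: the length is measured once, nothing is copied.
    explicit CharRange(char* s) : data_(s), size_(std::strlen(s)) {}

    // From a buffer with an explicit length (no terminator required).
    CharRange(char* s, std::size_t n) : data_(s), size_(n) {
        assert(s != nullptr || n == 0);
    }

    // Iterators allow reading and modifying the original bytes in place.
    iterator begin() { return data_; }
    iterator end() { return data_ + size_; }
    const_iterator begin() const { return data_; }
    const_iterator end() const { return data_ + size_; }

    std::size_t size() const { return size_; }
    char* data() { return data_; }  // handle to the original buffer

    // Bounds-checked element access, in the style of std::vector::at().
    char& at(std::size_t i) {
        if (i >= size_) throw std::out_of_range("CharRange::at");
        return data_[i];
    }

private:
    char* data_;        // borrowed, never freed here
    std::size_t size_;
};
```

Because the iterators are plain pointers, the standard algorithms work on the range directly. Later C++ standards provide std::string_view (C++17, read-only) and std::span (C++20) for much the same purpose.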

> I googled around and Boost seems like a likely candidate, but its
> documentation is a bit sporadic and I'm struggling to figure out
> what and how to use it.

What is it?
 
Vaca Louca

> You shouldn't pass the returned pointer of c_str() anywhere directly.

Why not? I know that the receiving methods will not change that string,
only read it. Also, if I could pass the original "char *" I received
from the other side of the "data path", then it would be OK for the
receiving methods to change the string if they wished to.

> std::string. :) Just place a string object in a namespace (or global) scope, or make it
> static in a local function scope or as a static member of a class.

The point I'm trying to make is that I don't create these strings - I
get them as "char *" from the various interfaces, and many times I pass
them (or parts of them) to other interfaces which also expect "char *".
Right now I use std::string in the middle but, for instance, if I were
writing in C and not C++, I could have just passed a pointer along
and avoided copying the content.

> I would begin with the namespace scope. And in this case you could pass the
> returned pointer of c_str() as a const char * around.

I didn't follow this comment - I'm pretty rusty with C++ (and have a
sizeable pile of C++ books here with me), so I might be missing an
obvious point. What do namespaces have to do with this? Are you
suggesting that I use some sort of class globals? That's not practical,
if only because the code has to be thread-safe.

Thanks.
 
Vaca Louca

> #1. You may be right - but have you profiled the code and do you
> "know" that this will be an issue.

Admittedly not much. And from the little timing I did on my code, it's
pretty fast. But I can't avoid the awkward feeling of doing:

    char *cp(ISAPI->userName);
    string x(cp);
    ....
    some_db_func(x.c_str());

Instead of being able to do:

    some_db_func(ISAPI->userName);

I mean - of course I can do what I want, but I'd like to wrap this
"ISAPI->userName" with a C++ class which will watch string boundaries
for me and allow me to treat that array of chars as some sort of
std::vector.

> #2. Have you considered a "lazy string" that converts to std::string
> only when you have to. i.e. The constructor takes a const char * and
> when any operation is done that is more than simply using a char * it is
> converted to std::string.

No, not as such, but I was thinking of something maybe similar, along
the lines of extending std::vector.

I suppose such a class would have to derive from std::string and
implement all its interfaces?

Hasn't anyone done this already? I suppose I'm far from being the
first one who stumbles on this.

> What is it ?

http://boost.org.

A large library of templates, classes and algorithms, some of which are
being considered by the C++ standards committee for the next standard
library update. They have some templates which handle shared arrays
and safe arrays which I suspect might help me, but I haven't yet figured
out how.

Thanks.
 
Gianni Mariani

Vaca Louca wrote:
....
> http://boost.org.
>
> A large library of templates, classes and algorithms, some of which are
> being considered by the C++ standards committee for the next standard
> library update. They have some templates which handle shared arrays
> and safe arrays which I suspect might help me, but I haven't yet figured
> out how.

Do these things have a name?
 
Vaca Louca

What do you mean? The library is called "Boost".

The part of it that I suspect (very vaguely) might help
me is the shared pointers.
 
Ioannis Vranos

Vaca said:
> Why not? I know that the receiving methods will not change that
> string, only read it.

I am not sure what you mean by "receiving methods"; what I meant is that the
returned pointer points to storage managed by the std::string implementation, which may
change after the call (most likely after some operation that modifies the string).

Strictly speaking, we can't even be sure that the pointed-to area will continue to be
valid after any subsequent operation on the string (even a subsequent call to c_str(),
although this would probably be inefficient), since it is up to the implementation to
decide how to implement the std::string facilities.

So, strictly speaking, even when the string is not modified, and even when it is kept
alive by making it global or the like (the approaches that I mentioned previously), we
can't be sure that the pointer returned by c_str() will remain valid after the next
string operation, even a non-modifying one.
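
A small sketch of the usage pattern under discussion. As far as I can tell, the standard's wording is that the pointer returned by c_str() must not be treated as valid after a subsequent call to a non-const member function on the same string, so the conservative habit is to consume the pointer immediately and to fetch a fresh one after any modification. The function c_api_length below is a made-up stand-in for a C interface that only reads its argument during the call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API that only reads its argument during the call.
std::size_t c_api_length(const char* s) { return std::strlen(s); }

// Safe: the pointer is obtained and consumed within a single expression,
// with no string operation in between.
std::size_t safe_use(const std::string& s) {
    return c_api_length(s.c_str());
}

// The string is modified between two uses, so the old pointer is discarded
// and a fresh one is requested after the modification.
std::size_t refresh_after_modify(std::string& s) {
    const char* p = s.c_str();
    std::size_t before = c_api_length(p);
    s.append(64, 'x');  // non-const member call: the old 'p' must not be used again
    p = s.c_str();      // take a fresh pointer
    return c_api_length(p) - before;
}
```

The first function matches the case where the callee does not keep the pointer after it returns; the second shows the extra step needed once the string changes.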

> The point I'm trying to make is that I don't create these strings - I
> get them as "char *" from the various interfaces, and many times I pass
> them (or parts of them) to other interfaces which also expect "char *".
> Right now I use std::string in the middle but, for instance, if I were
> writing in C and not C++, I could have just passed a pointer along
> and avoided copying the content.

Well, if you have efficiency concerns (that is, you are experiencing efficiency problems
and a profiler indicates that this particular string creation and destruction is
the problem), then you should pass the char * itself around.

However, if you are not experiencing any efficiency problems caused by string creation and
destruction, then you may keep using std::string.
 
Vaca Louca

Hi,

Thanks for the warning. I'm aware of these issues. I pass the strings
to functions to use as input (e.g. to store to a database file or as a
key in a lookup) and they are not used after those functions return.
There are also no thread-safety issues here since the pointers are not
passed between threads.

Also, since I use std::strings I am careful to modify them through
"legal" std::string operations (e.g. use string.erase() to chop off the
end bit instead of sticking a '\0' in the middle of the c_str()).
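
To illustrate the difference between the two ways of truncating, here is a short sketch. The helper names are hypothetical. The second helper writes the '\0' through operator[] rather than through the c_str() pointer, since writing through that pointer is not allowed at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Truncate through the std::string interface: size() and c_str() stay in agreement.
std::string truncate_with_erase(std::string s, std::size_t keep) {
    if (keep < s.size()) s.erase(keep);
    return s;
}

// For contrast: overwrite one character with '\0' through operator[].
// The string still reports its old size; only C-style readers see it as shorter.
std::string truncate_with_nul(std::string s, std::size_t keep) {
    if (keep < s.size()) s[keep] = '\0';
    return s;
}
```

With the second approach the std::string and any C function reading its c_str() disagree about the length, which is the kind of mismatch erase() avoids.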

As for efficiency - right now I'll stick to std::string and, if/when
I get around to profiling things, see whether this is an issue.

I'd just prefer to avoid C-like "char *" manipulation code since
it's probably more fragile and might make the code a little less
maintainable in the long run.

Thanks.
 
Ioannis Vranos

Vaca said:
> Hi,
>
> Thanks for the warning. I'm aware of these issues. I pass the strings
> to functions to use as input (e.g. to store to a database file or as a
> key in a lookup) and they are not used after those functions return.
> There are also no thread-safety issues here since the pointers are not
> passed between threads.
>
> Also, since I use std::strings I am careful to modify them through
> "legal" std::string operations (e.g. use string.erase() to chop off the
> end bit instead of sticking a '\0' in the middle of the c_str()).
>
> As for efficiency - right now I'll stick to std::string and, if/when
> I get around to profiling things, see whether this is an issue.
>
> I'd just prefer to avoid C-like "char *" manipulation code since
> it's probably more fragile and might make the code a little less
> maintainable in the long run.
>
> Thanks.

Yes. Unfortunately, your problems arise from the fact that you are using C (or C-style)
APIs; had they been C++ APIs, they could have used std::string or some other class instead.
 
