Utilising "std::string" in function declaration

Kei

Hi!

I know this question has been asked numerous times, and I have read
previous posts, but I would like to ask anyway :/

From previous posts, I realise it's safe to return a string, like the
following class method:

std::string MyFileReader::getFileInfo(std::string path);

But when I look at other people's code, they seem to use different styles
when declaring a function like the one above. For example:

// --(1)-- pass by reference
int MyFileReader::getFileInfo(std::string path, std::string &theInfo);

// --(2)-- const class ref
int MyFileReader::getFileInfo(const std::string &path, std::string &theInfo);

// --(3)-- use char*
int MyFileReader::getFileInfo(const char* path, std::string &theInfo);

The C++ FAQ Lite seems to endorse using "const SomeClass &classRef" to
strictly prevent modification of the class (No. 2), but I don't see
why I should not use const char*. Is char* evil? I coded in C so
I'm fairly comfortable with char*.

I just want to know, what is the rationale of each use, and how do you
choose which to use?

Thanks in advance!

Kei
 
Michael DOUBEZ

Kei said:
I know this question has been asked numerous times, and I have read
previous posts, but I would like to ask anyway :/

From previous posts, I realise it's safe to return a string, like the
following class method:

std::string MyFileReader::getFileInfo(std::string path);

But when I look at other codes, people seem to have different styles
when declaring a function like above. For example

// --(1)-- pass by reference
int MyFileReader::getFileInfo(std::string path, std::string &theInfo);

You mean pass by value. Or is it a typo, and you intended:
int MyFileReader::getFileInfo(std::string& path, std::string &theInfo);
I guess you meant pass by value. Pass by non-const reference signals
output semantics.

Pass by value is fine when string uses copy-on-write (COW), which is no
longer the case in general because of threading issues.
// --(2)-- const class ref
int MyFileReader::getFileInfo(const std::string &path, std::string &theInfo);

A const reference is the usual way to pass big objects as input, and a
string is likely to be big (except when the small string optimization
applies).
// --(3)-- use char*
int MyFileReader::getFileInfo(const char* path, std::string &theInfo);

This is what the STL does for std::fstream. It is not that bad in the
general case. :)
It forces the caller to get the underlying char array from a string
(via c_str()); depending on the string implementation, this may incur
some cost.
The C++ FAQ Lite seems to endorse using "const SomeClass &classRef" to
strictly prevent modification of the class (No. 2), but I don't see
why I should not use const char*. Is char* evil? I coded in C so
I'm fairly comfortable with char*.

It is just that the standard doesn't guarantee that invoking c_str() is
free, so you may get surprises in performance; but if it doesn't
matter, it is acceptable.
I just want to know, what is the rationale of each use, and how do you
choose which to use?

The conservative way is to use const reference.
 
SG

I just want to know, what is the rationale of each use, and how do you
choose which to use?
std::string MyFileReader::getFileInfo(std::string path);

int MyFileReader::getFileInfo(std::string path, std::string &theInfo);

int MyFileReader::getFileInfo(const std::string &path, std::string &theInfo);

int MyFileReader::getFileInfo(const char* path, std::string &theInfo);

It depends on what you do inside getFileInfo and how you use this
function. If you don't need the 'int' return value, I'd suggest
using

std::string MyFileReader::getFileInfo(const std::string& path);

This is under the assumption that you don't need to copy/modify the
string object 'path'. If you need to add a file extension (or modify
the string in some other way) without changing the caller's argument
('path') you might want to use

std::string MyFileReader::getFileInfo(std::string path);

In situations where the function's argument is an rvalue, some
compilers are able to elide unnecessary copy operations; at least G++
is smart enough to do this.
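
As a rough illustration of the by-value case described above, here is a minimal sketch. The helper name withExtension and the ".txt" suffix are made up for the example and are not part of the original MyFileReader class. The function modifies its own copy of the path, so the caller's string is left alone:

```cpp
#include <string>

// Hypothetical helper (not from the original post): takes the path by
// value because it needs a private copy that it can modify.
std::string withExtension(std::string path) {
    path += ".txt";   // changes only the local copy
    return path;      // the local copy becomes the result
}
```

Calling withExtension(p) for some std::string p yields a new string with the suffix appended, while p itself keeps its old contents.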

For the result, you can return a string by value or take a non-const
reference to an existing string as an additional (output) parameter.
Most compilers (at least GNU's G++ and Microsoft's Visual C++ compiler)
perform "return value optimization" (RVO), which makes the two options
differ little when you want to create a new string object:

void foo1(string& out);
string foo2();

string s1;
foo1(s1);

string s2 = foo2(); // not more expensive with RVO

But there's no guarantee that compilers do RVO. I wouldn't worry about
it for std::string, though. If the target string object already
exists ...

s1 = foo2(); // s1 has been constructed before

this will invoke string::operator= which may copy the string including
a possible re-allocation of memory. (It depends on the
implementation). In this use case the getFileInfo version that takes a
reference to the target string could be slightly faster. Unless that
part of your program is really performance critical (it probably
isn't) I would still use return by value just because of the syntax
being easier.
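
To make the comparison concrete, here is a small sketch of the two styles side by side. The free functions readInfo and readInfoInto and the "info for " text are invented for illustration; they only stand in for whatever getFileInfo would really do:

```cpp
#include <string>

// Style 1 (hypothetical): the caller supplies the target string and the
// function overwrites it. An existing buffer in 'out' may be reused.
void readInfoInto(const std::string& path, std::string& out) {
    out = "info for " + path;
}

// Style 2 (hypothetical): return the result by value. Compilers commonly
// construct 'result' directly in the caller's object, so there is usually
// no extra copy when initializing a new string from the call.
std::string readInfo(const std::string& path) {
    std::string result = "info for " + path;
    return result;
}
```

Both produce the same string; the difference is mainly in call-site syntax and in whether an already constructed target is assigned to or a new one is initialized.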

HTH,
SG
 
Christopher

But when I look at other codes, people seem to have different styles
when declaring a function like above. For example

// --(1)-- pass by reference
int MyFileReader::getFileInfo(std::string path, std::string &theInfo);

This seems silly: is the method going to change the path?
Is there a reason to make a copy of the path?

It makes a non-const copy of the string argument path.
It passes the non-const string theInfo by reference so that it can be
changed inside the method, and those changes remain visible outside the
method.

// --(2)-- const class ref
int MyFileReader::getFileInfo(const std::string &path, std::string &theInfo);

This is better.
path is not changing and it is not copied.
theInfo is changing and it is not copied.

// --(3)-- use char*
int MyFileReader::getFileInfo(const char* path, std::string &theInfo);

This is someone mixing C style with C++.
They wanted to be compatible with a C-style caller. However, even if the
parameter were a std::string (or const std::string&) instead, the caller
could still pass a const char* and a string would be constructed from it.
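
A quick sketch of that point, using a made-up free function pathLength in place of the real method: a function declared with const std::string& accepts both a std::string and a plain C string, because a temporary std::string is built from the const char* at the call site.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function standing in for getFileInfo's 'path' parameter.
std::size_t pathLength(const std::string& path) {
    return path.size();
}

// Calls the same function with a C string and with a std::string.
std::size_t callWithBoth() {
    const char* cPath = "/tmp/a.txt";
    std::string cppPath = "/tmp/a.txt";
    // cPath is implicitly converted to a temporary std::string;
    // cppPath is bound directly, with no copy.
    return pathLength(cPath) + pathLength(cppPath);
}
```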


The C++ FAQ Lite seems to endorse using "const SomeClass &classRef" to
strictly prevent modification of the class..

I do too :)

(No. 2), but I don't see
why I should not use constant char*. Is char* evil? I coded in C so
I'm fairly comfortable with char*

It's OK if you are careful, but in C++ it is preferred to use a string.
With a const char* you have to be aware of the lifetime of the
characters it points to.
Also be wary when you are working with more than one module and one
goes out of scope.
 
James Kanze

You mean pass by value.

Or is he talking about the "return value"? It's not clear.

If he's talking about the return value, the two functions do
something different: the second provides for a return code, e.g.
to indicate that the search has failed. The usual alternative
here is to use Fallible, although the classic Fallible only
supports a boolean "return code" (failed or succeeded). (I
recently needed more, and have modified my version of Fallible
to support an extended status code.)
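
For readers who have not met the idiom: the following is only a minimal sketch of the general Fallible idea (a value paired with a "valid" flag), not the poster's actual class and without the extended status code he mentions. The function lookupInfo and its behaviour are invented for the example, and this simple version assumes T is default-constructible.

```cpp
#include <cassert>
#include <string>

// Minimal sketch of a Fallible-style wrapper: either holds a valid value
// or is marked invalid. Assumes T has a default constructor.
template <typename T>
class Fallible {
public:
    Fallible() : valid_(false), value_() {}                    // failure
    explicit Fallible(const T& v) : valid_(true), value_(v) {} // success
    bool isValid() const { return valid_; }
    const T& value() const {
        assert(valid_);   // reading an invalid result is a programming error
        return value_;
    }
private:
    bool valid_;
    T value_;
};

// Hypothetical lookup: fails for an empty path, succeeds otherwise.
Fallible<std::string> lookupInfo(const std::string& path) {
    if (path.empty()) {
        return Fallible<std::string>();
    }
    return Fallible<std::string>("info for " + path);
}
```

The caller checks isValid() before calling value(), which keeps the result and the success flag in one return value instead of splitting them between an int return code and an output parameter.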

At any rate, if you don't need the error code, you should return
the value, as in the first example. Until the profiler says you
can't---not very likely with std::string, or in general, for
that matter.
or is it a typo and you intended:
int MyFileReader::getFileInfo(std::string& path, std::string &theInfo);
I guess you meant pass by value. The pass by reference is an
output semantic.
This is fine when string use copy-on-write (COW) which is no
longer the case in the general case because of threading
issues.

Pass by value is fine until the profiler explicitly tells you it
isn't. Or until house specific coding guidelines say otherwise.
const reference is the general input semantic for big objects,
which is likely for string (except when using the small string
optimization).

It's the general input semantic, except when it isn't:). A
priori, you don't know the size of an object (or shouldn't have
to).

The most common coding guideline I've seen has been to pass
class types by const reference, everything else by value. Which
is simple, easy to understand, and of no help in templates. (In
the STL, iterators and predicates are passed by value; most
other things of unknown type by const reference.)
This is what the STL does for std::fstream. It is not that bad in the
general case. :)
It forces the caller to get the underlying char array from a string
(via c_str()); depending on the string implementation, this
may incur some cost.

Not likely, as the upcoming version of the standard will require
strings to be contiguous, as is already the case with vector.
It is just that the standard doesn't guarantee that invoking
c_str() is free, so you may get surprises in performance;
but if it doesn't matter, it is acceptable.

Yes and no. The semantics are not at all the same, and calling
c_str() results in a loss of information---in the worst case
(the string contained '\0' characters), the actual string passed
to the function is different, but even without that, it means
that the function must work in terms of char const*, with all
that implies---from experience, most of the functions I write
that take a char const* (for reasons of compatibility with
legacy software) start by converting it to std::string, so that
I have access to all of the information immediately.
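
The loss of information in the worst case can be seen with a string that contains an embedded '\0'. This is just a small sketch; the function names are made up for the example:

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// What a function taking char const* can see: it has to search for the
// first '\0' to find the end.
std::size_t lengthSeenByCApi(const char* s) {
    return std::strlen(s);
}

// What a function taking std::string sees: the stored length.
std::size_t lengthSeenByStringApi(const std::string& s) {
    return s.size();
}

// Builds a 5-character string whose third character is '\0'.
std::string makeStringWithEmbeddedNul() {
    return std::string("ab\0cd", 5);
}
```

Passing makeStringWithEmbeddedNul().c_str() to the first function reports only the part before the embedded '\0', while the second function still sees all five characters.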
The conservative way is to use const reference.

The conservative way is to use pass by value:). The "correct"
way is to conform to the local standards and practices.
 
Daniel T.

I just want to know, what is the rationale of each use, and how do you
choose which to use?

There are a lot of different ways to pass information into a function
and get information out. Here is my general rationale for each use;
specific situations may differ:

First about parameters passed in:

foo(T t);
This is the baseline. Barring any special situations, this is what
I use.

foo(const T& t);
This is a special optimization that I use when I don't plan on
modifying the parameter passed in (which is most of the time) and I
have reason to believe that object creation is expensive.

foo(const T* t);
I use this when 't' is an array of objects.

foo(const char*) vs foo(string):
I use the first one when I am crossing library boundaries.
std::string is a template and not all compilers compile templates the
same way, so using const char* is safer.


For getting information out of a function:

const T foo();
This is the standard case. Note the 'const' is not required for
intrinsic types (bool, int, double and such) because the compiler
disallows their modification.

const T& foo();
This is an optimization for when I know that the object returned
will outlive the function call, and I have reason to believe that
object creation is expensive.

foo(T& t);
I generally only use this construct for non-member-functions if
someone is forcing me to.
 
Juha Nieminen

Christopher said:
(No. 2), but I don't see

Its OK if you are careful, but it is preferred to use a string in C++
with a const char * you will have to be aware of the lifetime of
characters it is pointing to.

Making a function take a const char* rather than a std::string may be
a question of efficiency.

If the function takes a std::string, and you give it a C string, then
a std::string object will be constructed from that C string. This
involves the std::string constructor calculating the length of the C
string (in linear time), allocating memory for it with 'new', then
copying the contents (again in linear time). After the string has been
used, a 'delete' will be executed.

However, if all you wanted was to pass it a C string constant, and
internally the function doesn't need the extra functionality of
std::string, all those extra steps are basically useless overhead which
benefits no-one. If the function in question gets called a lot, it may
even become inefficient.

If the function takes a const char*, however, and depending on what
the function does to that string, there might not be any copying, any
allocations or deallocations, or any other such extra overhead.

Of course now if what you have is a std::string and you want to pass
it to that function, you will have to do it with the aid of c_str(),
which is a minor annoyance.

One solution is to overload the function in question: Make one version
which takes a const char*, and another which takes a const std::string&,
and make them act as efficiently as possible depending on the type of
the parameter. Now you have the best of both worlds. (The simplest
implementation would be for the latter function to simply call the
former function using c_str() on the std::string.)
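
A sketch of that overload pair, in its simplest form. The function countSlashes is an invented example, chosen only because it can do its work on a plain C string without needing any std::string functionality:

```cpp
#include <cstddef>
#include <string>

// Version for C strings: walks the characters directly, with no
// allocation and no copy.
std::size_t countSlashes(const char* path) {
    std::size_t n = 0;
    for (; *path != '\0'; ++path) {
        if (*path == '/') {
            ++n;
        }
    }
    return n;
}

// Version for std::string: simply forwards to the C string version.
std::size_t countSlashes(const std::string& path) {
    return countSlashes(path.c_str());
}
```

A string literal argument selects the const char* overload, so no temporary std::string is created for it, and callers holding a std::string do not have to write c_str() themselves. Note that the forwarding version inherits the C string behaviour of stopping at the first '\0'.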
 
Christopher

  Making a function take a const char* rather than a std::string may be
a question of efficiency.

Passing a const char* and passing a const std::string& are going to be
the same efficiency-wise.
If we don't have a string in the first place, then yes, you are
correct.
However, when I am dealing with text, I am going to do my best to have
a string in the first place. It all depends on where the text
originated.

As you mentioned, being versatile and having two versions of a method,
one taking a string and one taking a const char*, is a common and good
solution that makes everyone happy.

A better question would be when to use a string and when to use a
const char*, looking at where the text originated. I've had many
an argument with old C programmers over things they claim to do for
the sake of "efficiency", only to add up the months of debugging time
I've had to spend because of their choices.

Example:

A const char* pointing to hardcoded text in another DLL that has been
unloaded.


If we were all super programmers and never created a bug, then fine,
nitpick about efficiency. Usually, it is the algorithms that need
optimizing, not a programmer's choice of string vs char*. We also need
to look at the context. Are we writing network code that is working on
1000s of lines of text a second, or are we working with an application
that is waiting for input from the user? Are we writing something that
is running at "init time", or are we trying to squeeze every calculation
we can into a tiny chunk of time?
 
Kei

I see :)

So the reason for passing by reference (or const reference, if not
modifying the variable) is to avoid creating another copy, which would
reduce efficiency.

Returning a std::string is OK, and lots of compilers optimise the
returned string so it doesn't get copied too many times...

char* is used when you just need the char array, and when passing
strings across libraries.

By the way, when using "const char*", is there a concern about thread
safety? I assume it would be a bad idea for thread A to read the
char array from a std::string while thread B writes into it.

Thanks!

Kei
 
Juha Nieminen

Christopher said:
However, when I am dealing with text, I am going to do my best to have
a string in the first place.

It's not always possible. For instance, you can't write a std::string
literal, while you can write a const char* literal.
 
Bo Persson

Juha said:
It's not always possible. For instance, you can't write a
std::string literal, while you can write a const char* literal.

Which is then implicitly converted to a std::string.

If this is the bottleneck of your application, you have some really odd
things going on.



Bo Persson
 
