C to C++ Syntax

Brian C

Hello all,
I have a question regarding syntax, coming from a C background. In C,
if I had a function like this:

char *ParseDelimitedString(char *String, int DelimitedNumber,
                           char *Retval, int MaxBytes)
{
    ....
    return(Retval);
}

This would be the way to return a string, instead of trying to return a
local variable. Of course, you can malloc() within the function and
return that, but I don't like making the caller know he has to free() it.
But I digress.


In C++, if I had this function as a static function, in some StringUtil
class:

std::string &StringUtil::parseDelimitedString(string String, int Num,
                                              string &Retval)
{
    ....
    return(Retval);
}

or

std::string StringUtil::parseDelimitedString(string String, int Num)
{
    string LocalVar;

    return(LocalVar);
}

From testing, I did see that returning LocalVar created a 3rd string
(it did not call the constructor) and copied the data into that string
object. So I assume this works, but my question is about the
performance cost of creating that extra object.

1) Which is correct? 1, 2, both?

2) Which would be faster? I assume #1 because it is just returning a
reference.

Thanks all.
 
Victor Bazarov

Brian said:
I have a question regarding syntax, coming from a C background. In C,
if I had a function like this:

char *ParseDelimitedString(char *String,int DelimitedNumber,char
*Retval,int MaxBytes)
{
...
return(Retval);

Why parentheses?
}

This would be the way to return a string, instead of trying to return
a local variable. Of course, you can malloc() within the function and
return that, but I don't like making the caller know he has to free()
it. But I digress.


In C++, if I had this function as a static function, in some
StringUtil class:

std::string &StringUtil::parseDelimitedString(string String,int
Num,string &Retval)
{
...
return(Retval);

That's OK. The user has to be notified that the third argument is
going to be used to return the string.

Passing a string to the function is better by a reference to const.
}

or

std::string StringUtil::parseDelimitedString(string String,int Num)
{
string LocalVar;

return(LocalVar);

That's generally better. (plus the same sentiment about ref to const)
}

From testing, I did see that returning the LocalVar created a 3rd
string (did not call the constructor), and did copy the data into the
string class. So, I assume this works, but about creating a new class,
for performance reasons is my question.

1) Which is correct? 1, 2, both?

Both are "correct".
2) Which would be faster? I assume #1 because it is just returning a
reference.

Could be. Impossible to tell without the context and without actually
measuring.

What books are you reading? Try "Accelerated C++". Things like passing
by reference instead of by value should become second nature before you
can start talking performance.
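
To illustrate the reference-to-const advice, here is a minimal sketch. The thread never shows the actual parsing code, so the function body, the 0-based field index, and the `delim` parameter are all assumptions made for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical free-function version of parseDelimitedString.
// 'input' is passed by reference to const: no copy is made on the call and
// the function cannot modify the caller's string.
// Returns the num-th field (0-based, an assumption), or "" if it is missing.
std::string parseDelimitedString(const std::string& input, int num,
                                 char delim = ',')
{
    std::size_t start = 0;
    for (int i = 0; i < num; ++i) {
        const std::size_t pos = input.find(delim, start);
        if (pos == std::string::npos)
            return std::string();      // fewer than num + 1 fields
        start = pos + 1;               // skip past the delimiter
    }
    const std::size_t end = input.find(delim, start);
    if (end == std::string::npos)
        return input.substr(start);    // last field runs to the end
    return input.substr(start, end - start);
}
```

A call such as `parseDelimitedString(line, 1)` reads `line` in place; with a by-value parameter the whole string would be copied on every call.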

V
 
Brian C

Victor said:
Why parentheses?
Why not? I guess it's a habit of mine, I always do it.

What books are you reading? Try "Accelerated C++". Things like passing
by reference instead of by value should become second nature before you
can start talking performance.

V

I typically find books bad, except for reference because they don't
really show real-world examples. I've got a few C++ books. I've got one
from Deitel/Deitel (forget the name, it is at work), Practical C++ &
Teach Yourself C++ in 10 minutes (it lies, it isn't 10 minutes, good
reference book).

I will take a look at Accelerated C++, and also am going to look into
Effective C++. My personal experience with programming languages is that
you can read all you want, but until you actually code a real project in
them, you don't really learn. Guess it's me.
 
Jim Langston

Brian C said:
Hello all,
I have a question regarding syntax, coming from a C background. In C, if I
had a function like this:

char *ParseDelimitedString(char *String,int DelimitedNumber,char
*Retval,int MaxBytes)
{
...
return(Retval);
}

This would be the way to return a string, instead of trying to return a
local variable. Of course, you can malloc() within the function and return
that, but I don't like making the caller know he has to free() it. But I
digress.

Of course you are just returning the same char pointer they passed in. The
user could just ignore the return value and use their passed in char
pointer, which is good.
In C++, if I had this function as a static function, in some StringUtil
class:

std::string &StringUtil::parseDelimitedString(string String,int Num,string
&Retval)
{
...
return(Retval);
}

You are basically doing the exact same thing as with the C pointers except
using them as a reference. This would be my choice.
or

std::string StringUtil::parseDelimitedString(string String,int Num)
{
string LocalVar;

return(LocalVar);
}

From testing, I did see that returning the LocalVar created a 3rd string
(did not call the constructor), and did copy the data into the string
class. So, I assume this works, but about creating a new class, for
performance reasons is my question.

This is making a local string and returning it on the stack. Yes, it has to
make a copy for this, and the calling function has to copy it off the stack
to use it. For extremely large strings, it may not even fit on the stack.
Although I do use this method myself for trivial strings (< 50 chars).
1) Which is correct? 1, 2, both?

Both are legal.
2) Which would be faster? I assume #1 because it is just returning a
reference.

Yes. #1 because it doesn't have to copy the contents of the string onto the
stack, nor copy it off again. But the user of the function has to know that
their passed in string will be modified, which they probably would anyway.
Thanks all.

You're welcome.
 
Brian C

Jim,
Thank you for your prompt and informative reply. I assumed both methods
would work, I was just wondering if there was a "preferred" C++ style to
doing so. I assume then it is just a matter of preference, and experience.
I usually look at performance as I have been coding since I had my
Commodore 64 (32KB of free RAM), and work on real-time trading systems
now, so I always bring that experience to my coding, even if not needed.
 
Victor Bazarov

Brian said:
[..]
I will take a look at Accelerated C++, and also am going to look into
Effective C++. My personal experience with programming languages is
that you can read all you want, but until you actually code a real
project in them, you don't really learn. Guess it's me.

No, it's not just you. It's generally true. Languages in general
(and not just programming ones) require lots of practice to get good at.

V
 
Brian C

Victor said:
Brian said:
[..]
I will take a look at Accelerated C++, and also am going to look into
Effective C++. My personal experience with programming languages is
that you can read all you want, but until you actually code a real
project in them, you don't really learn. Guess it's me.

No, it's not just you. It's generally true. Languages in general
(and not just programming ones) require lots of practice to get good at.

V
I totally agree... and it really applies to things other than languages.
I learned that after I gutted my bathroom =)
 
benben

[snip]
This is making a local string and returning it on the stack. Yes, it has to
make a copy for this, and the calling function has to copy it off the stack
to use it. For extremely large strings, it may not even fit on the stack.
Although I do use this method myself for trivial strings (< 50 chars).

The amount of stack memory needed for a std::string is always
sizeof(std::string): C++ requires every object type to have a fixed
size known at compile time. The content of the string is stored
somewhere else, typically in dynamically allocated memory.
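
A quick way to check the sizeof point is the sketch below. The function name `sameObjectSize` is hypothetical; the only thing it relies on is that `sizeof` is evaluated on the type, not on the current contents.

```cpp
#include <string>

// Returns true when a 2-character string and a 100000-character string
// occupy the same number of bytes as objects. Only the character data they
// manage differs in size, and that data does not count towards sizeof.
bool sameObjectSize()
{
    std::string shortStr("hi");
    std::string longStr(100000, 'x');   // 100000 copies of 'x'
    return sizeof(shortStr) == sizeof(longStr)
        && sizeof(longStr) == sizeof(std::string);
}
```

The actual value of `sizeof(std::string)` varies between library implementations, so the sketch only compares sizes and does not print a number.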

A lot of std::string implementations use a reference-counted internal
structure, so character-by-character copying is only done when you have
to change the string content. This makes copying a string a cheap
operation. And because the local variable gets destroyed before the
return value can be accessed, deep copying can be totally eliminated.
Both are legal.

I would go for the second case because it is more readable. In the first
case the user can use either the return value or the last parameter to
access the result, and that is redundant (what if they are different?)
It is always good to make the interface as simple as possible.
Yes. #1 because it doesn't have to copy the contents of the string onto the
stack, nor copy it off again. But the user of the function has to know that
their passed in string will be modified, which they probably would anyway.

There are two optimizations in C++ that can prevent a deep copy in #2.
The first was described above. The second is called Named Return Value
Optimization (NRVO). Either of these optimizations may kick in.
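
One way to observe NRVO is to return a type that counts its own copies and moves. This is only a sketch: `Tracer` and `makeTracer` are hypothetical names, and it assumes a C++11 or later compiler, which postdates this thread.

```cpp
#include <string>
#include <utility>

// Hypothetical tracer type: counts how often it is copied or moved.
struct Tracer {
    static int copies;
    static int moves;
    std::string data;

    explicit Tracer(const std::string& s) : data(s) {}
    Tracer(const Tracer& other) : data(other.data) { ++copies; }
    Tracer(Tracer&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Returns a named local by value, the same pattern as case #2.
Tracer makeTracer()
{
    Tracer local("some parsed field");
    return local;   // NRVO may build 'local' directly in the caller's object
}
```

After `Tracer t = makeTracer();`, `Tracer::copies` stays at zero either way: if the compiler applies NRVO, `Tracer::moves` is zero as well, and if it does not, the local is moved rather than deep-copied. Whether NRVO is applied depends on the compiler and its settings.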
You're welcome.

Regards,
Ben
 
Ivan Vecerina

: [....] I assumed both methods would work, I was just wondering if
: there was a "preferred" C++ style to doing so. I assume then it is
: just a matter of preference, and experience.

If clean style is what you are looking for, you definitely should be
using the direct return syntax:
std::string ParseDelimitedString(std::string const& String, int Num);

Passing an extra parameter only to be used as storage has a name:
premature optimization

In the standard C++ library (the part that was added on top of C's lib),
the systematic approach is to always *return* output-only parameters
(even functions that output two values will return an std::pair).
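
As a sketch of the return-a-pair style, here is a hypothetical helper, `splitKeyValue`, that produces two outputs without any output parameters. It is not from the standard library; `std::map::insert`, which returns a `std::pair` of an iterator and a `bool`, is a real example of the same convention.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "key=value" at the first '='.
// Both results are returned together; if there is no '=', the whole text
// becomes the key and the value is empty.
std::pair<std::string, std::string> splitKeyValue(const std::string& text)
{
    const std::string::size_type eq = text.find('=');
    if (eq == std::string::npos)
        return std::make_pair(text, std::string());
    return std::make_pair(text.substr(0, eq), text.substr(eq + 1));
}
```

The caller reads the results as `.first` and `.second`, and the call can sit inside a larger expression, e.g. `splitKeyValue(arg).second.empty()`.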

Why is this better?
Returning the value allows the function to be used as part of a complex
expression. Passing an extra "storage object" parameter is just clutter.
Also, one thing you'll learn about C++ is that "const" is a good thing.
Experienced developers will write:
std::string const& paramA = ParseDelimitedString(commandLine,2);
std::string const& paramB = ParseDelimitedString(commandLine,3);
[NB: the use of a & is optional, although it used to help some compilers
to optimize-out an object copy. In this context, binding the returned
temporary to a const reference extends the temporary's lifetime to that
of the reference].
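
The lifetime extension can be sketched as follows; `makeGreeting` and `greetingLength` are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <string>

// Returns a temporary std::string by value.
std::string makeGreeting(const std::string& name)
{
    return "Hello, " + name;
}

std::size_t greetingLength()
{
    // The temporary returned by makeGreeting() is bound to a const
    // reference, so it stays alive until 'greeting' goes out of scope.
    const std::string& greeting = makeGreeting("Brian");
    return greeting.size();   // safe: the temporary still exists here
}
```

This only applies to a temporary bound directly to a local reference. A function that returns a reference to one of its own local variables still leaves the caller with a dangling reference.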

Now for the sad part:
There is an overhead in returning objects. Even though the compiler
is allowed to optimize-out object copies (search for RVO and
NRVO = [named] return value optimization), often the resulting code
will suffer unnecessary object duplications -- although most recent
compilers have become pretty good if all optimizations are enabled.

[[ Regarding benben's comment about strings being reference-counted:
Unfortunately, the most commonly used implementations of std::string
have given up on using reference-counting. They use other tricks
(like embedded storage of short strings, without mem allocation),
but reference counting is "out". ]]

In many cases you should not care about this overhead, but sometimes
it matters (e.g. a function that returns a big collection of
generated data -- I then often use an output-only by-ref param).
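
A minimal sketch of the output-only by-reference style for a large collection is shown here. `generateSquares` is a hypothetical example, not code from this thread.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical generator: overwrites 'out' with the first 'count' squares.
// 'out' is an output-only parameter; its previous contents are discarded,
// but its already allocated storage can be reused across calls.
void generateSquares(std::size_t count, std::vector<long>& out)
{
    out.clear();          // drops the elements, keeps the capacity
    out.reserve(count);   // allocates only if the buffer is too small
    for (std::size_t i = 0; i < count; ++i)
        out.push_back(static_cast<long>(i * i));
}
```

The benefit shows up when the same vector is passed in repeatedly, for instance inside a loop, because the buffer is allocated once and then refilled. The cost is the one already discussed: the call can no longer be used inside an expression.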

This is recognized as a problem by the C++ standard committee.
In the next C++ standard, this is expected to be addressed by
a new "R-value reference" proposal allowing "moving constructors".

See: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1952.html
[If you are interested in the ongoing evolution of the C++ spec,
see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/ ]
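
The rvalue-reference proposal did become part of C++11. Assuming a C++11 or later compiler (which postdates this thread), the effect of a move constructor can be sketched like this; `buildLines` and `moveDemo` are hypothetical names.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Builds a large collection and returns it by value. In C++11 and later the
// local is either elided or moved out, never deep-copied.
std::vector<std::string> buildLines()
{
    std::vector<std::string> lines(1000, "some generated text");
    return lines;
}

// Moves one vector into another and checks that the element storage was
// handed over rather than duplicated. Returns the element count on success
// and 0 otherwise.
std::size_t moveDemo()
{
    std::vector<std::string> source = buildLines();
    const std::string* firstBefore = &source[0];

    std::vector<std::string> dest(std::move(source));   // move constructor

    return (&dest[0] == firstBefore) ? dest.size() : 0;
}
```

After the move, `source` is left in a valid but unspecified state, so the sketch does not read from it again.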

: I usually look at performance as I have been coding since I had my
: Commodore 64 (32KB of free RAM), and work on real-time trading systems
: now, so I always bring that experience to my coding, even if not
: needed.
I also keep fond memories of assembly coding on the C64, and work
on embedded, image-processing, and real-time systems.
Yet I try to keep my code lean and clean when performance does not
matter, therefore my advice.

This said:
For parsing a delimited string, a true C++ style would be to either
provide an iterator-like interface, or a single function that
will parse the whole command line and return a vector<string>.
This will also improve performance because you won't re-parse
the whole containing string when accessing the 999th parameter.
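
The single-function variant could look like the sketch below. The name `splitDelimited`, the single-character delimiter, and the choice to keep empty fields are assumptions for the example.

```cpp
#include <string>
#include <vector>

// Hypothetical one-pass splitter: parses the whole line once and returns
// every field. Empty fields are kept, so "a,,b" yields three entries.
std::vector<std::string> splitDelimited(const std::string& line, char delim)
{
    std::vector<std::string> fields;
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type end = line.find(delim, start);
        if (end == std::string::npos) {
            fields.push_back(line.substr(start));   // last field
            break;
        }
        fields.push_back(line.substr(start, end - start));
        start = end + 1;                            // skip the delimiter
    }
    return fields;
}
```

Once the vector exists, field N is just `fields[N]`, so reaching a late field no longer rescans the string from the beginning on every access.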


I hope this helps,
Ivan
 
