Best practice: return by value versus const ref

Discussion in 'C++' started by Mikewhy, Oct 5, 2012.

  1. Mikewhy

    Mikewhy Guest

    Apologies. I know this has to have been discussed, but I'm unable to google
    my way past the shallower, newbie discussions of temporary's lifetime, etc.

    A class Lookup contains a collection of strings referenced by some key, the
    values representing, perhaps, a database lookup.. The key is representative
    of its field name. Here, we're interested only in the accessor methods,
    get_val() and get_value(). The first returns a copy of the string by value.
    The other returns a const reference to the value.

    class Lookup
    {
    public:
    enum FieldId {
    ...
    };

    std::string get_val(FieldId) const;

    const std::string & get_value(FieldId) const;
    ...
    };

    The values already exist as std::string in the Lookup object. Returning by
    const reference then is direct and has the advantage of efficiency.
    Returning by value, even with return value optimization, incurs at minimum a
    string copy.

    The question is, assuming that multithread issues are not present, how
    easily can naive code cause trouble for itself with a const reference return
    interface? What usage patterns (that have a chance of surviving the low
    threshold of a peer review) might prove problematic?
     
    Mikewhy, Oct 5, 2012
    #1
    1. Advertising

  2. Mikewhy

    Rui Maciel Guest

    Mikewhy wrote:

    > The question is, assuming that multithread issues are not present, how
    > easily can naive code cause trouble for itself with a const reference
    > return interface? What usage patterns (that have a chance of surviving the
    > low threshold of a peer review) might prove problematic?


    Returning references may cause some significant problems in some
    circumstances, if you can't guarantee that the object will only be accessed
    during its lifetime. If you can't guarantee that then it's quite possible
    that you might end up trying to access an object which doesn't really exist.
    For example, the following code will compile just fine with g++:

    <code>

    #include <iostream>

    struct Foo
    {
    int some_object;

    Foo(int some_value = 0);
    ~Foo();

    int const &getter() const {return some_object; }
    };

    Foo::Foo(int some_value)
    : some_object(some_value)
    {
    std::cout << "Object constructed" << std::endl;
    }

    Foo::~Foo()
    {
    std::cout << "Object destructed. He's dead, Jim." << std::endl;
    }

    int main(void)
    {
    const int *bar = NULL;

    {
    Foo foo;
    bar = &foo.getter();
    std::cout << "accessing object at " << bar << ". value is: "
    << *bar << std::endl;
    }

    std::cout << "accessing object at " << bar << ". value is: " << *bar
    << std::endl;
    return 0;
    }

    </code>


    Rui Maciel
     
    Rui Maciel, Oct 5, 2012
    #2
    1. Advertising

  3. Mikewhy

    Rui Maciel Guest

    Rui Maciel wrote:

    > Returning references may cause some significant problems in some
    > circumstances, if you can't guarantee that the object will only be
    > accessed during its lifetime. If you can't guarantee that then it's quite
    > possible that you might end up trying to access an object which doesn't
    > really exist.


    Here is another example which better demonstrates what I was referring to:

    <code>
    #include <iostream>

    struct Foo
    {
    int some_object;

    Foo(int some_value = 0);
    ~Foo();

    int const &getter() const {return some_object; }
    };

    Foo::Foo(int some_value)
    : some_object(some_value)
    {
    std::cout << "Object constructed" << std::endl;
    }

    Foo::~Foo()
    {
    std::cout << "Object destructed. He's dead, Jim." << std::endl;
    }


    struct Bar
    {
    float some_other_object;

    Bar(float some_value = 5.0f);
    ~Bar();

    float const &getter() const {return some_other_object; }
    };

    Bar::Bar(float some_value)
    : some_other_object(some_value)
    {
    std::cout << "Bar object constructed" << std::endl;
    }

    Bar::~Bar()
    {
    std::cout << "Bar object destructed" << std::endl;
    }


    int main(void)
    {
    char buffer[10] = {0};

    const int *value = NULL;
    const float *other_value = NULL;

    // object is created in a specific address
    Foo *foo = new(buffer) Foo(1);
    value = &foo->getter();
    std::cout << "accessing foo's object at " << value << ". value is: "
    << *value << std::endl;

    // foo's lifetime ends
    foo->~Foo();

    // a new object has been allocated on top of the old one
    Bar *bar = new(buffer) Bar;
    other_value = &bar->getter();
    std::cout << "accessing foo's object at " << value << ". value is: "
    << *value << std::endl;
    std::cout << "accessing bar's object at " << other_value << ". value
    is: " << *other_value << std::endl;
    bar->~Bar();
    return 0;
    }
    </code>


    Rui Maciel
     
    Rui Maciel, Oct 5, 2012
    #3
  4. Hi,

    On 05.10.2012 14:32, Mikewhy wrote:
    > The values already exist as std::string in the Lookup object. Returning
    > by const reference then is direct and has the advantage of efficiency.
    > Returning by value, even with return value optimization, incurs at
    > minimum a string copy.


    taking into account that many (almost all?) std::string implementations
    provide a copy constructor with O(1) together with copy on write, you
    should not care too much about performance here. You will not copy the
    string either. You will copy a reference to the string.


    > The question is, assuming that multithread issues are not present, how
    > easily can naive code cause trouble for itself with a const reference
    > return interface? What usage patterns (that have a chance of surviving
    > the low threshold of a peer review) might prove problematic?


    If you use a reference you should not change the value of the string as
    long as you access the reference. Maybe you also must not make any
    changes to the dictionary, because changing the dictionary content may
    cause reallocations of nodes and destroy the original string.


    Marcel
     
    Marcel Müller, Oct 5, 2012
    #4
  5. Mikewhy

    Nobody Guest

    On Fri, 05 Oct 2012 17:33:44 +0200, Marcel Müller wrote:

    > taking into account that many (almost all?) std::string implementations
    > provide a copy constructor with O(1) together with copy on write,


    Not any more. That used to be common until multi-threading started
    being widely used, at which point the overhead of locking every access
    outweighed the advantages of CoW. It might still be used on platforms
    which have separate single-threaded and multiple-threaded versions of the
    standard library. But in general, you should assume that string copies are
    O(n).
     
    Nobody, Oct 6, 2012
    #5
  6. Mikewhy

    Mikewhy Guest

    "Rui Maciel" wrote in message news:k4msvs$4g8$...

    Mikewhy wrote:

    > The question is, assuming that multithread issues are not present, how
    > easily can naive code cause trouble for itself with a const reference
    > return interface? What usage patterns (that have a chance of surviving the
    > low threshold of a peer review) might prove problematic?



    int main(void)
    {
    const int *bar = NULL;

    {
    Foo foo;
    bar = &foo.getter();
    std::cout << "accessing object at " << bar << ". value is: "
    << *bar << std::endl;
    }

    std::cout << "accessing object at " << bar << ". value is: " << *bar
    << std::endl;
    return 0;
    }

    </code>
    ======================

    That's basically the gist of it. One expects even the most cursory review to
    catch something so blatant, even a few levels deep in function calls or
    object ownership.

    As a short sidebar, before moving on, in your example above, return by value
    would also be equally noteworthy with some older compilers. Just how bad
    depends on which compiler (some being already quite old and will continue to
    be in widespread use for some long time to come). To wit:

    const std::string & aval = alookup.get_val(); // return by value,

    This is (purportedly?) reasonably safe in C++11, but at least one compiler I
    know personally will allow it to fail at runtime with only an ignorable
    warning at compile time.

    It isn't the dunderheaded error exemplified by both of the above that I'm
    after. I'm after reasonable looking usage that despite best efforts and
    intentions of experienced and competent engineers still can fail. When we
    talk of safety in this context, we are talking specifically about relative
    lifetimes of the reference and referencee. Rather than a revelation, this is
    already I expect an ingrained and abiding paranoia in everyone still reading
    this. It is, after all, the basis and reason that shared_ptr and friends
    exist.

    I think I'll conclude here, maybe prematurely, that it is NOT return by
    const ref that is unsound or unsafe. It is careless or inappropriate usage
    or storage that can be problematic. Return by const ref rather than value
    does not of itself make usage errors any more or less likely. If you don't
    know or can't control the lifetime of the referenced, either through
    ownership, explicit guarantee, or reference counts, storing a copy is a
    reasonable alternative. I believe this to apply both in general and in this
    particular discussion.

    Your further thoughts, please....
     
    Mikewhy, Oct 6, 2012
    #6
  7. On 06.10.2012 02:40, Nobody wrote:
    > On Fri, 05 Oct 2012 17:33:44 +0200, Marcel Müller wrote:
    >
    >> taking into account that many (almost all?) std::string implementations
    >> provide a copy constructor with O(1) together with copy on write,

    >
    > Not any more. That used to be common until multi-threading started
    > being widely used, at which point the overhead of locking every access
    > outweighed the advantages of CoW. It might still be used on platforms
    > which have separate single-threaded and multiple-threaded versions of the
    > standard library. But in general, you should assume that string copies are
    > O(n).


    Hmm, that's a pity.
    It confirms my behavior not to use std::string and to prefer other
    immutable string implementations that can be copied by reference.
    Unfortunately something like that is not part of he standard.

    I do not care much about the memcpy for a usually small string content,
    but over and over using allocators might be an impact and is subject to
    some kind of synchronization too - at least at the delete operation.

    In fact, CoW only needs a few synchronization points. Firstly at the
    copy constructor, secondly when a non-unique reference is detached from
    a string instance. Both can be implemented lock free with atomic
    increment/decrement. And if you use an inner and an outer reference
    count, most of the atomic increments at copy construction can also be
    avoided at the cost of some more logic when detaching from references
    with higher outer count.


    Marcel
     
    Marcel Müller, Oct 6, 2012
    #7
  8. On 06.10.2012 05:51, Mikewhy wrote:
    > int main(void)
    > {
    > const int *bar = NULL;
    >
    > {
    > Foo foo;
    > bar = &foo.getter();
    > std::cout << "accessing object at " << bar << ". value
    > is: "
    > << *bar << std::endl;
    > }
    >
    > std::cout << "accessing object at " << bar << ". value is: " <<
    > *bar
    > << std::endl;
    > return 0;
    > }
    >
    > </code>
    > ======================


    AFAIK this is broken code. bar takes a dangling reference. It doesn't
    care whether getter returns by reference or by value.


    > As a short sidebar, before moving on, in your example above, return by
    > value would also be equally noteworthy with some older compilers. Just
    > how bad depends on which compiler (some being already quite old and will
    > continue to be in widespread use for some long time to come). To wit:
    >
    > const std::string & aval = alookup.get_val(); // return by value,
    >
    > This is (purportedly?) reasonably safe in C++11, but at least one
    > compiler I know personally will allow it to fail at runtime with only an
    > ignorable warning at compile time.


    Extending the lifetime of temporaries with const references has been
    included far before C++11. I think it was in the early 90s, but I maybe
    wrong. So unless we are talking about a specific broken compiler, this
    code is fine.


    > I think I'll conclude here, maybe prematurely, that it is NOT return by
    > const ref that is unsound or unsafe. It is careless or inappropriate
    > usage or storage that can be problematic. Return by const ref rather
    > than value does not of itself make usage errors any more or less likely.
    > If you don't know or can't control the lifetime of the referenced,
    > either through ownership, explicit guarantee, or reference counts,
    > storing a copy is a reasonable alternative. I believe this to apply both
    > in general and in this particular discussion.


    Ack.


    Marcel
     
    Marcel Müller, Oct 6, 2012
    #8
  9. Mikewhy

    Mikewhy Guest

    "Marcel Müller" wrote in message
    news:506fd611$0$6554$-online.net...

    On 06.10.2012 05:51, Mikewhy wrote:
    > As a short sidebar, before moving on, in your example above, return by
    > value would also be equally noteworthy with some older compilers. Just
    > how bad depends on which compiler (some being already quite old and will
    > continue to be in widespread use for some long time to come). To wit:
    >
    > const std::string & aval = alookup.get_val(); // return by value,
    >
    > This is (purportedly?) reasonably safe in C++11, but at least one
    > compiler I know personally will allow it to fail at runtime with only an
    > ignorable warning at compile time.


    Extending the lifetime of temporaries with const references has been
    included far before C++11. I think it was in the early 90s, but I maybe
    wrong. So unless we are talking about a specific broken compiler, this
    code is fine.

    ================
    Yes, and I managed to learn something despite myself in these past few
    hours. I attributed it to C++11 because I first came across it in that
    context. Googling around finds that it was in as early as C++03. Lifetime of
    the temporary is extended to that of the const ref that binds to it.

    Which raises a second question, about just how deep that scope can be. For
    the following (example of hazardous code in contest of OP discussion),
    VS2008 issued a compiler warning, and subsequently dumped garbage at run
    time, while g++ was quietly completely boring. (Which was the basis for my
    earlier comment.)

    class Bar
    {
    private:
    const std::string & aval; // ctor from temporary value.
    const std::string & aref; // ctor from reference return.

    public:
    Bar(const Lookup & a)
    : aval(a.get_val()), aref(a.get_value())
    {}
     
    Mikewhy, Oct 6, 2012
    #9
  10. Mikewhy

    Rui Maciel Guest

    Marcel Müller wrote:

    > AFAIK this is broken code. bar takes a dangling reference. It doesn't
    > care whether getter returns by reference or by value.


    That's not true. The bar pointer takes a perfectly valid reference to a
    perfectly valid object. A problem only arises if foo's lifetime somehow
    ends. That was the whole point of the example.

    Conversely, the same example would be broken if struct Foo's getter returned
    by value, as that assignment requires an lvalue. This is caught by any
    compiler.


    Rui Maciel
     
    Rui Maciel, Oct 6, 2012
    #10
  11. Mikewhy

    Pavel Guest

    Nobody wrote:
    > On Fri, 05 Oct 2012 17:33:44 +0200, Marcel Müller wrote:
    >
    >> taking into account that many (almost all?) std::string implementations
    >> provide a copy constructor with O(1) together with copy on write,

    >
    > Not any more. That used to be common until multi-threading started
    > being widely used, at which point the overhead of locking every access
    > outweighed the advantages of CoW. It might still be used on platforms
    > which have separate single-threaded and multiple-threaded versions of the
    > standard library. But in general, you should assume that string copies are
    > O(n).
    >

    Just FYI: GNU STL's std::string is COW, for both single- and multi- threaded
    compilations; and have been for quite a while since multi-threading became common.

    -Pavel
     
    Pavel, Oct 7, 2012
    #11
  12. Mikewhy

    Guest

    On Friday, October 5, 2012 2:32:15 PM UTC+2, Mikewhy wrote:
    > Apologies. I know this has to have been discussed, but I'm unable to google
    >
    > my way past the shallower, newbie discussions of temporary's lifetime, etc.
    >
    >
    >
    > A class Lookup contains a collection of strings referenced by some key, the
    >
    > values representing, perhaps, a database lookup.. The key is representative
    >
    > of its field name. Here, we're interested only in the accessor methods,
    >
    > get_val() and get_value(). The first returns a copy of the string by value.
    >
    > The other returns a const reference to the value.
    >
    >
    >
    > class Lookup
    >
    > {
    >
    > public:
    >
    > enum FieldId {
    >
    > ...
    >
    > };
    >
    >
    >
    > std::string get_val(FieldId) const;
    >
    >
    >
    > const std::string & get_value(FieldId) const;
    >
    > ...
    >
    > };
    >
    >
    >
    > The values already exist as std::string in the Lookup object. Returning by
    >
    > const reference then is direct and has the advantage of efficiency.
    >
    > Returning by value, even with return value optimization, incurs at minimum a
    >
    > string copy.
    >
    >
    >
    > The question is, assuming that multithread issues are not present, how
    >
    > easily can naive code cause trouble for itself with a const reference return
    >
    > interface? What usage patterns (that have a chance of surviving the low
    >
    > threshold of a peer review) might prove problematic?


    Except from what Rui Maciel had shown, I can only think of a subsequent change:

    class v1
    {
    type field;
    public:
    const type& getfield() const { return field; }
    };

    class v2 // v1, but code changed in the meantime
    {
    type calculated() { return type(...); }
    public:
    const type& getfield() const { return calculated(); }
    };

    This will produce at least a diagnostic when compiled ("returning a reference to a temporary" or some such), so well-cranked up warning level is beneficial... ;-)

    At any rate, the core issue is the same here and in the other example: one took a reference to something, and that something ceased to exist.

    As for your question "how easily can naive code?"... Hard to say. It depends very much on people working on the code. The less experience, the more chance there is this will happen and vice verso.

    Goran.

    P.S. random rambler on the 'net once said: "OOP in C++ means "Object Ownership Protocols" ;-).
     
    , Oct 8, 2012
    #12
  13. Mikewhy

    none Guest

    In article <k4mk0e$lg7$>,
    Mikewhy <> wrote:
    >Apologies. I know this has to have been discussed, but I'm unable to google
    >my way past the shallower, newbie discussions of temporary's lifetime, etc.
    >
    >A class Lookup contains a collection of strings referenced by some key, the
    >values representing, perhaps, a database lookup.. The key is representative
    >of its field name. Here, we're interested only in the accessor methods,
    >get_val() and get_value(). The first returns a copy of the string by value.
    >The other returns a const reference to the value.
    >
    >class Lookup
    >{
    >public:
    > enum FieldId {
    > ...
    > };
    >
    > std::string get_val(FieldId) const;
    >
    > const std::string & get_value(FieldId) const;
    > ...
    >};
    >
    >The values already exist as std::string in the Lookup object. Returning by
    >const reference then is direct and has the advantage of efficiency.
    >Returning by value, even with return value optimization, incurs at minimum a
    >string copy.
    >
    >The question is, assuming that multithread issues are not present, how
    >easily can naive code cause trouble for itself with a const reference return
    >interface? What usage patterns (that have a chance of surviving the low
    >threshold of a peer review) might prove problematic?


    My concerns are similar to Goran. You are painting yourself in a
    corner for future refactoring.

    you are stating that the Lookup class does have a member variable for
    every one of the FieldId that can be queried. That is now. However,
    for something like a "Lookup", it would seem plausible that you may
    not forever want to keep a cache of all data in the object. The
    interface you present forces you to do it forever in order to possibly
    save a few CPU cycles (which have not been profiled as being
    critical).

    Note also that a client that uses the interface as follow may gain a
    small performance improvement:

    std::string const & value = aLookup.get_value(FieldId);

    However, a client that attempt to use it as follow:
    (a very common way to write)

    std::string value = aLookup.get_value(FieldId);

    may very much be less performant that if you'd return by value.

    Anyone that want to di into the details of what RVO and Copy Elision
    my mean in optimized code should read through Dave Abrahams article:
    http://cpp-next.com/archive/2009/08/want-speed-pass-by-value/

    and this article is not even C++11 aware.

    Yannick
     
    none, Oct 8, 2012
    #13
  14. Mikewhy

    Mikewhy Guest

    "none (Yannick Tremblay)" wrote in message
    news:k4uesg$8ak$...

    In article <k4mk0e$lg7$>,
    Mikewhy <> wrote:
    >Apologies. I know this has to have been discussed, but I'm unable to google
    >my way past the shallower, newbie discussions of temporary's lifetime, etc.
    >
    >A class Lookup contains a collection of strings referenced by some key, the
    >values representing, perhaps, a database lookup.. The key is representative
    >of its field name. Here, we're interested only in the accessor methods,
    >get_val() and get_value(). The first returns a copy of the string by value.
    >The other returns a const reference to the value.
    >
    >class Lookup
    >{
    >public:
    > enum FieldId {
    > ...
    > };
    >
    > std::string get_val(FieldId) const;
    >
    > const std::string & get_value(FieldId) const;
    > ...
    >};
    >
    >The values already exist as std::string in the Lookup object. Returning by
    >const reference then is direct and has the advantage of efficiency.
    >Returning by value, even with return value optimization, incurs at minimum
    >a
    >string copy.
    >
    >The question is, assuming that multithread issues are not present, how
    >easily can naive code cause trouble for itself with a const reference
    >return
    >interface? What usage patterns (that have a chance of surviving the low
    >threshold of a peer review) might prove problematic?


    My concerns are similar to Goran. You are painting yourself in a
    corner for future refactoring.

    you are stating that the Lookup class does have a member variable for
    every one of the FieldId that can be queried. That is now. However,
    for something like a "Lookup", it would seem plausible that you may
    not forever want to keep a cache of all data in the object. The
    interface you present forces you to do it forever in order to possibly
    save a few CPU cycles (which have not been profiled as being
    critical).

    Note also that a client that uses the interface as follow may gain a
    small performance improvement:

    std::string const & value = aLookup.get_value(FieldId);

    However, a client that attempt to use it as follow:
    (a very common way to write)

    std::string value = aLookup.get_value(FieldId);

    may very much be less performant that if you'd return by value.
    ================

    Sure. Point taken about copy construction versus. It's a slightly different
    context, but certainly one of the differences that I had asked for.

    The actual usage that led to my OP is illustrated by (but not actually):

    if (foo[field1] == "VALUE1" ||
    foo[field1] == "VALUE2 ||
    (foo[field1] == "VALUE3 &&
    foo[field4] == "VALUE4"))
    { ... }
     
    Mikewhy, Oct 9, 2012
    #14
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Javier
    Replies:
    2
    Views:
    604
    James Kanze
    Sep 4, 2007
  2. flopbucket
    Replies:
    10
    Views:
    538
  3. Replies:
    0
    Views:
    357
  4. PeteUK
    Replies:
    9
    Views:
    421
    PeteUK
    Mar 12, 2010
  5. Paul Butcher
    Replies:
    12
    Views:
    764
    Gary Wright
    Nov 28, 2007
Loading...

Share This Page