Short string optimization vs. CoW

Discussion in 'C++' started by Juha Nieminen, Feb 14, 2012.

  1. Reading about the so-called short string optimization used in some
    implementations of std::string, I've noticed that many articles out
    there contrast it with the copy-on-write technique, as if they were
    mutually exclusive.

    The short string optimization is a low-level trick where, if the
    string is short enough, it's stored in the std::string object itself
    rather than allocating memory separately for it. (A std::string
    usually has as members a pointer, an integer holding the size of the
    string and, usually, another holding the current capacity, and
    perhaps a few bytes more for good measure, so there's plenty of space
    in the object itself to store short strings. For example, on a 64-bit
    system you could store a string of up to 22 characters or so in this
    space; about half of that on a 32-bit system.)
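
    To make the layout concrete, here is a minimal sketch of the idea.
    SsoString, Heap and kInlineCapacity are names made up for this
    illustration, not the layout of any real std::string. It also keeps
    the size outside the reused bytes, so it fits fewer inline characters
    than the 22 or so mentioned above; real implementations pack things
    more tightly.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <cstring>

    // Hypothetical illustration class; not the layout of any real std::string.
    class SsoString {
        // What a long string needs: a heap pointer and a capacity.
        struct Heap {
            char* ptr;
            std::size_t capacity;
        };

    public:
        // Short strings reuse the Heap bytes; one byte is kept for the '\0'.
        static constexpr std::size_t kInlineCapacity = sizeof(Heap) - 1;

        explicit SsoString(const char* s) : size_(std::strlen(s)) {
            if (is_inline()) {
                // Short: copy into the object itself, no allocation.
                std::memcpy(rep_.inline_buf, s, size_ + 1);
            } else {
                // Long: allocate separately, as a plain string would.
                rep_.heap.ptr = new char[size_ + 1];
                rep_.heap.capacity = size_;
                std::memcpy(rep_.heap.ptr, s, size_ + 1);
            }
        }

        // Copying is left out to keep the sketch short.
        SsoString(const SsoString&) = delete;
        SsoString& operator=(const SsoString&) = delete;

        ~SsoString() {
            if (!is_inline()) delete[] rep_.heap.ptr;
        }

        bool is_inline() const { return size_ <= kInlineCapacity; }
        std::size_t size() const { return size_; }
        const char* c_str() const {
            return is_inline() ? rep_.inline_buf : rep_.heap.ptr;
        }

    private:
        union Rep {
            Heap heap;                      // used when the string is long
            char inline_buf[sizeof(Heap)];  // used when the string is short
        };

        std::size_t size_;
        Rep rep_;
    };
    ```

    With this sketch, SsoString("hi") never touches the heap, while a
    string longer than kInlineCapacity falls back to a normal allocation.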

    The copy-on-write technique, on the other hand, is a way of making
    copying/assigning even large strings efficient, as a deep copy of
    the string data is done only when the data is modified rather than
    when it's copied. (This cuts both ways. Clearly, if you have very
    large strings which get copied and assigned around a lot, but these
    copies are seldom modified, CoW will be enormously more efficient.
    On the other hand, all modifying operations become more expensive,
    since each one first has to check whether the data is shared,
    especially in a multi-threaded environment, where they need locking
    or at least atomic reference counting. Always deep-copying the string
    can be more expensive if the copying is done needlessly, but is more
    efficient with strings that do not get copied around a lot but are
    modified a lot.)
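
    The mechanism can be sketched in a few lines. CowString and its
    member names are invented for this example, and std::shared_ptr is
    used as a stand-in for the hand-rolled reference count a real
    implementation would have. The sketch makes no attempt to deal with
    the multi-threading cost mentioned above.

    ```cpp
    #include <cassert>
    #include <cstddef>
    #include <memory>
    #include <string>
    #include <utility>

    // Hypothetical illustration class, assuming single-threaded use.
    class CowString {
    public:
        explicit CowString(std::string s)
            : data_(std::make_shared<std::string>(std::move(s))) {}

        // The implicit copy constructor and assignment only copy the
        // shared_ptr, so copying is cheap regardless of string length.

        const std::string& str() const { return *data_; }

        // Every modifying operation must detach first.
        void set(std::size_t i, char c) {
            detach();
            data_->at(i) = c;
        }

        bool shares_buffer_with(const CowString& other) const {
            return data_ == other.data_;
        }

    private:
        // Deep-copy the data only if somebody else still refers to it.
        void detach() {
            if (data_.use_count() > 1) {
                data_ = std::make_shared<std::string>(*data_);
            }
        }

        std::shared_ptr<std::string> data_;
    };
    ```

    Copying a CowString leaves both objects pointing at the same buffer;
    the deep copy happens inside set(), and only when the buffer is
    actually shared.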

    Anyway, I was wondering why these articles talk as if the two
    techniques were mutually exclusive. I don't see why that would be so.
    I don't see why you couldn't implement *both* of them in the same
    std::string class if you so wanted. You could still have the pointer,
    size and capacity members "re-used" for short string optimization,
    *and* if the string is larger (and thus requires separate memory
    allocation) use CoW on that.
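
    As a rough sketch of what such a combination could look like (all
    names are hypothetical, and this is not a claim about any existing
    implementation): short strings live inline and are always copied by
    value, long strings share a reference-counted buffer that is detached
    on write. For brevity it uses C++17's std::variant, which keeps its
    own discriminator instead of literally overlaying the pointer, size
    and capacity bytes, and the inline limit of 15 is an arbitrary choice.

    ```cpp
    #include <array>
    #include <cassert>
    #include <cstddef>
    #include <cstring>
    #include <memory>
    #include <stdexcept>
    #include <string>
    #include <variant>

    // Hypothetical illustration class, assuming single-threaded use.
    class HybridString {
        static constexpr std::size_t kInlineCapacity = 15;  // arbitrary

        struct Inline {
            std::array<char, kInlineCapacity> chars;
            unsigned char size;
        };
        using Shared = std::shared_ptr<std::string>;

    public:
        explicit HybridString(const std::string& s) {
            if (s.size() <= kInlineCapacity) {
                Inline in{};
                std::memcpy(in.chars.data(), s.data(), s.size());
                in.size = static_cast<unsigned char>(s.size());
                rep_ = in;  // short: stored in the object itself
            } else {
                rep_ = std::make_shared<std::string>(s);  // long: shared
            }
        }

        bool is_inline() const { return std::holds_alternative<Inline>(rep_); }

        std::string str() const {
            if (const Inline* in = std::get_if<Inline>(&rep_)) {
                return std::string(in->chars.data(), in->size);
            }
            return *std::get<Shared>(rep_);
        }

        void set(std::size_t i, char c) {
            if (Inline* in = std::get_if<Inline>(&rep_)) {
                // Inline data is never shared, so no detach is needed.
                if (i >= in->size) throw std::out_of_range("HybridString::set");
                in->chars[i] = c;
                return;
            }
            Shared& p = std::get<Shared>(rep_);
            if (p.use_count() > 1) {
                p = std::make_shared<std::string>(*p);  // copy on write
            }
            p->at(i) = c;
        }

        bool shares_buffer_with(const HybridString& other) const {
            const Shared* a = std::get_if<Shared>(&rep_);
            const Shared* b = std::get_if<Shared>(&other.rep_);
            return a && b && *a == *b;
        }

    private:
        std::variant<Inline, Shared> rep_;
    };
    ```

    Nothing in the short-string path interferes with the CoW path: the
    only extra cost is the branch that decides which representation is
    active.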

    Well, anyway, I suppose that with C++11 the need for CoW has been
    greatly diminished thanks to move constructors. With C++98 it was a
    useful implementation technique that greatly sped up e.g. sorting a
    vector of large strings (or inserting a new string into such a
    vector), but with move constructors those operations have become even
    more efficient than they were with CoW strings. Of course there are
    still situations where move constructors cannot be used and CoW would
    increase efficiency...
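
    The effect of a move can be observed directly. The helper below
    (move_reuses_buffer is a made-up name) checks whether a string moved
    into a vector still uses the heap buffer it had before. The standard
    does not promise pointer identity, so treat the check as an
    assumption about mainstream library implementations, and only for
    strings too long to be stored inline.

    ```cpp
    #include <cassert>
    #include <string>
    #include <utility>
    #include <vector>

    // Hypothetical helper: returns true if moving `s` into a vector
    // handed over the original character buffer instead of copying it.
    bool move_reuses_buffer(std::string s) {
        const char* before = s.data();
        std::vector<std::string> v;
        v.push_back(std::move(s));  // move constructor, no deep copy expected
        return v.back().data() == before;
    }
    ```

    Calling move_reuses_buffer(std::string(1000, 'x')) is expected to
    return true on such implementations, which is the same saving CoW
    used to provide for this kind of operation.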
    Juha Nieminen, Feb 14, 2012