Re: Why string's c_str()? [Overloading const char *()]

Discussion in 'C++' started by Tobias Müller, Oct 31, 2013.

  1. DSF <> wrote:

    [...]

    > I've reposted a small section here. (I hope they don't mind.)
    >
    > //start


    [...]

    > 01 String a, b;
    > 02 if (a==b) {}
    > 03 if (a=="str") {}
    > 04 if ("str"==a) {}
    > 05 if ("str"!=a) {}
    > 06 if (a > b) {}
    > 07 if (a <= b) {}
    > 08 if ("str" < a) {}
    > 09 *a;
    > 10 (a + 5);
    >
    > All these things will suddenly just compile, but none of these will do
    > what you expect from a String class.


    [...]

    > The examples above along with "none of these will do what you expect
    > from a String class" aroused my curiosity. Except for 09 and 10, I
    > was pretty certain the rest would do what I expect them to.
    >
    > So I compiled the above with my string class, adding the harmless
    > getch() (wait for a keypress) within all of the {} so the whole thing
    > wouldn't be optimized away and to test the result of the if
    > statements. I also ran it with string 'a' initialized to "str".
    >
    > Every single one of the first 8 did exactly what I expected.


    If your class defines all those operators, everything should be fine. And
    IMO a string class _without_ those operators is incomplete.
    But if it doesn't, and only offers the conversion to const char*, all those
    operators will silently operate on raw pointers.

    So in fact those examples are a bit unfortunate, because most of the
    expressions are usually considered valid and meaningful for strings.
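
A minimal sketch of the raw-pointer fallback described above. PtrString is a hypothetical class invented for illustration (it is not DSF's class); it assumes the only interface is the conversion operator and that no comparison operators are defined:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Hypothetical class: its only interface is an implicit
// conversion to const char*. No operator== is defined.
class PtrString {
public:
    explicit PtrString(const char* s) : data_(s) {}
    operator const char*() const { return data_.c_str(); }
private:
    std::string data_;
};

// Compiles because both operands convert to const char*.
// The built-in == then compares addresses, not characters.
inline bool same_by_operator(const PtrString& a, const PtrString& b) {
    return a == b;
}

// What the caller most likely meant: compare the characters.
inline bool same_by_content(const PtrString& a, const PtrString& b) {
    return std::strcmp(a, b) == 0;
}

inline void check_pointer_compare() {
    PtrString a("str");
    PtrString b("str");
    assert(!same_by_operator(a, b));  // two objects, two buffers
    assert(same_by_content(a, b));    // same characters
}
```

The relational forms (a > b, a <= b) compile the same way, and for pointers into unrelated buffers their result is not something a caller can rely on.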

    [...]

    > So what is the "danger" of overloading * (or in the case of above *
    > and [])? Everything I've written so far has worked as I expected, and
    > it's very convenient and looks more elegant than the alternatives when
    > one needs to pass a const char * to an API call, etc.


    The real dangers lie in expressions that are usually not considered valid
    for strings or that are controversial.

    Most dangerous is IMO the implicit conversion from const char* to bool:
    MyString a;
    if (a) // always true
    {...}

    This one is easy to spot, but it could also be a more complex boolean
    expression with a subtle error.
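
To make the "always true" case concrete, here is a small sketch. ImplicitString and is_empty() are hypothetical names chosen for illustration, and the class is assumed to store its text in a std::string, so the converted pointer is never null:

```cpp
#include <cassert>
#include <string>

// Hypothetical class with an implicit conversion to const char*.
class ImplicitString {
public:
    ImplicitString() {}
    explicit ImplicitString(const char* s) : data_(s) {}
    operator const char*() const { return data_.c_str(); }
    bool is_empty() const { return data_.empty(); }
private:
    std::string data_;
};

// The condition converts 's' to const char* and then to bool.
// The pointer is never null here, so the result is true even
// for an empty string.
inline bool used_as_condition(const ImplicitString& s) {
    if (s) {
        return true;
    }
    return false;
}

inline void check_always_true() {
    ImplicitString empty;
    assert(empty.is_empty());          // the string really is empty
    assert(used_as_condition(empty));  // yet the condition holds
}
```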

    Tobi
    Tobias Müller, Oct 31, 2013
    #1

  2. DSF <> wrote:
    > On Thu, 31 Oct 2013 06:47:58 +0000 (UTC), Tobias Müller
    > <> wrote:


    [...]

    >> The real dangers lie in expressions that are usually not considered valid
    >> for strings or that are controversial.
    >>
    >> Most dangerous is IMO the implicit conversion from const char* to bool:
    >> MyString a;
    >> if (a) // always true
    >> {...}
    >>
    >> This one is easy to spot, but it could also be a more complex boolean
    >> expression with a subtle error.
    >>
    >> Tobi

    >
    > I'm not quite sure I understand the danger here. Is it that someone
    > is testing for a NULL char pointer before acting on 'a' in an original
    > design that has now been converted to a string object and may*
    > always return true?


    I've seen string classes with operator bool() defined as a test for
    non-empty strings. I wouldn't consider it good design though.

    The other thing is just typos. If you construct a boolean expression and
    forget e.g. to actually invoke your is_empty() method the compiler will not
    complain.

    Anyway, if I were to write a string class from scratch, it would probably
    also have an operator const char*().
    I tend to write my functions such that they take the most general type as
    parameter and return the most specific type. In the case of strings, const
    char* seems to be the most general: you can use the function with most
    existing string classes.

    > *It is possible to design a string class that would return a null
    > pointer under conditions such as being uninitialized or containing
    > only a zero (an empty string), but I didn't take that route.


    IMO that's even worse than defining operator bool(). That would mean you
    couldn't get an empty C string from your string class and you would have to
    insert checks everywhere.
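
A sketch of why the null-returning design forces checks onto every caller. NullableString and safe_length are hypothetical names for illustration, and the code assumes C++11 for nullptr:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical design: the conversion yields a null pointer
// for an empty string instead of a pointer to "".
class NullableString {
public:
    explicit NullableString(const char* s) : data_(s) {}
    operator const char*() const {
        return data_.empty() ? nullptr : data_.c_str();
    }
private:
    std::string data_;
};

// Every consumer now needs a guard like this one, because
// passing a null pointer to std::strlen is undefined behaviour.
inline std::size_t safe_length(const char* p) {
    return p ? std::strlen(p) : 0;
}

inline void check_null_design() {
    NullableString empty("");
    const char* p = empty;
    assert(p == nullptr);           // no empty C string available
    assert(safe_length(empty) == 0);
}
```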

    [...]

    Tobi
    Tobias Müller, Nov 2, 2013
    #2

  3. Tobias Müller

    Öö Tiib Guest

    On Saturday, 2 November 2013 13:57:24 UTC+2, Tobias Müller wrote:
    > DSF <> wrote:
    > > On Thu, 31 Oct 2013 06:47:58 +0000 (UTC), Tobias Müller
    > > <> wrote:

    >
    > [...]
    >
    > >> The real dangers lie in expressions that are usually not considered valid
    > >> for strings or that are controversial.
    > >>
    > >> Most dangerous is IMO the implicit conversion from const char* to bool:
    > >> MyString a;
    > >> if (a) // always true
    > >> {...}
    > >>
    > >> This one is easy to spot, but it could also be a more complex boolean
    > >> expression with a subtle error.

    > >
    > > I'm not quite sure I understand the danger here. Is it that someone
    > > is testing for a NULL char pointer before acting on 'a' in an original
    > > design that has now been converted to a string object and may*
    > > always return true?

    >
    > I've seen string classes with operator bool() defined as test for non-empty
    > strings. I wouldn't consider it good design though.


    Yes. Besides, there is often a difference between empty and missing data, so
    C++'s practice of silently converting to bool is confusing as a rule.
    An unintuitive example:

    float x = get_from_somewhere();
    if ( x ) // Q: Does it check for 0 or for NaN or both? A: RTFM.
    {
    // ...
    }
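
For reference, a small sketch of what that conversion does. The helper name truthy is made up for illustration. Under the standard conversion rules only a zero value becomes false, so a NaN takes the true branch:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Same test as 'if ( x )' above, wrapped so it can be inspected.
inline bool truthy(float x) {
    if (x) {
        return true;
    }
    return false;
}

inline void check_float_to_bool() {
    const float nan = std::numeric_limits<float>::quiet_NaN();
    assert(!truthy(0.0f));   // zero converts to false
    assert(truthy(1.5f));    // any non-zero value converts to true
    assert(std::isnan(nan));
    assert(truthy(nan));     // NaN is not zero, so it is "true"
}
```

So the check is for zero only. Detecting NaN needs an explicit std::isnan call.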

    > The other thing is just typos. If you construct a boolean expression and
    > forget e.g. to actually invoke your is_empty() method the compiler will not
    > complain.


    Yes. While most compilers can be set to warn about built-in implicit
    conversions, they usually treat a library's implicit conversions as
    "user-made", so to get warnings about those one usually has to write a
    tool oneself.

    > Anyway, if I'd write a string class from scratch it would probably also
    > have an operator const char*().


    Why? If you wrote your own 'vector', would you make it implicitly convert
    to its 'begin()' iterator?

    > I tend to write my functions such that they take the most general type as
    > parameter and return the most specific type. In case of strings const char*
    > seems to be most general, you can use the function with most existing
    > string classes.


    Most general type? 'boost::any' or 'void*' if possible? 'char*' is an
    inefficient and bug-prone string. It loses the length, even though all
    string operations perform better when the length is known ahead of time.
    Therefore in most code bases there are several places where
    'std::string const&' performs twice as well as 'char const*'.
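
A small sketch of what "loses the length" means in practice. The two helper names are hypothetical. The example says nothing about the "twice" figure; it only shows that the pointer form has to rescan for the terminator, and that it stops early at an embedded zero:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Has to walk the buffer until the first '\0'.
inline std::size_t length_via_pointer(const char* s) {
    return std::strlen(s);
}

// Reads the length the object already stores.
inline std::size_t length_via_string(const std::string& s) {
    return s.size();
}

inline void check_length_loss() {
    // Five characters, one of them an embedded '\0'.
    const std::string s("ab\0cd", 5);
    assert(length_via_string(s) == 5);
    assert(length_via_pointer(s.c_str()) == 2);  // stops at the '\0'
}
```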
    Öö Tiib, Nov 2, 2013
    #3
  4. Tobias Müller

    Öö Tiib Guest

    On Saturday, 2 November 2013 16:52:42 UTC+2, Paavo Helde wrote:
    > Öö Tiib <> wrote in
    > news::
    > > On Saturday, 2 November 2013 13:57:24 UTC+2, Tobias Müller wrote:
    > >
    > >> I tend to write my functions such that they take the most general
    > >> type as parameter and return the most specific type. In case of
    > >> strings const char*
    > >> seems to be most general, you can use the function with most existing
    > >> string classes.

    > >
    > > Most general type? 'boost::any' or 'void*' if possible? 'char*' is
    > > inefficient and buggy string. It loses the length despite all
    > > operations with string perform better when knowing length of it ahead.
    > > Therefore in most code bases there are several places where
    > > 'std::string const&' performs twice better than 'char const*'.

    >
    > One can pass the string length as another argument. However, this would
    > make the call more verbose, which kind of contradicts the main motivation
    > of providing an automatic conversion operator, and it also makes chaining
    > of calls impossible, causing more verbosity again.


    Yes, splitting a fully functional object into a number of parameters is
    usually a bad idea.

    > On top of that, if one wants to implement the function contents in C++
    > (as opposed to C), then one has to immediately reconstruct std::string or
    > some other C++ object from the const char* pointer, which potentially
    > involves a dynamic memory allocation operator and content copy, resulting
    > in even larger performance hits than strlen().


    Fortunately, not everything in C++ has such hits (Boost.Range, say).
    Unfortunately, there are no range-based string abstractions.
    Even if there were, refactoring would be painful because the interface of
    'std::string' is too position-based (in contrast to iterator-based).

    > So, if the desire is that the function works with a broad range of C++
    > string classes, then instead of falling back to char* pointers I would
    > suggest to use templates instead, assuming std::string-compatible
    > interface. This way the function can even return the same string type
    > which is passed in, which is certainly much more convenient for the
    > caller than dealing with the "most specific type" hardcoded by the
    > function.


    Agreed. I also feel that 'namespace_of_T::begin( T )' and
    'namespace_of_T::end( T )' returning random access iterators, plus
    T's constructor from a pair of input iterators, are perhaps plenty
    for a "std::string-compatible interface" in most cases. 'std::string'
    has a pointlessly large interface to mimic.
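
A rough sketch of a function written against that reduced interface. The function name reversed is made up for illustration, and the code assumes C++11 so that std::begin/std::end can serve as the fallback for unqualified begin/end lookup:

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Needs only begin(T), end(T) and T's iterator-pair constructor,
// and returns the same type that was passed in.
template <typename StringT>
StringT reversed(const StringT& s) {
    using std::begin;  // fallback when T's namespace has no begin/end
    using std::end;
    StringT out(begin(s), end(s));       // copy via iterator pair
    std::reverse(begin(out), end(out));  // reverse the copy in place
    return out;
}

inline void check_reversed() {
    assert(reversed(std::string("abc")) == "cba");
    assert(reversed(std::wstring(L"xy")) == L"yx");
}
```

Any container-like string type that meets those three requirements works unchanged, for example std::vector<char>.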
    Öö Tiib, Nov 2, 2013
    #4
  5. Tobias Müller

    Öö Tiib Guest

    On Saturday, 2 November 2013 21:05:28 UTC+2, Paavo Helde wrote:
    > Öö Tiib <> wrote in
    > news::
    > > 'std::string' has pointlessly large interface to mimic it.

    >
    > I have no problems with the large interface of std::string. There are some
    > convenience functions like find_(first|last)(_not|)_of() which I use quite
    > often.


    Why quote that out of context? I described the bare minimum to mimic for
    passing to a template interface and then said the above. Nowhere did I
    mean that you should not use what you need from the interface of
    std::string (or the even bigger interface of QString, for example) in
    your own code.

    > And mimicking std::string interface is quite easy when using
    > specializations of std::basic_string for another char type, traits or
    > allocator.


    That is still std::basic_string. In reality, the potential user of your
    module's interface might have some 'QString', 'CString', 'wxString',
    'NSString' or some self-made original_posters::string in her hands.
    Requiring an adaptor from her that gives it an interface matching
    std::string is asking for trouble, I suppose, because std::basic_string
    has a lengthy interface.
    Öö Tiib, Nov 2, 2013
    #5