Re: Why string's c_str()? [Overloading const char *()]

Discussion in 'C++' started by Öö Tiib, Oct 31, 2013.

  1. Öö Tiib

    Öö Tiib Guest

    On Wednesday, 30 October 2013 21:00:26 UTC+2, DSF wrote:
    > But... I had always wondered why the STL string class uses c_str()
    > instead of overloading const char *().


    Most novices to C or C++ are confused by the tendency of raw arrays to
    decay into raw pointers in most contexts. So the standard library
    designers decided not to mimic that confusing behavior.

    Generally, all implicit conversions are evil and raw pointers are rarely
    needed in modern C++. So why do you care?
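
    A minimal sketch of the kind of surprise an implicit conversion invites.
    ImplicitStr and the helper functions are hypothetical names made up for
    this illustration; they are not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical string type, for illustration only. Unlike std::string,
// it converts implicitly to const char*.
class ImplicitStr {
public:
    explicit ImplicitStr(const char* s) : data_(s) {}
    operator const char*() const { return data_.c_str(); }
private:
    std::string data_;
};

// The pointer conversion chains into pointer-to-bool, so even an empty
// string tests as true: the pointer itself is never null.
inline bool tests_true(const ImplicitStr& s) {
    return s ? true : false;
}

// Built-in pointer arithmetic applies silently: this skips the first
// character rather than failing to compile.
inline const char* skip_first(const ImplicitStr& s) {
    assert(std::strlen(s) > 0);  // the implicit conversion is used here too
    return s + 1;
}

// With std::string the pointer has to be requested by name.
inline std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}
```

    With std::string, expressions such as `s + 1` or `if (s)` simply do not
    compile, which is arguably the point of spelling out c_str().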
     
    Öö Tiib, Oct 31, 2013
    #1

  2. On 31.10.2013 01:38, Öö Tiib wrote:
    > On Wednesday, 30 October 2013 21:00:26 UTC+2, DSF wrote:
    >> But... I had always wondered why the STL string class uses c_str()
    >> instead of overloading const char *().

    >
    > Most of novices to C or C++ are confused by tendency of raw arrays to
    > transform into raw pointers on most cases of usage. So standard library
    > designers decided not to mimic that confusing behavior.
    >
    > Generally all implicit conversions are evil and raw pointers are rarely
    > needed in modern C++. So why you care?
    >


    I think, for high level programming one would be better off using a more
    purely high level language.

    C++, while supporting abstraction, deals with more low level stuff such
    as calling OS API functions, and doing that very efficiently.

    I find it very annoying both to write and to read all those .c_str()
    explicit conversion operations. It would be okay if it was a rarely
    invoked operation, full of dangers. But it's commonplace and harmless
    (well mostly, but then anything /can/ be dangerous in C++, and one can't
    avoid using integers or whatever).


    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Oct 31, 2013
    #2

  3. Öö Tiib

    Öö Tiib Guest

    On Thursday, 31 October 2013 05:52:28 UTC+2, Alf P. Steinbach wrote:
    > On 31.10.2013 01:38, Öö Tiib wrote:
    > > On Wednesday, 30 October 2013 21:00:26 UTC+2, DSF wrote:
    > >> But... I had always wondered why the STL string class uses c_str()
    > >> instead of overloading const char *().

    > >
    > > Most of novices to C or C++ are confused by tendency of raw arrays to
    > > transform into raw pointers on most cases of usage. So standard library
    > > designers decided not to mimic that confusing behavior.
    > >
    > > Generally all implicit conversions are evil and raw pointers are rarely
    > > needed in modern C++. So why you care?

    >
    > I think, for high level programming one would be better off using a more
    > purely high level language.


    It is difficult to find a higher-level and more scalable general-purpose
    language (IOW one with no limits) than C++. High-level solutions should indeed
    be built with problem-oriented tools (like SAP Business Objects) if your
    problem domain has such a thing and the problem is within its limits.

    > C++, while supporting abstraction, deals with more low level stuff such
    > as calling OS API functions, and doing that very efficiently.


    Let's take the Windows API function CreateFileW. It has 7 parameters and at
    least 10 different responsibilities. If we call such monsters raw and
    unwrapped, then the conversions are the least of our problems, I suppose.
    So we write a particular OS API call only once, in a wrapper, and the compiler
    optimises the wrapper call mostly away anyway.
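
    A minimal sketch of the wrap-once idea, using portable C library calls in
    place of a Windows API function. The wrapper names are hypothetical;
    std::fopen, std::fclose and std::remove are the real calls being wrapped.

```cpp
#include <cassert>
#include <cstdio>
#include <stdexcept>
#include <string>

// Hypothetical wrapper names. c_str() appears only inside the wrappers,
// so calling code passes std::string everywhere.

// Creates (or truncates) a file; throws if it cannot be opened for writing.
inline void create_empty_file(const std::string& path) {
    assert(!path.empty());
    std::FILE* f = std::fopen(path.c_str(), "w");
    if (f == nullptr) throw std::runtime_error("cannot create: " + path);
    std::fclose(f);
}

// Reports whether the file can currently be opened for reading.
inline bool is_readable(const std::string& path) {
    std::FILE* f = std::fopen(path.c_str(), "r");
    if (f == nullptr) return false;
    std::fclose(f);
    return true;
}

// Deletes a file; throws if the C call reports failure.
inline void remove_file(const std::string& path) {
    if (std::remove(path.c_str()) != 0) {
        throw std::runtime_error("cannot remove: " + path);
    }
}
```

    Wrappers this small are typical candidates for inlining, so the extra call
    layer usually costs little or nothing.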

    > I find it very annoying both to write and to read all those .c_str()
    > explicit conversion operations. It would be okay if it was a rarely
    > invoked operation, full of dangers. But it's commonplace and harmless
    > (well mostly, but then anything /can/ be dangerous in C++, and one can't
    > avoid using integers or whatever).


    I see c_str() used rather rarely. Basically only in interfaces with
    alien languages, say when converting to or wrapping a C interface. Interfaces
    are rarely maintained. Where else do you see masses of c_str() calls?
     
    Öö Tiib, Oct 31, 2013
    #3
  4. On 31.10.2013 18:26, Öö Tiib wrote:
    > On Thursday, 31 October 2013 05:52:28 UTC+2, Alf P. Steinbach wrote:
    >> On 31.10.2013 01:38, Öö Tiib wrote:
    >>> On Wednesday, 30 October 2013 21:00:26 UTC+2, DSF wrote:
    >>>> But... I had always wondered why the STL string class uses c_str()
    >>>> instead of overloading const char *().
    >>>
    >>> Most of novices to C or C++ are confused by tendency of raw arrays to
    >>> transform into raw pointers on most cases of usage. So standard library
    >>> designers decided not to mimic that confusing behavior.
    >>>
    >>> Generally all implicit conversions are evil and raw pointers are rarely
    >>> needed in modern C++. So why you care?

    >>
    >> I think, for high level programming one would be better off using a more
    >> purely high level language.

    >
    > It is difficult to find any higher and better scalable general purpose language
    > (IOW no limits) than C++.


    For the "better" you would have to define what you mean by that.

    But re higher level general purpose languages, C# and Java come to mind.
    These languages have module support, currently lacking in C++ (I don't
    know the status of Daveed's proposal). I think these languages scale
    rather well, probably better than C++ currently does, due to the lack of
    module support in C++.

    Even higher level than that you have Python, which works nicely with
    C++, but doesn't really scale (as was discovered with YouTube, IIRC).


    > High level solutions should be indeed made with
    > problem-oriented stuff (like SAP Business Objects) if your problem domain
    > has such thing and the problem is within limits of it.
    >
    >> C++, while supporting abstraction, deals with more low level stuff such
    >> as calling OS API functions, and doing that very efficiently.

    >
    > Lets say Windows API CreateFileW. It has 7 parameters and at least 10
    > different responsibilities. If we call such monsters raw and unenwrapped
    > then the conversions are least of our problems I suppose.
    > So we write a particular OS API call only once in wrapper and compiler
    > optimises the wrapper call mostly away anyway.


    One cannot wrap all Windows API functions.

    And it gets worse when you add in further 3rd party libraries, such as
    OpenCV.

    Not to mention the C++ standard library itself... ;-)


    >> I find it very annoying both to write and to read all those .c_str()
    >> explicit conversion operations. It would be okay if it was a rarely
    >> invoked operation, full of dangers. But it's commonplace and harmless
    >> (well mostly, but then anything /can/ be dangerous in C++, and one can't
    >> avoid using integers or whatever).

    >
    > I see that c_str() used rather rarely. Basically only in interfaces with
    > alien languages. Say converting to or wrapping C interface. Interfaces are
    > rarely maintained. Where else you see masses of that c_str() used?


    In C++03 even std::ofstream constructors required raw C string pointers
    for the filenames.

    In other words, library functions taking raw C string pointers is a
    widespread practice, so "natural" that the std::ofstream design without
    std::string argument constructor was adopted.

    Regarding the rationale for that, it removes a header dependency and it
    generally does not add any conversion, since the conversion generally
    has to be done at some level anyway, but I suspect that often it's done
    simply because programmers have become used to interfaces like that.
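
    A minimal sketch of what that looks like in practice. The function names
    are hypothetical; the std::string constructor overload mentioned in the
    comment was added to the stream classes in C++11.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Writes one line of text to a file and reports whether the write succeeded.
inline bool write_line(const std::string& filename, const std::string& text) {
    assert(!filename.empty());
    std::ofstream out(filename.c_str());  // the only form C++03 accepts
    // std::ofstream out(filename);       // also accepted since C++11
    if (!out) return false;
    out << text << '\n';
    out.flush();
    return out.good();
}

// Reads the first line back, again passing the name as a C string.
inline std::string read_first_line(const std::string& filename) {
    std::ifstream in(filename.c_str());
    std::string line;
    std::getline(in, line);
    return line;
}
```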


    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Oct 31, 2013
    #4
  5. Öö Tiib

    Öö Tiib Guest

    On Thursday, 31 October 2013 20:03:12 UTC+2, Alf P. Steinbach wrote:
    > On 31.10.2013 18:26, Öö Tiib wrote:
    > > On Thursday, 31 October 2013 05:52:28 UTC+2, Alf P. Steinbach wrote:
    > >> On 31.10.2013 01:38, Öö Tiib wrote:
    > >>> On Wednesday, 30 October 2013 21:00:26 UTC+2, DSF wrote:
    > >>>> But... I had always wondered why the STL string class uses c_str()
    > >>>> instead of overloading const char *().
    > >>>
    > >>> Most of novices to C or C++ are confused by tendency of raw arrays to
    > >>> transform into raw pointers on most cases of usage. So standard library
    > >>> designers decided not to mimic that confusing behavior.
    > >>>
    > >>> Generally all implicit conversions are evil and raw pointers are rarely
    > >>> needed in modern C++. So why you care?
    > >>
    > >> I think, for high level programming one would be better off using a more
    > >> purely high level language.

    > >
    > > It is difficult to find any higher and better scalable general purpose language
    > > (IOW no limits) than C++.

    >
    > For the "better" you would have to define what you mean by that.


    "More scalable" in the sense that C++ is better at multithreading,
    multiprocessing, multi-computing and so on. While the language does not contain
    anything supporting it, processes start, fork and shut down faster in practice.

    > But re higher level general purpose languages, C# and Java come to mind.
    > These languages have module support, currently lacking in C++ (I don't
    > know the status of Daveed's proposal). I think these languages scale
    > rather well, probably better than C++ currently does, due to the lack of
    > module support in C++.


    That is a true defect of our legal system (IOW standard C++). In actual reality
    we have DLLs, libraries and executables like everybody else. Ours work
    even better than the modules of others in practice. Every "oh so high" language
    out there uses C or C++ modules. It is a pity that our standard avoids legalising
    that reality. It can't be used as an argument, however: if everybody uses
    our modules and processes, how come we don't have them? ;-)

    > Even higher level than that you have Python, which works nicely with
    > C++, but doesn't really scale (as was discovered with YouTube, IIRC).


    We have always used such sidekick scripting languages in real projects. Python
    integrates more simply than Lisp and is more readable than Perl. Something that
    does not scale up cannot be considered "higher", but more like a "servant". ;-)

    > > High level solutions should be indeed made with
    > > problem-oriented stuff (like SAP Business Objects) if your problem domain
    > > has such thing and the problem is within limits of it.
    > >
    > >> C++, while supporting abstraction, deals with more low level stuff such
    > >> as calling OS API functions, and doing that very efficiently.

    > >
    > > Lets say Windows API CreateFileW. It has 7 parameters and at least 10
    > > different responsibilities. If we call such monsters raw and unenwrapped
    > > then the conversions are least of our problems I suppose.
    > > So we write a particular OS API call only once in wrapper and compiler
    > > optimises the wrapper call mostly away anyway.

    >
    > One cannot wrap all Windows API functions.


    One does not /have/ to. In practice it is often advisable not to. For the
    majority of common things one has the option of using a library that already
    does it (like boost::program_options, boost::filesystem, boost::asio or
    boost::interprocess). If one needs a lot of things then there are whole
    frameworks too (like Qt). So what remains are the very few "special" calls that
    the libraries do not wrap but that the requirements demand.

    > And it gets worse when you add in further 3rd party libraries, such as
    > OpenCV.


    I have regretfully had no time to mess with OpenCV, but my impression was that
    it has a C++ API and that its C 'char*' API is deprecated?

    > Not to mention the C++ standard library itself... ;-)


    There are, yes, plenty of things in the standard library that are good for
    nothing. A decades-long legacy results in stuff like that. Just recently I
    discovered (someone asked in comp.lang.c++.moderated) that some things are even
    described by the standard as possibly doing nothing whatsoever, like:

    std::cin.rdbuf()->pubsetbuf(buffer, sizeof(buffer));

    > >> I find it very annoying both to write and to read all those .c_str()
    > >> explicit conversion operations. It would be okay if it was a rarely
    > >> invoked operation, full of dangers. But it's commonplace and harmless
    > >> (well mostly, but then anything /can/ be dangerous in C++, and one can't
    > >> avoid using integers or whatever).

    > >
    > > I see that c_str() used rather rarely. Basically only in interfaces with
    > > alien languages. Say converting to or wrapping C interface. Interfaces are
    > > rarely maintained. Where else you see masses of that c_str() used?

    >
    > In C++03 even std::ofstream constructors required raw C string pointers
    > for the filenames.


    Since the contents of those raw byte buffers passed to fstream constructors as
    a "file name" are platform-specific anyway, we typically need something
    (like boost::filesystem) to handle that case in a sane manner.

    > In other words, library functions taking raw C string pointers is a
    > widespread practice, so "natural" that the std::ofstream design without
    > std::string argument constructor was adopted.
    >
    > Regarding the rationale for that, it removes a header dependency and it
    > generally does not add any conversion, since the conversion generally
    > has to be done at some level anyway, but I suspect that often it's done
    > simply because programmers have become used to interfaces like that.


    I think the I/O library is a bad example anyway. Bjarne wrote it ages ago; 2 to 4
    other guys "repaired" it and now it is what it is. It gets things done but
    nothing about it is pretty. In reality we have abstracted it (or some other I/O)
    far away under things like "database", "client", "configuration" etc.
     
    Öö Tiib, Nov 1, 2013
    #5
  6. On 03.11.2013 09:33, Paavo Helde wrote:
    > DSF wrote:
    >>
    >> Why do I care? Because I use raw pointers all the time. I write
    >> Windows code, and 70% of the API calls involve a pointer to a
    >> character string, pointer to a structure, pointer to a buffer, etc.

    >
    > That's because Windows API is defined in terms of C and not C++.
    >
    > In C++ one usually writes wrappers or uses other C++ libraries in order
    > to encapsulate a C API, so the rest of the code can use normal C++ style.
    > In the wrapper code you need pointers and buffers indeed, but this is a
    > localized one-time activity.


    IMHO it's generally a good idea to wrap, but it's not practical to wrap
    everything in the APIs and libraries one uses.

    For high level programming where most everything low level is already
    wrapped up, I would use e.g. C# or Java.


    > Example: encapsulate getcwd():
    >
    > std::string My_getcwd() {
    >     wchar_t buff[MAX_PATH];
    >     DWORD n = ::GetCurrentDirectoryW(MAX_PATH, buff);
    >     if (n==0 || n>MAX_PATH) {
    >         throw MyException("my_getcwd failed: " + MyGetLastErrorString());
    >     }
    >     return Win2UtfFileName(std::wstring(buff, n));
    > }


    Demonstrates two needless string copying operations, one needless
    dynamic allocation, introduction of a needless possible failure mode
    (translation) and a choice of representation that makes further
    operations with the string inefficient on this platform, and that even
    makes display of that string impractical for debugging.

    Probably this is all a trade-off for easy cross platform development
    with types dictated by the original platform.

    As such it's not necessarily "wrong", but it sure ain't perfect. ;-)


    > Here, Win2UtfFileName() is another wrapper function wrapping
    > WideCharToMultiByte() on Windows and converting Windows UTF-16 to more
    > portable UTF-8, but that's not the main point here.


    Oh. I think, on the contrary, that it's a pretty important point, as far
    as we're discussing practical programming methodology.

    For, the above fundamental code needlessly MIXES RESPONSIBILITIES.

    Mixing responsibilities can be fine at higher levels (when they have to
    be mixed) and/or when everything works perfectly, but in the above code
    we're down at fundamental level that's invoked by all higher level code,
    and here at bottom there is the silly extra work done, at the cost of
    efficiency and some reliability, to create an unsuitable string
    representation for the platform, at further cost.

    Probably the perceived need to use UTF-8 representation internally in
    the program, is great, and probably most all of the code is based on
    this choice.

    So as a practical matter I advise (at least at first) merely to SEPARATE
    RESPONSIBILITIES, at least in any new wrappers.

    Calling an API function in a safe way with errors translated to
    exceptions, that's one thing. Adapting the function to the existing code
    environment, that's another thing. They are better separate.

    Like, say yes to both.

    One doesn't have to choose one.
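
    A minimal sketch of that separation, under loud assumptions: fake_get_dir
    is a hypothetical stand-in for a C API in the style of GetCurrentDirectoryW
    (so the example stays portable), and the adapter handles only ASCII where a
    real one would do a full UTF-16 to UTF-8 conversion. All names are
    illustrative.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a C API: copies a fixed directory name into buf.
// Returns the number of characters written, excluding the terminator, or 0
// if the buffer is too small.
inline std::size_t fake_get_dir(wchar_t* buf, std::size_t size) {
    assert(buf != nullptr);
    const wchar_t dir[] = L"/tmp/demo";
    const std::size_t len = sizeof(dir) / sizeof(dir[0]) - 1;
    if (size <= len) return 0;
    for (std::size_t i = 0; i <= len; ++i) buf[i] = dir[i];
    return len;
}

// Responsibility 1: call the API safely and turn failure into an exception.
// The result stays in the API's own (wide) representation.
inline std::wstring current_dir_native() {
    wchar_t buf[260];
    const std::size_t n = fake_get_dir(buf, sizeof(buf) / sizeof(buf[0]));
    if (n == 0) throw std::runtime_error("current_dir_native failed");
    return std::wstring(buf, n);
}

// Responsibility 2: adapt to the representation the rest of the program uses.
// Simplification: ASCII only; anything else is reported as an error.
inline std::string to_narrow_ascii(const std::wstring& w) {
    std::string out;
    out.reserve(w.size());
    for (std::size_t i = 0; i < w.size(); ++i) {
        if (static_cast<unsigned long>(w[i]) > 0x7FUL) {
            throw std::runtime_error("non-ASCII character");
        }
        out.push_back(static_cast<char>(w[i]));
    }
    return out;
}

// Composition for code that wants the adapted form. Callers that prefer the
// native form can use current_dir_native() directly.
inline std::string current_dir_narrow() {
    return to_narrow_ascii(current_dir_native());
}
```

    The translation failure mode now lives only in the adapter, so code that
    can work with the native string never pays for it.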


    > Third-party libraries like boost::filesystem probably do this better.


    Apparently boost::filesystem was fine in version 2.

    Currently, version 3, it's unable to handle Windows filenames in general
    when used with the g++ compiler (it does handle them OK with Visual C++,
    by using a Visual C++ extension of the standard library).

    Considering the very large effort that has gone into developing and
    quality checking Boost filesystem, this is a good demonstration that
    it's difficult to do "wrappers" right -- and then, relying on the
    ungood wrapper the bugs and limitations are propagated to all the code.
    That's very much worth having in mind as one continues to write
    wrappers. And yes I do write them, all the time, but very carefully.


    > But
    > in any case you should use C++ interfaces in the bulk of your codebase
    > and not struggling with C pointer-and-buffer madness all the time.


    A good ideal to aim for. :)


    Cheers & hth.,

    - Alf
     
    Alf P. Steinbach, Nov 3, 2013
    #6
  7. Öö Tiib

    Öö Tiib Guest

    On Sunday, 3 November 2013 14:22:22 UTC+2, Alf P. Steinbach wrote:
    > On 03.11.2013 09:33, Paavo Helde wrote:
    > > In C++ one usually writes wrappers or uses other C++ libraries in order
    > > to encapsulate a C API, so the rest of the code can use normal C++ style.
    > > In the wrapper code you need pointers and buffers indeed, but this is a
    > > localized one-time activity.

    >
    > IMHO it's generally a good idea to wrap, but it's not practical to wrap
    > everything in the APIs and libraries one uses.


    It is. If for nothing else then for adding sanity checks and for RAII.

    > For high level programming where most everything low level is already
    > wrapped up, I would use e.g. C# or Java.


    A matter of taste. Note that RAII does not work in C# and Java, so you
    cannot elegantly encapsulate truly precious resources (which you apparently
    deal with, if you are discussing exotic parts of the Windows APIs).
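
    A minimal sketch of RAII around a C handle, assuming a C FILE* stands in
    for a "precious resource". FileCloser, FileHandle and the functions are
    hypothetical names; std::unique_ptr with a custom deleter is standard C++11.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <memory>
#include <stdexcept>
#include <string>

// Deleter that closes a C FILE handle. unique_ptr never calls it with null.
struct FileCloser {
    void operator()(std::FILE* f) const { std::fclose(f); }
};

typedef std::unique_ptr<std::FILE, FileCloser> FileHandle;

// Opens a file and hands back an owning handle; throws on failure.
inline FileHandle open_file(const std::string& path, const char* mode) {
    assert(mode != nullptr);
    FileHandle f(std::fopen(path.c_str(), mode));
    if (!f) throw std::runtime_error("cannot open: " + path);
    return f;
}

// Writes text, then throws to simulate a failure mid-operation. When the
// exception is caught further up, unwinding destroys the handle, which
// closes (and therefore flushes) the file.
inline void write_then_fail(const std::string& path, const std::string& text) {
    FileHandle f = open_file(path, "w");
    std::fputs(text.c_str(), f.get());
    throw std::runtime_error("simulated failure");
}

// Reads a whole file; the handle is closed when the function returns.
inline std::string read_all(const std::string& path) {
    FileHandle f = open_file(path, "r");
    std::string out;
    char buf[256];
    std::size_t n;
    while ((n = std::fread(buf, 1, sizeof(buf), f.get())) > 0) {
        out.append(buf, n);
    }
    return out;
}
```

    No code path needs an explicit fclose call; the destructor of FileHandle
    takes care of it on both normal and exceptional exit.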

    > > Example: encapsulate getcwd():
    > >
    > > std::string My_getcwd() {
    > >     wchar_t buff[MAX_PATH];
    > >     DWORD n = ::GetCurrentDirectoryW(MAX_PATH, buff);
    > >     if (n==0 || n>MAX_PATH) {
    > >         throw MyException("my_getcwd failed: " + MyGetLastErrorString());
    > >     }
    > >     return Win2UtfFileName(std::wstring(buff, n));
    > > }

    >
    > Demonstrates two needless string copying operations, one needless
    > dynamic allocation, introduction of a needless possible failure mode
    > (translation) and a choice of representation that makes further
    > operations with the string inefficient on this platform, and that even
    > makes display of that string impractical for debugging.


    Maybe. Unless a profiler tells me that it matters, I don't care. But that is
    the most mundane part of the Windows API, so I myself prefer:

    boost::filesystem::path cwd( boost::filesystem::current_path() );

    I do not care whether it calls '::GetCurrentDirectoryW()' or what it does.
    I use "high level programming language C++" unless forced to use "low
    level programming language C++".
     
    Öö Tiib, Nov 3, 2013
    #7