tolower used by transform

Discussion in 'C++' started by qazmlp, Jul 22, 2003.

  1. qazmlp

    qazmlp Guest

    I was using the following code to convert the string to
    lowercase.

    string foo = "Some Mixed Case Text";
    transform(foo.begin(), foo.end(), foo.begin(), tolower);

    I thought the above code was portable.

    But, the following page has a different view on this:
    http://lists.debian.org/debian-gcc/2002/debian-gcc-200204/msg00092.html

    Can anybody comment on it?
    Also, I would like to know whether the tolower template or the tolower
    function will be used in the above code.
    transform(foo.begin(), foo.end(), foo.begin(), tolower);
    qazmlp, Jul 22, 2003
    #1

  2. (qazmlp) wrote in message news:<>...

    > string foo = "Some Mixed Case Text";
    > transform(foo.begin(), foo.end(), foo.begin(), tolower);
    >
    > I thought the above code is portable.
    >
    > But, the following page has a different view on this:
    > http://lists.debian.org/debian-gcc/2002/debian-gcc-200204/msg00092.html


    Well that's a damn good question!

    The problem is that most implementations of the standard C <ctype.h>
    header define functions like toupper/tolower/etc as macros. To make it
    work in STL algorithms, you have to include <cctype> header instead of
    <ctype.h>. At least on my PC (Debian/gcc 3.3), <cctype> undefines all
    tolower/etc macros and pulls ::tolower/::toupper/etc functions into
    std namespace, so that your sample will work fine.
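
A hedged aside: even with <cctype>, a bare tolower passed to transform can become ambiguous once <locale> is visible too, because the name then also refers to the two-argument template. A minimal sketch of one way around that, assuming the current C locale is acceptable; lower_char and to_lower_copy are hypothetical names, not anything from the posts above:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: a plain function with exactly one signature, so
// std::transform has nothing to disambiguate.
// Going through unsigned char avoids undefined behaviour for negative chars.
inline char lower_char(char c)
{
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical helper: returns a lowercased copy, using the current C locale.
inline std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(), lower_char);
    return s;
}
```

The wrapper costs one extra function, but the call site no longer depends on which headers happen to be included.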

    However, in general it is recommended to drop old C functions in favor
    of new standard library functionality. In this particular case, you
    may want to use the ctype locale facet, i.e.

    #include <locale>

    // ..............

    std::locale loc;
    char s[] = "Test String";
    std::use_facet< std::ctype<char> >( loc ).tolower( s, s + sizeof(s) );

    Too bad it does not work for std::string, i.e. the following code will
    not compile:

    std::locale loc;
    string s = "Test String";
    std::use_facet< std::ctype<char> >( loc ).tolower( s.begin(), s.end() );

    because std::ctype::tolower() definition has only two variants:

    char_type tolower(char_type __c) const;
    const char_type* tolower(char_type* __lo, const char_type* __hi) const;

    This leads me to the following piece of code:

    std::transform( s.begin(), s.end(), s.begin(),
    std::bind1st( std::mem_fun( &std::ctype<char>::tolower ),
    &std::use_facet< std::ctype<char> >( loc ) ) );

    Nice, eh? :)

    Now it's truly C++, but I am not sure if I want to use such a thing
    instead of good old tolower() from <cctype>. Can anyone suggest a
    better solution?

    PS. <locale> header also defines a standalone std::tolower() function
    that takes locale as a second parameter, but I don't know if it can be
    used in transform, because it is a template/inline function, i.e.
    std::bind2nd and std::ptr_fun do not work with it.

    PPS. It would be REALLY great to hear other opinions on this subject!

    Thanks,
    Sergei.

    --
    Sergei Matusevich,
    Brainbench MVP for C++
    http://www.brainbench.com
    Sergei Matusevich, Jul 23, 2003
    #2

  3. tom_usenet

    tom_usenet Guest

    On 22 Jul 2003 19:01:15 -0700, (Sergei Matusevich)
    wrote:

    > (qazmlp) wrote in message news:<>...
    >
    >> string foo = "Some Mixed Case Text";
    >> transform(foo.begin(), foo.end(), foo.begin(), tolower);
    >>
    >> I thought the above code is portable.
    >>
    >> But, the following page has a different view on this:
    >> http://lists.debian.org/debian-gcc/2002/debian-gcc-200204/msg00092.html

    >
    >Well that's a damn good question!
    >
    >The problem is that most implementations of the standard C <ctype.h>
    >header define functions like toupper/tolower/etc as macros. To make it
    >work in STL algorithms, you have to include <cctype> header instead of
    ><ctype.h>. At least on my PC (Debian/gcc 3.3), <cctype> undefines all
    >tolower/etc macros and pulls ::tolower/::toupper/etc functions into
    >std namespace, so that your sample will work fine.


    I'm not sure a conforming C++ implementation can provide macro versions
    of the ctype.h functions. Most implementations I have seen use
    #ifdef __cplusplus or similar, with inline functions for C++ and
    macros for C.

    >However, in general it is recommended to drop old C functions in favor
    >of new standard library functionality. In this particular case, you
    >may want use the ctype locale facet, i.e.
    >
    > #include <locale>
    >
    > // ..............
    >
    > std::locale loc;
    > char s[] = "Test String";
    > std::use_facet< std::ctype<char> >( loc ).tolower( s, s + sizeof(s)
    >);
    >
    >Too bad it does not work for std::string, i.e the following code will
    >not compile:
    >
    > std::locale loc;
    > string s = "Test String";
    > std::use_facet< std::ctype<char> >( loc ).tolower( s.begin(),
    >s.end() );
    >
    >because std::ctype::tolower() definition has only two variants:
    >
    > char_type tolower(char_type __c) const;
    > const char_type* tolower(char_type* __lo, const char_type* __hi)
    >const;
    >
    >This leads me to the following piece of code:
    >
    > std::transform( s.begin(), s.end(), s.begin(),
    > std::bind1st( std::mem_fun( &std::ctype<char>::tolower ),
    > &std::use_facet< std::ctype<char> >( loc ) ) );
    >
    >Nice, eh? :)


    tolower is overloaded so you can't take its address as you are trying
    above, since you don't say which overload you want. You'd need
    something like:

    static_cast<char(std::ctype<char>::*)(char) const>(
    &std::ctype<char>::tolower)
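
For completeness, the cast above can be dropped into a full call roughly as follows. This is a sketch under two assumptions: std::bind from C++11 stands in for bind1st/mem_fun (which C++17 removed), and the library really declares tolower with exactly this signature, which the standard does not strictly guarantee for member functions. facet_to_lower is a hypothetical name:

```cpp
#include <algorithm>
#include <functional>
#include <locale>
#include <string>

// Hypothetical helper: lowercases s in place through the ctype facet of loc.
inline void facet_to_lower(std::string& s, const std::locale& loc = std::locale())
{
    // Pointer-to-member type of the single-character overload.
    typedef char (std::ctype<char>::*ToLower)(char) const;

    // The cast selects tolower(char) over tolower(char*, const char*).
    ToLower fn = static_cast<ToLower>(&std::ctype<char>::tolower);

    // The facet stays valid for as long as loc refers to it.
    const std::ctype<char>& facet = std::use_facet<std::ctype<char> >(loc);

    // Bind the facet as the object argument; _1 receives each character.
    std::transform(s.begin(), s.end(), s.begin(),
                   std::bind(fn, &facet, std::placeholders::_1));
}
```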

    >
    >Now it's truly C++, but I am not sure if I want to use such thing
    >instead of good old tolower() from <cctype>. Can anyone suggest a
    >better solution?


    Converting a string to lower case can involve changing the length of
    the string in some languages, and a general solution is going to be
    quite complicated and involve complex heuristics. In English though,
    in-place modification is of course possible, and it is best to just
    write a couple of functions that operate on strings. There are various
    implementation possibilities, the simplest being an explicit loop.
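
The explicit-loop option might look something like this. It is only a sketch: it assumes an ASCII-compatible character set, touches nothing outside A-Z and a-z, and the names ascii_to_lower and ascii_to_upper are made up for illustration:

```cpp
#include <string>

// Hypothetical helper: lowercases the letters A-Z in place.
// Assumes an ASCII-compatible execution character set.
inline void ascii_to_lower(std::string& s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] = static_cast<char>(s[i] - 'A' + 'a');
    }
}

// Hypothetical helper: uppercases the letters a-z in place.
inline void ascii_to_upper(std::string& s)
{
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] = static_cast<char>(s[i] - 'a' + 'A');
    }
}
```

Being locale-independent is the point of this variant: the result does not change with the global locale, at the price of ignoring every non-ASCII letter.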

    >PS. <locale> header also defines a standalone std::tolower() function
    >that takes locale as a second parameter, but I don't know if it can be
    >used in transform, because it is a template/inline function, i.e.
    >std::bind2nd and std::ptr_fun do not work with it.


    Just because it is template and inline doesn't mean bind2nd won't work
    (you just need to cast to choose the correct instantiation). However,
    because it takes the locale argument by reference, it won't work since
    bind2nd will attempt to form a reference to reference argument, which
    is currently illegal.
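
One way to sidestep the bind2nd problem is a small hand-written function object that stores the locale itself. A minimal sketch, with LocaleToLower and to_lower_with_locale as hypothetical names:

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical functor: fixes the locale argument of the two-argument
// std::tolower from <locale>.
struct LocaleToLower
{
    explicit LocaleToLower(const std::locale& loc) : loc_(loc) {}

    char operator()(char c) const
    {
        return std::tolower(c, loc_);  // the template overload from <locale>
    }

private:
    std::locale loc_;  // held by value, so no reference-to-reference is formed
};

// Hypothetical helper: lowercases s in place according to loc.
inline void to_lower_with_locale(std::string& s,
                                 const std::locale& loc = std::locale())
{
    std::transform(s.begin(), s.end(), s.begin(), LocaleToLower(loc));
}
```

Copying a std::locale is cheap (it is reference counted in the usual implementations), so storing it by value should not be a concern.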

    Tom
    tom_usenet, Jul 23, 2003
    #3
  4. Thank you Tom for a great posting!

    (tom_usenet) wrote in message news:<>...

    > >> string foo = "Some Mixed Case Text";
    > >> transform(foo.begin(), foo.end(), foo.begin(), tolower);


    [...]

    > > std::transform( s.begin(), s.end(), s.begin(),
    > > std::bind1st( std::mem_fun( &std::ctype<char>::tolower ),
    > > &std::use_facet< std::ctype<char> >( loc ) ) );


    > tolower is overloaded so you can't take its address as you are trying
    > above, since you don't say which overload you want. You'd need
    > something like:
    >
    > static_cast<char(std::ctype<char>::*)(char) const>(
    > &std::ctype<char>::tolower)


    Not sure about other implementations of the standard library, but in
    my gcc 3.3.1 tolower is non-virtual and it delegates all functionality
    to the protected virtual do_tolower method. Therefore, taking the
    address of the tolower method is absolutely OK, I'm just not sure about
    its portability [now after your posting :))].

    [...]

    > Converting a string to lower case can involve changing the length of
    > the string in some languages, and a general solution is going to be
    > quite complicated and involve complex heuristics. In english though,
    > in place modification is of course possible, and it is best to just
    > write a couple of functions that operate on strings. There are various
    > implementation possibilities, the simplest being an explicit loop.


    Good point! But then, as far as I understand, tolower from the
    standard library is not a general solution because it is a "one char
    in, one char out" implementation. On the other hand, I don't know of
    any locales that require such sophisticated case conversion
    procedures. :)

    But the question remains open - what is the best (generic and
    portable) way to do toupper/tolower conversion for an std::string (or
    std::wstring or any std::basic_string incarnation) in C++? Is there
    any? :))
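
One candidate, offered as a sketch rather than an established idiom: the range overload of the ctype facet takes pointers, and a basic_string can supply those through the address of its first character. This assumes contiguous string storage (guaranteed since C++11, and true of common earlier implementations) and still converts one character at a time; to_lower is a hypothetical name:

```cpp
#include <locale>
#include <string>

// Hypothetical helper: lowercases any std::basic_string in place.
template <typename CharT, typename Traits, typename Alloc>
void to_lower(std::basic_string<CharT, Traits, Alloc>& s,
              const std::locale& loc = std::locale())
{
    if (s.empty())
        return;  // nothing to convert, and avoids indexing an empty string

    // The facet matching the string's character type (char, wchar_t, ...).
    const std::ctype<CharT>& facet = std::use_facet<std::ctype<CharT> >(loc);

    // Pass pointers to the first character and one past the last.
    facet.tolower(&s[0], &s[0] + s.size());
}
```

The same function then serves std::string and std::wstring, since the facet is selected from the string's character type.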

    Thank you,
    Sergei.

    --
    Sergei Matusevich,
    Brainbench MVP for C++
    http://www.brainbench.com
    Sergei Matusevich, Jul 23, 2003
    #4
