ptr_fun & tolower confusion

Soumen

I wanted to convert a mixed-case string to lower case, and I tried the
following code:

std::transform(mixedCaseString.begin(), mixedCaseString::end(),
mixedCaseString.begin(), std::ptr_fun(tolower));

Even though I'm including cctype and algorithm, I'm getting a compiler
(g++ 3.3.6) error:

no matching function for call to `ptr_fun(<unknown type>)'

I could resolve this only by using "::tolower" instead of "tolower".
But then I started googling, and it looks to me like this is not safe.
I also got confused by the many different responses on similar topics.

Can someone point me to the **safe (portable), less cumbersome** way
to change the case of a std::string using std::transform or any other
algorithm? Using boost is also acceptable to me (though I've not used
boost much beyond shared_ptr and polymorphic_cast).

Regards,
~ Soumen
 
Kai-Uwe Bux

Slightly modified from the archive:


#include <tr1/memory>
#include <cstdlib>
#include <locale>

template < typename CharT >
class to_lower {

  typedef std::ctype< CharT > char_type;

  std::tr1::shared_ptr< std::locale > the_loc_ptr;
  char_type const *                   the_type_ptr;

 public:

  to_lower ( std::locale const & r_loc = std::locale() )
    : the_loc_ptr ( new std::locale ( r_loc ) )
    , the_type_ptr ( &std::use_facet< char_type >( *the_loc_ptr ) )
  {}

  CharT operator() ( CharT chr ) const {
    return ( the_type_ptr->tolower( chr ) );
  }

};


This is to be used with std::transform like so:

  std::transform( mixedCaseString.begin(), mixedCaseString.end(),
                  mixedCaseString.begin(),
                  to_lower<char>() );

You could also initialize to_lower from a different locale.


Best

Kai-Uwe Bux
 
Amal Pillai

std::transform(mixedCaseString.begin(), mixedCaseString::end(),
mixedCaseString.begin(), std::ptr_fun(tolower));

Isn't there a syntax error? It should be a dot instead of the
colons.

The following snippet works fine for me with gcc 3.4.6

#include <iostream>
#include <string>
#include <algorithm>
#include <iterator>
#include <cctype>

int main()
{
  std::string str("MARY HAD A LITTLE LAMB");

  std::transform(str.begin(), str.end(),
                 str.begin(),
                 std::ptr_fun(tolower));

  std::copy (str.begin(), str.end(),
             std::ostream_iterator<char>(std::cout));
  return 0;
}
 
Soumen

Isn't there a syntax error - it should be a dot instead
of colons.

Yes, there's a typo _here_ in the posting. Thanks for pointing it out.
But in the actual code, it's a dot.
Even then I was getting the error. Only ::tolower resolved it.
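
One plausible reading of the error, assuming a using-directive for namespace std was in effect: the unqualified name tolower then refers both to the <cctype> function and to the two-argument template declared for <locale>, so ptr_fun cannot deduce its argument type. Writing ::tolower names only the C function. The sketch below selects the overload by giving the pointer a concrete type. The name lower_via_pointer is hypothetical, and the call is only well-defined for characters with non-negative values.

```cpp
#include <algorithm>
#include <cctype>
#include <locale>  // also declares template std::tolower(charT, const std::locale&)
#include <string>

// Hypothetical helper, not from the thread.
std::string lower_via_pointer(std::string s) {
    // The pointer's type, int(*)(int), selects the <cctype> overload;
    // the two-argument <locale> template cannot match it.
    int (*c_tolower)(int) = std::tolower;

    // Caveat: each char is passed as-is, so this is only well-defined
    // when every character in s has a non-negative value.
    std::transform(s.begin(), s.end(), s.begin(), c_tolower);
    return s;
}
```

Newer standards discourage taking the address of standard library functions, so wrapping the call in a small function of your own is probably the more robust choice.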
 
Soumen

Thanks. Could you please explain a bit about the functor class? I'm
not able to follow the std::use_facet and std::locale part.

Regards,
~ Soumen
 
tragomaskhalos

Man, you don't want to know how complicated
this issue is!
Go into Google groups and the comp.lang.c++
archives and search for "tolower kanze" for
enlightenment.
 
Kai-Uwe Bux

tragomaskhalos said:
Man, you don't want to know how complicated
this issue is!
Go into Google groups and the comp.lang.c++
archives and search for "tolower kanze" for
enlightenment.

Right. The functor above is the outcome of an exchange on this newsgroup
that I had with James Kanze a while ago.

In a nutshell:

(a) The tolower from cctype does assume that its argument is positive. That
can cause trouble if char happens to be signed. (More precisely, this
tolower takes its argument as an int and the requirement is that the value
is either the value of the macro EOF or representable as an unsigned char.)

(b) The tolower functions offered through locales are templated upon the
character type and will handle negative arguments without running the risk
of undefined behavior.

(c) Extracting the char_type pointer from the locale via use_facet was
suggested by James Kanze to increase performance. Measurement confirmed
that he was right.

(d) The shared_ptr maneuver is necessary to keep the locale object alive in
case the functor gets copied from a temporary that goes out of scope
afterwards.


Best

Kai-Uwe Bux
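
When the locale machinery is not needed, point (a) is often handled by converting through unsigned char before calling the <cctype> function. A minimal sketch under that assumption. The names lower_char and lower_copy are hypothetical, and the result depends on the current C locale.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical wrapper: converting to unsigned char first guarantees the
// value handed to tolower is representable as an unsigned char.
char lower_char(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Hypothetical helper that lowercases a copy of its argument.
std::string lower_copy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), lower_char);
    return s;
}
```

Because lower_char is an ordinary non-overloaded function, it can be passed to std::transform without any disambiguation.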
 
Soumen

Thanks for the nice summary.
 
James Kanze

Just curious, but...

Soumen wrote:
Slightly modified from the archive:

  #include <tr1/memory>
  #include <cstdlib>
  #include <locale>

  template < typename CharT >
  class to_lower {

    typedef std::ctype< CharT > char_type;

    std::tr1::shared_ptr< std::locale > the_loc_ptr;
    char_type const *                   the_type_ptr;

    to_lower ( std::locale const & r_loc = std::locale() )
      : the_loc_ptr ( new std::locale ( r_loc ) )
Why the new, and the smart pointer? I just use a locale member.
(If it's part of an actual application, I'll often forego
keeping a copy of the locale anyway---most of the applications I
work on don't play around with locales, so I'm generally sure
that the locale I'm using won't go away.)
 
Kai-Uwe Bux

James said:
Just curious, but...

Why the new, and the smart pointer? I just use a locale member.
(If it's part of an actual application, I'll often forego
keeping a copy of the locale anyway---most of the applications I
work on don't play around with locales, so I'm generally sure
that the locale I'm using won't go away.)

No particular reason other than history of the code. It started out as an
internal class and was only used in places where the life-time of temporaries
guaranteed that the locale object would not go away. That class had a
locale pointer (or maybe a reference). So when the code was moved into a
different context where life-time guarantees became problematic, the
pointer got replaced by a smart pointer just to solve the life-time issue.
I guess it's mainly psychological: it was a pointer, it became a smart
pointer. That's all.

Probably, a locale member is better. One would not expect algorithms to copy
functors ruthlessly.


Best

Kai-Uwe Bux
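
A sketch of the locale-member variant discussed above, assuming nothing beyond the standard <locale> header. The names to_lower_member and lower_with_locale are made up for illustration. Copying the functor copies the locale, and any copy of the locale keeps the cached facet alive.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical variant of to_lower that stores the locale by value.
template <typename CharT>
class to_lower_member {
    typedef std::ctype<CharT> char_type;

    std::locale      the_loc;       // declared first, so it is initialized first
    char_type const* the_type_ptr;  // facet owned by the_loc (and its copies)

public:
    explicit to_lower_member(std::locale const& r_loc = std::locale())
        : the_loc(r_loc),
          the_type_ptr(&std::use_facet<char_type>(the_loc)) {}

    CharT operator()(CharT chr) const { return the_type_ptr->tolower(chr); }
};

// Hypothetical convenience wrapper around std::transform.
std::string lower_with_locale(std::string s, std::locale const& loc) {
    std::transform(s.begin(), s.end(), s.begin(), to_lower_member<char>(loc));
    return s;
}
```

A call such as lower_with_locale(text, std::locale::classic()) would then lowercase text according to the classic "C" locale.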
 
Greg Herlihy

TR1's shared_ptr<> class is not nearly as useful in this case as its
bind() routine. In fact, calling TR1's bind() would eliminate the
custom to_lower functor and its attendant complexity.

After all, lowercasing a C++ string seems like it should be a fairly
straightforward task - one that should require only a few lines of
code:

#include <iostream>
#include <string>
#include <algorithm>
#include <locale>

#include <tr1/functional>

using std::locale;
using std::tolower;
using std::tr1::bind;
using std::tr1::placeholders::_1;

int main()
{
  std::string s("GrEg");

  transform( s.begin(), s.end(), s.begin(),
             bind( tolower<char>, _1, locale()));

  std::cout << s << "\n";
}

Program Output:

greg
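
For readers on a C++11 or later compiler (an assumption, since the thread predates it), a lambda can play the role of the bound tolower<char>. This is only a sketch, and lower_with_lambda is a hypothetical name.

```cpp
#include <algorithm>
#include <locale>
#include <string>

// Hypothetical helper: lowercases a copy of s using the given locale.
std::string lower_with_lambda(std::string s,
                              std::locale const& loc = std::locale()) {
    // Calls the two-argument std::tolower from <locale> for each character.
    std::transform(s.begin(), s.end(), s.begin(),
                   [&loc](char c) { return std::tolower(c, loc); });
    return s;
}
```

Like the bind version, this looks up the facet on every call, so a functor that caches the ctype facet is likely to be faster on long strings.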
 
James Kanze

No particular reason other than history of the code.

OK. The usual reason in real code, in sum. :)
It started out as an internal class and was only used in
places where life-time of temporaries guaranteed that the
locale object would not go away. That class had a locale
pointer (or maybe a reference). So when the code was moved
into a different context where life-time guarantees became
problematic, the pointer got replaced by a smart pointer just
to solve the life-time issue. I guess it's mainly
psychological: it was a pointer, it became a smart pointer.
That's all.
Probably, a locale member is better. One would not expect
algorithms to copy functors ruthlessly.

Interesting. My version had a similar history, except that in
the early versions, I didn't keep a pointer to the locale at
all; all I needed, after all, was the ctype. So when lifetime
of the locale (which controls the lifetime of the facet, for
those who might not be following us) became an issue, I created
a copy of the locale in the most convenient place; from what I
gather from the standard (although it probably shouldn't be used
as a design document), locales were designed to be copied, at a
more or less reasonable cost.

And, of course, I'm a very strong believer in the idea that if
you don't need arbitrary and explicit lifetime, you shouldn't be
using new. :)

But I don't think it makes a real difference.
 
James Kanze

On Jul 4, 2:34 am, Kai-Uwe Bux wrote:

[...]
TR1's shared_ptr<> class is not nearly as useful in this case
as its bind() routine. In fact, calling TR1's bind() would
eliminate the custom to_lower functor and its attendant
complexity.
After all, lowercasing a C++ string seems like it should be a
fairly straightforward task - one that should require only a
few lines of code:

  #include <iostream>
  #include <string>
  #include <algorithm>
  #include <locale>
  #include <tr1/functional>

  using std::locale;
  using std::tolower;
  using std::tr1::bind;
  using std::tr1::placeholders::_1;

  int main()
  {
    std::string s("GrEg");
    transform( s.begin(), s.end(), s.begin(),
               bind( tolower<char>, _1, locale()));
    std::cout << s << "\n";
  }

That is, of course, the simplest solution. It hasn't been
available all that long, however, and most of us developed our
solution before bind was available. (The shared_ptr isn't
really necessary here, and even if it was, most of us had simple
implementations of shared_ptr long before it made it into TR1.)

And IMHO, there's nothing wrong with providing a general wrapped
tool (although it does lead to the mistaken belief that you can
generally use transform for converting to lower case---in
practice, the mapping isn't one to one). And using the ctype
directly will probably be slightly faster (although I doubt that
that is an issue).
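
Using the ctype facet directly can also be done on the whole buffer at once, since std::ctype has a tolower overload that takes a character range. A hedged sketch, with lower_range as a hypothetical name. It is still a character-by-character mapping, so it does not address conversions that would change the length of the string.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: lowercases a copy of s in one facet call.
std::string lower_range(std::string s,
                        std::locale const& loc = std::locale()) {
    if (!s.empty()) {
        std::ctype<char> const& ct = std::use_facet<std::ctype<char> >(loc);
        // Range overload: converts [low, high) in place.
        ct.tolower(&s[0], &s[0] + s.size());
    }
    return s;
}
```

The loc parameter is bound for the duration of the call, so the facet reference stays valid while it is used.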
 
