<string> to lowercase

Zombie

Hi, what is the correct way of converting the contents of a <string> to
lowercase?
There are no methods of the <string> class to do this, so I fall back on
strlwr().
But the c_str() method returns a const pointer, which cannot be used
with strlwr() because it does the conversion in place. So I copy the
contents to a dynamically allocated char array and then do the
conversion:

-----------------------------
string str = "faLSe";
char* pc_str = NULL;

pc_str = new char[str.length() + 1];
// sizeof(pc_str) would only give the size of the pointer
memset(pc_str, 0, str.length() + 1);

strcpy(pc_str, str.c_str());
strlwr(pc_str);
// pc_str now contains "false"
-----------------------------

Is there any other, less cumbersome way of doing the same?

Thanks for your time.
 
Matt Wharton

You might look at using the 'for_each' algorithm in conjunction with the
string's iterators and 'tolower'.
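
A minimal sketch of that suggestion, assuming a C++11 compiler for the lambda (the thread predates it) and using the hypothetical helper name lower_in_place. It visits each character through for_each and passes the value to tolower as an unsigned char:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: lowercases `s` in place with for_each and tolower.
void lower_in_place(std::string& s) {
    std::for_each(s.begin(), s.end(), [](char& c) {
        // Convert to unsigned char first so tolower never sees a negative value.
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    });
}
```

Called as lower_in_place(str), this modifies str directly, so no temporary char array is needed.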

-Matt

 
Kai-Uwe Bux

The following uses the current locale to convert a string to lowercase:

#include <string>
#include <iostream>
#include <locale>

template < typename Iter >
void range_tolower ( Iter beg, Iter end ) {
for( Iter iter = beg; iter != end; ++iter ) {
*iter = std::tolower( *iter );
}
}

void string_tolower ( std::string & str ) {
range_tolower( str.begin(), str.end() );
}

int main ( void ) {
std::string test ( "Test" );
string_tolower( test );
std::cout << test << std::endl;
}


Best

Kai-Uwe
 
Ashes

Hi

You can also use the transform() algorithm:

#include <string>
#include <algorithm>

void ConvertToLowerCase(std::string& str)
{
std::transform(str.begin(),
str.end(),
str.begin(),
tolower);
// You may need to cast the above line to (int(*)(int))
// tolower - this works as is on VC 7.1 but may not work on
// other compilers
}

Regards
Ashley
 
Xenos

Off the top of my head:

void to_lowercase(std::string& s)
{
    // converts s in place
    for (std::string::iterator i = s.begin(); i != s.end(); ++i)
        *i = tolower(*i);
}

OR

std::string to_lowercase(const std::string& s)
{
    // leaves s alone and returns a lowercase copy
    std::string t;
    for (std::string::const_iterator i = s.begin(); i != s.end(); ++i)
        t += tolower(*i);
    return t;
}


Or you could use a function object with one of the standard library
templates such as for_each or transform.
 
Joe C

These two little functions change the case of strings. Note that they assume
ASCII and as such are not portable. I've been told that these are the worst
functions ever... but they work for me.
If you want to change the actual string rather than return a new
string, just pass a reference to a void function and modify the
characters of the original string.

Here they are:


#include <iostream>
#include <string>

using namespace std;

string lcase(string in);
string ucase(string in);

int main(){
string str("A mIxEd CaSe StRiNg 123!@@#");

cout << str << endl
<< lcase(str) << endl
<< ucase(str) << endl;

return 0;
}

string lcase(string in){
    string stringout;
    for(string::size_type i = 0; i < in.size(); ++i)
        if(!(in[i] & 128) && ((in[i] & 95) > 64) && ((in[i] & 31) <= 26))
            stringout += char(in[i] | 32); //turn on the lcase bit
        else stringout += in[i]; //character wasn't a letter...dont change
    return stringout;
}

string ucase(string in){
    string stringout;
    for(string::size_type i = 0; i < in.size(); ++i)
        if(!(in[i] & 128) && ((in[i] & 95) > 64) && ((in[i] & 31) <= 26))
            stringout += char(in[i] & 223); //turn off the lcase bit
        else stringout += in[i]; //character wasn't a letter...dont change
    return stringout;
}
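
The in-place variant described above (a void function taking a reference) might look like the following sketch. The name lcase_in_place is made up for illustration, and like the functions above it assumes ASCII, so it is not portable:

```cpp
#include <string>

// Hypothetical in-place variant; assumes an ASCII character set.
void lcase_in_place(std::string& s) {
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        // Only 'A'..'Z' are touched; digits, punctuation and bytes with the
        // high bit set are left unchanged.
        if (s[i] >= 'A' && s[i] <= 'Z') {
            s[i] = static_cast<char>(s[i] | 32);  // turn on the lowercase bit
        }
    }
}
```

The explicit range check replaces the three bit tests, which select the same set of characters under ASCII.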
 
Brian Stone

The easiest way I know is to use the transform() function from the
<algorithm> library. Here's an example of how to apply this to a
string to convert the case...

#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cctype>

using namespace std;

int main ( int argc, char **argv )
{
string A = "TeStInG!";

cout << A << endl; // output: TeStInG!
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::tolower) );
cout << A << endl; // output: testing!
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::toupper) );
cout << A << endl; // output: TESTING!
}

-- Brian Stone
South Dakota School of Mines & Technology
UAV Team Lead Programmer
 
Tommy Andreasen

I usually do it like this:

std::transform(str.begin(), str.end(), str.begin(),
std::ptr_fun(std::tolower));

Tommy -
 
Old Wolf

Kai-Uwe Bux said:
The following uses the current locale to convert a string to lowercase:

#include <string>
#include <iostream>
#include <locale>

template < typename Iter >
void range_tolower ( Iter beg, Iter end ) {
for( Iter iter = beg; iter != end; ++iter ) {
*iter = std::tolower( *iter );
}
}

Unfortunately, std::tolower requires an argument in the range
0...UCHAR_MAX. So you can go:

*iter = std::tolower( (unsigned char)*iter );

and hope that it gets converted back to char properly afterwards, or:

if (*iter >= 0 && *iter <= UCHAR_MAX)
*iter = std::tolower(*iter);
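
The first option can be wrapped in a small function so the cast lives in one place. This is only a sketch; safe_tolower and lowered_copy are hypothetical names, not anything from the standard library:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical wrapper: converts to unsigned char before calling the
// <cctype> tolower, then narrows the int result back to char.
char safe_tolower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Returns a lowercase copy; the argument is taken by value and modified.
std::string lowered_copy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), safe_tolower);
    return s;
}
```

Because safe_tolower is an ordinary non-overloaded function, it can be passed to transform without a cast or ptr_fun.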
 
Peter Koch Larsen

Zombie said:
Hi, what is the correct way of converting contents of a <string> to
lowercase?

Well... this is actually a rather complicated question. For an explanation
of why, take a look at the thread "Case insensitive comparison of
std::strings" in comp.lang.c++.moderated. For a "basic" conversion, for_each
and tolower would be okay (perhaps combined with a locale, but I am not
familiar with those).

Zombie said:
[snip] So, I use the
following logic of copying the contents to a dynamically allocated
char* array and then doing the conversion:
[snip]

Yes - that approach surely seems too cumbersome.


Kind regards
Peter
 
Kai-Uwe Bux

Old Wolf said:
Unfortunately, std::tolower requires an argument in the range
0...UCHAR_MAX. So you can go:

*iter = std::tolower( (unsigned char)*iter );

and hope that it gets converted back to char properly afterwards, or:

if (*iter >= 0 && *iter <= UCHAR_MAX)
*iter = std::tolower(*iter);

I was under the impression that std::tolower, being a template, would be
instantiated for the deduced type <char> when the argument is *iter, where
iter is a std::string::iterator. Now, if it is a template, why should it be
restricted to 0..UCHAR_MAX, effectively forcing the type to be unsigned
char? That does not seem to make any sense -- of course, this does not
imply it isn't so. In any case, I looked up tolower in the standard and did
not see any hint at UCHAR_MAX. Probably, I was looking at the wrong
section. Could you point me to the source?


Best

Kai-Uwe Bux
 
kanze

Brian Stone said:
The easiest way I know is to use the transform() function from the
<algorithm> library. Here's an example of how to apply this to a
string to convert the case...
[snip]
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::tolower) );
[snip]
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::toupper) );
[snip]

Is it just me, or what? There have been a number of postings suggesting
this, either with or without the call to ptr_fun. Now, it has some
obvious and well known problems when it encounters a character encoding
that is negative, and toupper( 'ß' ) doesn't (and cannot) work at all,
but I can understand anglocentric programmers missing this in a quick
response. On the other hand, I have been unable to find a compiler where
this even compiles, in any of the suggested variants, on any system: it
fails to compile (with or without the ptr_fun) with g++ (3.4.0), Sun CC
(5.1) and VC++ (6.0).

In fact, the only variant which compiled (and that got a warning from
Sun CC) is yours, with ::tolower and ::toupper. And you are playing on a
bug in practically every implementation of <cctype>, which exposes
::tolower and ::toupper (rather than only having them available in
std::, as the standard requires).

As far as I know (and ignoring the issues of passing an out of bounds
value to the functions), the correct way to write the call to transform
is something like:

std::transform( str.begin(), str.end(),
str.begin(),
std::ptr_fun( (int (*)( int ))std::tolower ) ) ;

Even better would be something like:

std::transform(
str.begin(), str.end(),
str.begin(),
boost::bind(
std::ptr_fun(
(char (*)( char, std::locale const& ))std::tolower ),
_1,
std::locale() ) ) ;

(Some of the Boost experts should verify this. I still have enough older
compilers to support that I can't actively use Boost, as much as it
would facilitate my code.)

This should at least give defined behavior in every case, even if it
gives the wrong results sometimes.
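
For comparison, here is a sketch that avoids both the function-pointer cast and Boost by asking the locale's std::ctype<char> facet to convert the whole range. This is a different technique from the bind expression above, and lower_with_locale is a hypothetical name:

```cpp
#include <locale>
#include <string>

// Hypothetical helper: lowercases `s` in place using the ctype facet of `loc`.
// Defaults to a copy of the current global locale, like std::locale() above.
void lower_with_locale(std::string& s, const std::locale& loc = std::locale()) {
    if (s.empty()) {
        return;  // nothing to convert, and &s[0] is best avoided here
    }
    const std::ctype<char>& facet = std::use_facet<std::ctype<char> >(loc);
    // The range overload of ctype::tolower converts [first, last) in place.
    facet.tolower(&s[0], &s[0] + s.size());
}
```

Like the two-argument std::tolower, this still maps one char to one char, so it shares the limitation noted above for characters with no single-character counterpart.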

Of course, the original poster asked for something that wasn't
awkward :).
 
kanze

Brian Stone said:
The easiest way I know is to use the transform() function from the
<algorithm> library. Here's an example of how to apply this to a
string to convert the case...
#include <iostream>
#include <string>
#include <algorithm>
#include <functional>
#include <cctype>
using namespace std;
[snip]
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::tolower) );
[snip]
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::toupper) );
[snip]

1. This isn't guaranteed to compile, for at least two reasons. The
obvious one is that you've forgotten to include <ostream>. The less
obvious one is that any C++ header may include any other C++
headers; if <iostream> includes <locale> (actually
quite likely, since often <iostream> just includes everything in the
iostream section of the library, and both basic_ios and
basic_streambuf need <locale>), then the call to ptr_fun will be
ambiguous.

Formally, in fact, I think that the standard guarantees that it
won't compile, since there shouldn't be a tolower nor a toupper in
global namespace. (But I could be wrong about this. I don't really
understand the interactions between "using namespace" and the ::
specifier.) In practice, however, I don't know of a single
implementation which is conformant in this regard.

2. If it compiles, and uses the tolower in <cctype>, then you have
undefined behavior, at least if plain char is signed (as it is on
most systems). Passing a negative value to the tolower function in
<cctype> is undefined behavior.
 
kanze

Old Wolf said:
Unfortunately, std::tolower requires an argument in the range
0...UCHAR_MAX.

No, it takes a second parameter, a std::locale. E.g.:

*iter = std::tolower( *iter, std::locale() ) ;

At least in a conforming implementation (see below).

Old Wolf said:
So you can go:
*iter = std::tolower( (unsigned char)*iter );

This works on a conforming implementation as well, as long as you
include <cctype> rather than <locale>. Conforming
implementations are still pretty rare, however, and I've found that
leaving the std:: off and including <ctype.h> seems to be about the
only thing that works portably.

And of course, since in this case, you are using the C version of
tolower, you have to ensure that the input is an unsigned char. And, as
you say, hope that the results don't get mangled when you reconvert back
to char -- realistically, the amount of code that mangling them would
break (even though the standard allows it) is so large that no
implementation would dare...

Finally, of course, none of the solutions really work, because there is
no one to one mapping of upper case characters to lower case characters.

Old Wolf said:
and hope that it gets converted back to char properly afterwards, or:
if (*iter >= 0 && *iter <= UCHAR_MAX)
*iter = std::tolower(*iter);

Actually, the only case in which such a mapping makes sense is when you
are using pure ASCII, and have no accented characters. So:

assert( *iter >= 0 && *iter <= 127 ) ;

(assuming ASCII, obviously -- this isn't really portable).
 
Kevin W.

Brian Stone said:
using namespace std;
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::tolower) );

A question: what does the double-colon mean in this context, and from
which library does the tolower function come?
 
Francis Glassborow

Kevin W. said:
A question: what does the double-colon mean in this context, and from
which library does the tolower function come?

As a using directive is in operation, ::tolower() forces the lookup to
be only in the global namespace + any other names injected with using
declarations. This form of disambiguation is one of the few advantages
of using directives.
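
A sketch of one way to make the ::tolower spelling independent of any using directive: include the C header <ctype.h>, which declares tolower in the global namespace, and wrap the call. The names global_tolower and lower_via_global are hypothetical and do not come from the quoted program:

```cpp
#include <ctype.h>   // C header: declares tolower in the global namespace
#include <algorithm>
#include <string>

// Hypothetical wrapper around the global-namespace tolower.
char global_tolower(char c) {
    // The leading :: restricts lookup to the global namespace.
    return static_cast<char>(::tolower(static_cast<unsigned char>(c)));
}

// Returns a lowercase copy of the argument.
std::string lower_via_global(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), global_tolower);
    return s;
}
```

With <cctype> instead of <ctype.h>, only std::tolower is guaranteed to be declared, which is why the qualified ::tolower form depends on the implementation or on a using directive.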
 
llewelly

Brian Stone said:
using namespace std;
[snip]
transform ( A.begin(), A.end(), A.begin(), ptr_fun(::tolower) );
[snip]

kanze said:
In fact, the only variant which compiled (and that got a warning from
Sun CC) is yours, with ::tolower and ::toupper. And you are playing on a
bug in practically every implementation of <cctype>, which exposes
::tolower and ::toupper (rather than only having them available in
std::, as the standard requires).
[snip]

The 'using namespace std;' at global scope makes std::tolower
and std::toupper available at global scope. (See 3.4.3.2)

Even without the 'using namespace std', we have 17.4.3.1.3/5:

# Each function signature from the Standard C library declared
# with external linkage is reserved to the implementation for use
# as a function signature with both extern "C" and extern "C++"
# linkage, (168) or as a name of namespace scope in the global
# namespace.
 
kanze

Francis Glassborow said:
As a using directive is in operation, ::tolower() forces the lookup to
be only in the global namespace + any other names injected with using
declarations.

Are you sure?

I ask because this doesn't seem to be the behavior I'm seeing with most
compilers. If I compile exactly the original program, but with an
#include <locale> as well (so that a couple of other tolower are
available too), I still don't get an error about an ambiguous function;
both g++ and Sun CC chose uniquely the tolower in <cctype> (which in
both implementations, is actually in global namespace, instead of in
std:: as the standard requires). Sun CC, of course, does warn that I'm
trying to use an `extern "C"' function in a context which requires an
`extern "C++"' one. (I think that the ::tolower that g++ picks up is
also an `extern "C"'. If so, his code only works because of two
successive compiler errors.)

Francis Glassborow said:
This form of disambiguation is one of the few advantages of using
directives.

It may be, but if so, it isn't very portable in practice, since not all
compilers implement it correctly :).

And maybe I'm just dumb, but I find that the complexity here is getting
beyond what I can master.
 
