The best, (right?), way to trim a char*

S

Simon

Hi,

I have written a function to trim char *, but I have been told that my way
could be dangerous and that I should use memmove(...) instead.
but I am not sure why my code could be 'dangerous' or even why there could
be a problem.

here is the code

////////
const char* TrimLeft( char *dest)
{
if (!dest ) return dest; //all done
size_t size = 0;
// trim left
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
for ( size_t loop = 0; loop < strlen( dest ) -1; loop++ )
dest[ loop] = dest[ loop +1];
dest[ strlen( dest ) -1 ] = '\0';
}
return dest;
}

const char* TrimRight( char *dest)
{
if (!dest ) return dest; //all done
int size = int(strlen( dest ));

// trim right
size--;
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
dest[ size] = '\0';
size--;
}
return dest;
}

const char* Trim( char *dest)
{
TrimLeft ( dest );
TrimRight ( dest );
return dest;
}

/////////////////////
// some test
/////////////////////
int main( int argc, char **argv )
{
char a[10+1];
strcpy( a, " 12345678 " );

char * b = new char[10+1];
strcpy( b, " 12345678 " );

Trim(a);
Trim(b);
// clean
delete [] b;

...
return 1;
}

/////////


Also I guess it is not really possible but I'll ask just in case, would
there be a way of trimming the memory allocated?
by that I mean if I do
char *a = new char[1024];
strcpy( a, " a " );
Trim( a);

would 'a' still have 1024 characters allocated to it or could it be
'trimmed' to 2 and free the rest of the memory?


Many thanks

Simon
 
M

Moonlit

Hi,

Simon said:
Hi,

I have written a function to trim char *, but I have been told that my way
could be dangerous and that I should use memmove(...) instead.
but I am not sure why my code could be 'dangerous' or even why there could
be a problem.

here is the code

////////
const char* TrimLeft( char *dest)
{
if (!dest ) return dest; //all done
size_t size = 0;
// trim left
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
for ( size_t loop = 0; loop < strlen( dest ) -1; loop++ )
dest[ loop] = dest[ loop +1];
dest[ strlen( dest ) -1 ] = '\0';
}
return dest;
}
Ouch that is a lot of code even in plain C :)

Well since this is a C++ group
#include <string>
#include <algorithm>

using namespace std;

string& TrimRight( string& String )
{
string::size_type Pos = String.find_last_not_of( ' ' ) + 1;
if( Pos < String.size() ) String.erase( Pos );
return String;
}

string& TrimLeft( string& String )
{
string::size_type Pos = String.find_first_not_of( ' ' ) ;
if( Pos < String.size() )
{
copy( String.begin() + Pos, String.end(), String.begin() );
String.erase( String.begin() + String.size() - Pos, String.end() );
}
return String;
}


Regards, Ron AF Greve.

const char* TrimRight( char *dest)
{
if (!dest ) return dest; //all done
int size = int(strlen( dest ));

// trim right
size--;
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
dest[ size] = '\0';
size--;
}
return dest;
}

const char* Trim( char *dest)
{
TrimLeft ( dest );
TrimRight ( dest );
return dest;
}

/////////////////////
// some test
/////////////////////
int main( int argc, char **argv )
{
char a[10+1];
strcpy( a, " 12345678 " );

char * b = new char[10+1];
strcpy( b, " 12345678 " );

Trim(a);
Trim(b);
// clean
delete [] b;

...
return 1;
}

/////////


Also I guess it is not really possible but I'll ask just in case, would
there be a way of trimming the memory allocated?
by that I mean if I do
char *a = new char[1024];
strcpy( a, " a " );
Trim( a);

would 'a' still have 1024 characters allocated to it or could it be
'trimmed' to 2 and free the rest of the memory?
No, you could try to new, copy and delete the original but usual the
performance penalty doesn't outweigh a possible but not guarenteed memory
gain.

Regards, Ron AF Greve
 
J

John Harrison

Simon said:
Hi,

I have written a function to trim char *, but I have been told that my way
could be dangerous and that I should use memmove(...) instead.
but I am not sure why my code could be 'dangerous' or even why there could
be a problem.

There is no problem I can see, except that your TrimLeft function is
inefficient. You should count the number of whitespace chars at the
beginning of the string and copy backward once only. The way you are doing
it if there are multiple whitespace chars at the beginning of your string
then you copy backward repeatedly.

The person you are talking is probably garbling advice about when to use
memcpy and when to use memmove, but thats not relevant to your case since
you aren't using memcpy.
here is the code

////////
const char* TrimLeft( char *dest)
{
if (!dest ) return dest; //all done
size_t size = 0;
// trim left
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
for ( size_t loop = 0; loop < strlen( dest ) -1; loop++ )
dest[ loop] = dest[ loop +1];
dest[ strlen( dest ) -1 ] = '\0';
}
return dest;
}

const char* TrimRight( char *dest)
{
if (!dest ) return dest; //all done
int size = int(strlen( dest ));

// trim right
size--;
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
dest[ size] = '\0';
size--;
}
return dest;
}

const char* Trim( char *dest)
{
TrimLeft ( dest );
TrimRight ( dest );
return dest;
}

/////////////////////
// some test
/////////////////////
int main( int argc, char **argv )
{
char a[10+1];
strcpy( a, " 12345678 " );

char * b = new char[10+1];
strcpy( b, " 12345678 " );

Trim(a);
Trim(b);
// clean
delete [] b;

...
return 1;
}

/////////


Also I guess it is not really possible but I'll ask just in case, would
there be a way of trimming the memory allocated?
by that I mean if I do
char *a = new char[1024];
strcpy( a, " a " );
Trim( a);

would 'a' still have 1024 characters allocated to it or could it be
'trimmed' to 2 and free the rest of the memory?

Use the std::string class instead, easier, safer, more efficient and real
C++. At the moment you are coding C not C++.Get a book on C++ that explains
about the standard C++ library, e.g. 'The C++ Standard Library' by Josuttis.

john
 
X

Xenos

Simon said:
Hi,

I have written a function to trim char *, but I have been told that my way
could be dangerous and that I should use memmove(...) instead.
but I am not sure why my code could be 'dangerous' or even why there could
be a problem.

here is the code

////////
const char* TrimLeft( char *dest)

you take in a char*, but return the same as a const char*. Did you really
mean to do that?

{
if (!dest ) return dest; //all done
size_t size = 0;
// trim left
while( size >= 0 && ( _istspace( dest[ size]) ||
dest[ size] == 10 ||
dest[ size] == 13))
{
for ( size_t loop = 0; loop < strlen( dest ) -1; loop++ )
dest[ loop] = dest[ loop +1];
dest[ strlen( dest ) -1 ] = '\0';
}
return dest;
}

Very inefficient. You have an O(n^3) algorithm (while * for * strlen) for
what should only be a linear one.


how about:

void trim_left(char* s)
{
size_t sz = strlen(s);
char* p = find(s, s + sz, not1(isspace)));
if (p != s && p != s + sz)
memmove(s, p, sz - (p - s) + 1);
}

or if you are just going to return the pointer, just return p instead of
moving the chars.
 
J

JKop

If you take in a char*, the simply set the appropriate char
to '\0', ending the string.

If you take in const char*, allocate an appropriate block
of memory, then copy.


-JKop
 
M

Mabden

Xenos said:
void trim_left(char* s)
{
size_t sz = strlen(s);
char* p = find(s, s + sz, not1(isspace)));
if (p != s && p != s + sz)
memmove(s, p, sz - (p - s) + 1);
}

or if you are just going to return the pointer, just return p instead of
moving the chars.

Cute idea, but that would make the buffer "smaller". If the calling
function expects to have 1024 chars to work with and thinks trim_left("
3 spaces"); has moved everything as far left as possible, then it thinks
it has 1016 chars left in the buffer, but it would really only have
1013.
 
H

Howard

JKop said:
If you take in a char*, the simply set the appropriate char
to '\0', ending the string.

Note: that only works for TrimRight, not for TrimLeft.
If you take in const char*, allocate an appropriate block
of memory, then copy.

The original functions given by the OP had non-const parameters, and
modified the strings in-place, so that's not what was wanted. To trim left
in-place, a memmove should be used (but preferably only once!).

(I wonder why those functions return a value, though, considering the return
value is ignored and the operations are done in-place...?)

-Howard
 
S

Simon

The original functions given by the OP had non-const parameters, and
modified the strings in-place, so that's not what was wanted. To trim left
in-place, a memmove should be used (but preferably only once!).

(I wonder why those functions return a value, though, considering the return
value is ignored and the operations are done in-place...?)

No reason really, I have removed the return values that were not really
needed in the first place.

Simon
 
S

Simon

Hi,
you take in a char*, but return the same as a const char*. Did you really
mean to do that?

Not really, I removed it as I have no use for it.
Could it have caused problems?
Very inefficient. You have an O(n^3) algorithm (while * for * strlen) for
what should only be a linear one.


how about:

void trim_left(char* s)
{
size_t sz = strlen(s);
char* p = find(s, s + sz, not1(isspace)));
if (p != s && p != s + sz)
memmove(s, p, sz - (p - s) + 1);
}

or if you are just going to return the pointer, just return p instead of
moving the chars.

Indeed.

Thanks

Simon
 
S

Simon

Hi,
Ouch that is a lot of code even in plain C :)

Well since this is a C++ group
#include <string>
#include <algorithm>

using namespace std;

string& TrimRight( string& String )
{
string::size_type Pos = String.find_last_not_of( ' ' ) + 1;
if( Pos < String.size() ) String.erase( Pos );
return String;
}

string& TrimLeft( string& String )
{
string::size_type Pos = String.find_first_not_of( ' ' ) ;
if( Pos < String.size() )
{
copy( String.begin() + Pos, String.end(), String.begin() );
String.erase( String.begin() + String.size() - Pos, String.end() );
}
return String;
}


Regards, Ron AF Greve.

I see your point but the function is used all over the place and changing it
to std::string(...) would be quite a challenge.

Simon
 
J

JKop

Okay here we go:

A) Trim off left

B) Trim off right

1) Takes in char*

2) Takes in const char*


[A1] Just return a char* that points to the starting
character

[A2] Just return a const char* that points to the starting
character

[B1] Set the char after the last character in the string to
'\0', returning a char*

[B2] Allocate new memory, copy it over, returning a char*
OR a const char*, whatever tickles your fancy.


-JKop
 
M

Moonlit

Hi,

red floyd said:
Simon wrote:

You need to handle the return value of string::npos in your code.

I just grabbed some old code and probably would have done it different now.

I agree with you that the code is theoratically not 100% correct, I should
have checked for string::npos (I check for a valid Pos instead). However it
is reasonable to assume that size_type is an unsigned type and npos is
static_cast<unsigned type>( -1 ). In theory its wrong in practice it works
and is efficient.

Regards, Ron AF Greve
 
S

Simon

Simon said:
I just grabbed some old code and probably would have done it different now.

I agree with you that the code is theoratically not 100% correct, I should
have checked for string::npos (I check for a valid Pos instead). However it
is reasonable to assume that size_type is an unsigned type and npos is
static_cast<unsigned type>( -1 ). In theory its wrong in practice it works
and is efficient.

Regards, Ron AF Greve

Sorry guys I really do not mean to be rude, but I am lost in my C world
here, what is the "npos" for and what are you guys talking about?
I know i am in a C++ group but i think i missed a message somewhere.

Simon
 
M

Moonlit

Hi,

Simon said:
Sorry guys I really do not mean to be rude, but I am lost in my C world
here, what is the "npos" for and what are you guys talking about?
I know i am in a C++ group but i think i missed a message somewhere.

string::npos

is returned when something can't be found. Sort of "Invalid position" could
be theoratically anything as long as it is not a valid position of course.
In practice it usually is the unsigned equivalent of -1.

So I should have written "Pos != string::npos" and change the function a
bit.


Regards, Ron AF Greve.
 
D

David Hilsee

Moonlit said:
Hi,



string::npos

is returned when something can't be found. Sort of "Invalid position" could
be theoratically anything as long as it is not a valid position of course.
In practice it usually is the unsigned equivalent of -1.

So I should have written "Pos != string::npos" and change the function a
bit.


Regards, Ron AF Greve.

std::basic_string<T>::npos is -1 (see 21.3), and, AFAIK, size_type is
required to be an unsigned integral type (see 20.1.5). This renders many
checks against std::string::npos unnecessary. Moonlit's code seems to be
doing more work than is necessary.
 
M

Moonlit

Hi,

David Hilsee said:
std::basic_string<T>::npos is -1 (see 21.3), and, AFAIK, size_type is
required to be an unsigned integral type (see 20.1.5). This renders many
checks against std::string::npos unnecessary. Moonlit's code seems to be

The complaint was, I didn't check against string::npos :)
You do have to check if something is found like I did. Could you write those
few lines even shorter?

Regards, Ron AF Greve
doing more work than is necessary.

Good you give us an example of shorter code.

(always trying to improve)

Regards, Ron AF Greve
 
D

David Hilsee

"Moonlit" <news moonlit xs4all nl> wrote in message
be

The complaint was, I didn't check against string::npos :)
You do have to check if something is found like I did. Could you write those
few lines even shorter?

Regards, Ron AF Greve


Good you give us an example of shorter code.

(always trying to improve)

Sure. Often times, there is no need to check against npos because the type
is unsigned and the value is -1 (well, which is translated to the largest
value that type can hold). Unsigned types have useful "wrap around"
properties when you perform computations on them. Also, some of
std::string's member functions have nice, expected, and well-defined
behavior when passed npos.

// #include <iostream>, <string>, etc
string& TrimRight( string& String ) {
// When find_last_not_of returns npos, Pos becomes 0
string::size_type Pos = String.find_last_not_of( ' ' ) + 1;
String.erase( Pos );
return String;
}

string& TrimLeft( string& String ) {
string::size_type Pos = String.find_first_not_of( ' ' ) ;
String.erase( 0, Pos );
return String;
}


int main() {
std::string testStrings[] = {
"test", " test2 ", "", " ", " a", " ", " a",
" a", "a", "a ", "a "
};
int numTests = sizeof(testStrings)/sizeof(testStrings[0]);
for ( int i = 0; i < numTests; ++i ) {
std::string orig = testStrings;
std::cout << "'" << orig << "':";
std::cout << " Left trim: '" << TrimLeft(orig) << "'";
orig = testStrings;
std::cout << " Right trim: '" << TrimRight(orig) << "'";
std::cout << std::endl;
}
return 0;
}

Output:
'test': Left trim: 'test' Right trim: 'test'
' test2 ': Left trim: 'test2 ' Right trim: ' test2'
'': Left trim: '' Right trim: ''
' ': Left trim: '' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' ': Left trim: '' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' a': Left trim: 'a' Right trim: ' a'
'a': Left trim: 'a' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'

BTW, the TrimLeft function you originally provided didn't handle the empty
strings the way I expected it should. You could have added an additional
check for npos to fix it, but it was best to remove the check against the
size of the string.

Original code's output (with main above):
'test': Left trim: 'test' Right trim: 'test'
' test2 ': Left trim: 'test2 ' Right trim: ' test2'
'': Left trim: '' Right trim: ''
' ': Left trim: ' ' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' ': Left trim: ' ' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' a': Left trim: 'a' Right trim: ' a'
'a': Left trim: 'a' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'
 
M

Moonlit

Hi,

David Hilsee said:
"Moonlit" <news moonlit xs4all nl> wrote in message
be

The complaint was, I didn't check against string::npos :)
You do have to check if something is found like I did. Could you write those
few lines even shorter?

Regards, Ron AF Greve


Good you give us an example of shorter code.

(always trying to improve)

Sure. Often times, there is no need to check against npos because the
type
is unsigned and the value is -1 (well, which is translated to the largest
value that type can hold). Unsigned types have useful "wrap around"
properties when you perform computations on them. Also, some of
std::string's member functions have nice, expected, and well-defined
behavior when passed npos.

// #include <iostream>, <string>, etc
string& TrimRight( string& String ) {
// When find_last_not_of returns npos, Pos becomes 0
string::size_type Pos = String.find_last_not_of( ' ' ) + 1;
String.erase( Pos );
return String;
}

string& TrimLeft( string& String ) {
string::size_type Pos = String.find_first_not_of( ' ' ) ;
String.erase( 0, Pos );
return String;
}


int main() {
std::string testStrings[] = {
"test", " test2 ", "", " ", " a", " ", " a",
" a", "a", "a ", "a "
};
int numTests = sizeof(testStrings)/sizeof(testStrings[0]);
for ( int i = 0; i < numTests; ++i ) {
std::string orig = testStrings;
std::cout << "'" << orig << "':";
std::cout << " Left trim: '" << TrimLeft(orig) << "'";
orig = testStrings;
std::cout << " Right trim: '" << TrimRight(orig) << "'";
std::cout << std::endl;
}
return 0;
}

Output:
'test': Left trim: 'test' Right trim: 'test'
' test2 ': Left trim: 'test2 ' Right trim: ' test2'
'': Left trim: '' Right trim: ''
' ': Left trim: '' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' ': Left trim: '' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' a': Left trim: 'a' Right trim: ' a'
'a': Left trim: 'a' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'

BTW, the TrimLeft function you originally provided didn't handle the empty
strings the way I expected it should. You could have added an additional
check for npos to fix it, but it was best to remove the check against the
size of the string.

Original code's output (with main above):
'test': Left trim: 'test' Right trim: 'test'
' test2 ': Left trim: 'test2 ' Right trim: ' test2'
'': Left trim: '' Right trim: ''
' ': Left trim: ' ' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' ': Left trim: ' ' Right trim: ''
' a': Left trim: 'a' Right trim: ' a'
' a': Left trim: 'a' Right trim: ' a'
'a': Left trim: 'a' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'
'a ': Left trim: 'a ' Right trim: 'a'

You code is indeed a lot shorter. I didn't know I could just pass npos to
those functions.

Well as they say a day without learning anything, is a day lost :)

Thanks, Ron AF Greve
 
J

Julie

Simon said:
Hi,

I have written a function to trim char *, but I have been told that my way
could be dangerous and that I should use memmove(...) instead.
but I am not sure why my code could be 'dangerous' or even why there could
be a problem.

This doesn't allocate or move memory, but operates on the source string:


char * TrimLeft(char * string)
{
while (string && *string && isspace(*string))
{
++string;
}
return string;
}

char * TrimRight(char * string)
{
char * end = (string && *string) ? &string[strlen(string)-1] : 0;
while (end && end>=string && isspace(*end))
{
*end = '\0';
}
return string;
}

char * Trim(char * string)
{
return TrimLeft(TrimRight(string));
}
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top