strlen in a for loop with malloc-ed char*


apropo

What is wrong with this code? Someone told me that calling strlen in the
for loop is BAD practice, but I don't get exactly why. Could anyone
explain it to me in plain English, please?

char *reverse(char *s)
{
    int i;
    char *r;
    if(!s) return NULL;//ERROR
    r=calloc(strlen(s)+1,sizeof(char));
    if(!r) return NULL;
    for(i=0;i<strlen(s);i++)
        *(r+i) = *(s+strlen(s)-1-i);
    return r;
}
 
Nils O. Selåsdal

What is wrong with this code? Someone told me that calling strlen in the
for loop is BAD practice, but I don't get exactly why. Could anyone
explain it to me in plain English, please?

You calculate strlen() many times, which really isn't needed. Compute it
once and reuse the result:

char *reverse(char *s)
{
    size_t i, len;
    char *r;
    if(!s) return NULL;//ERROR
    len = strlen(s); /* computed once, before the loop */
    r=calloc(len+1,sizeof(char));
    if(!r) return NULL;
    for(i=0;i<len;i++)
        *(r+i) = *(s+len-1-i);
    return r;
}
 

David Resnick

apropo said:
What is wrong with this code? Someone told me that calling strlen in the
for loop is BAD practice, but I don't get exactly why. Could anyone
explain it to me in plain English, please?

char *reverse(char *s)
{
int i;
char *r;
if(!s) return NULL;//ERROR
r=calloc(strlen(s)+1,sizeof(char));
if(!r) return NULL;
for(i=0;i<strlen(s);i++)
*(r+i) = *(s+strlen(s)-1-i);
return r;
}

Computing something within a loop that is invariant is wasteful.
The length of "s" isn't changed by this function, so you might
consider taking it in as a const char * to reflect that. Anyway,
more normal would be to do:

size_t len = strlen(s);

And then use len where you are using strlen(s). It is possible
that the optimizer might fix this for you, but why ask for
trouble?

The code isn't "wrong", just burns CPU where it need not.
The calloc is also slightly wasteful, in that you are zeroing
out the memory you are about to fill in just to get the nul
termination, you could just have r[len] = '\0'; after mallocing it.
Whatever.

-David
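
The suggestions in this post can be put together as a minimal sketch. It assumes the caller frees the returned buffer; that ownership rule is an assumption of the sketch and is not stated in the thread. It takes a const char *, calls strlen once, and uses malloc plus an explicit terminator instead of calloc.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated reversed copy of s, or NULL if s is NULL
   or the allocation fails. The caller is assumed to free the result. */
char *reverse(const char *s)
{
    size_t i, len;
    char *r;

    if (s == NULL)
        return NULL;

    len = strlen(s);        /* computed once, outside the loop */
    r = malloc(len + 1);    /* no zero fill: every byte is written below */
    if (r == NULL)
        return NULL;

    for (i = 0; i < len; i++)
        r[i] = s[len - 1 - i];
    r[len] = '\0';          /* explicit terminator instead of calloc */

    return r;
}
```

Using size_t for the index also avoids comparing a signed int against the unsigned result of strlen.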
 

Default User

David Resnick wrote:

Computing something within a loop that is invariant is wasteful.
The length of "s" isn't changed by this function, so you might
consider taking it in as a const char * to reflect that.


Are some compilers sophisticated enough to be able to detect this
condition and optimize away that strlen() call?



Brian
 

Malcolm

apropo said:
for(i=0;i<strlen(s);i++)
*(r+i) = *(s+strlen(s)-1-i);
Imagine that the string s is huge. Then remember how the computer will
implement strlen() (write the function yourself, if you are not sure).

Are you familiar with Big O notation for algorithms? It tells you how the
number of operations grows with the size of the input. For instance,
multiplying by repeated addition is O(N): you add the larger number to itself
as many times as the value of the smaller one. The long multiplication you
were taught at primary school needs only O(log N) steps of that kind, since
the number of partial products goes up with the number of digits. Repeated
addition is OK for numbers up to ten or so, but for large numbers the primary
school way is far superior.

Similarly, your code is OK for fairly short strings, but for long ones you
will slow things down considerably. Can you work out its Big O notation?
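
For anyone taking up the suggestion to write the function yourself, here is a minimal sketch of one way strlen could be written. The name my_strlen is made up for illustration, and real library versions are often more heavily optimized, but any version has to look at every character up to the terminator.

```c
#include <stddef.h>

/* Hypothetical hand-written equivalent of strlen for a valid string. */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')    /* one comparison per character */
        n++;

    return n;
}
```

The point to notice is that a single call already costs time proportional to the length of the string.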
 

Robert Harris

Default said:
Are some compilers sophisticated enough to be able to detect this
condition and optimize away that strlen() call?

The compiler doesn't necessarily know what your strlen() does; after all
you can write it yourself!

Robert
 

Arthur J. O'Dwyer

The compiler doesn't necessarily know what your strlen() does; after
all you can write it yourself!

Not in standard C. IIRC, attempting to redefine a standard library
function invokes undefined behavior. (And AFAIK it's quite possible that
the above optimization is the /only/ reason such redefinitions were made
undefined.)

-Arthur
 

Default User

Robert said:
The compiler doesn't necessarily know what your strlen() does; after
all you can write it yourself!

What do you mean? It knows its signature:

size_t strlen(const char *s)

Therefore it knows that it doesn't change s. It also knows what
strlen() does (as it is a standard library function).



Brian
 

Ben Pfaff

Default User said:
What do you mean? It knows its signature:

size_t strlen(const char *s)

Therefore it knows that it doesn't change s.

Not true. A function parameter's const-ness does not allow the
compiler to assume that the function will not modify the
argument. It is perfectly valid for it to cast away the
const-ness and modify it anyway (as long as the object that it
refers to is not actually defined as const). `const' in function
parameters is a social contract between the caller and the
callee, not a constraint made by the standard.
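
A minimal sketch of that point follows. The function sneaky_length is hypothetical: its prototype takes a const char *, yet it casts the qualifier away and writes through the pointer. This is well defined only when the object the caller passes was not itself defined const (and is not a string literal).

```c
#include <stddef.h>

/* Hypothetical function: const in the prototype, but it writes anyway. */
size_t sneaky_length(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;

    if (n > 0)
        ((char *)s)[0] = 'X';   /* valid only if the object is not const */

    return n;
}
```

Called on a plain array such as char buf[] = "hello", this returns 5 and leaves buf holding "Xello". Called on a string literal or a const-defined array, the write is undefined behavior.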
 

Default User

Ben said:
Not true. A function parameter's const-ness does not allow the
compiler to assume that the function will not modify the
argument. It is perfectly valid for it to cast away the
const-ness and modify it anyway (as long as the object that it
refers to is not actually defined as const). `const' in function
parameters is a social contract between the caller and the
callee, not a constraint made by the standard.

Ok, good point. Looking at n689, it specifies what strlen() does, but
doesn't say that it is forbidden to change the string. It would be a
pretty perverse implementation with a poor QOS.



Brian
 

bd

apropo said:
What is wrong with this code? Someone told me that calling strlen in the
for loop is BAD practice, but I don't get exactly why. Could anyone
explain it to me in plain English, please?

char *reverse(char *s)
{
int i;
char *r;
if(!s) return NULL;//ERROR
r=calloc(strlen(s)+1,sizeof(char));
if(!r) return NULL;
for(i=0;i<strlen(s);i++)

Every time you go through the loop, you get the length of the string again.
Since it's not going to change, this is a huge waste of time (turns an O(n)
algorithm into an O(n^2) !). Save strlen(s) to a local variable once, then
use that instead.
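
The difference can be made visible with a small instrumented sketch. Everything here is made up for illustration: counting_strlen is a hand-written stand-in for strlen that counts how many characters it examines, and cost_in_loop and cost_hoisted report that count for the two loop styles.

```c
#include <stddef.h>

static unsigned long chars_examined;    /* instrumentation only */

/* Stand-in for strlen that counts every character it looks at. */
static size_t counting_strlen(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0') {
        n++;
        chars_examined++;
    }
    return n;
}

/* Length recomputed in the loop condition on every iteration. */
unsigned long cost_in_loop(const char *s)
{
    size_t i;

    chars_examined = 0;
    for (i = 0; i < counting_strlen(s); i++)
        ;   /* body omitted: only the condition is being measured */
    return chars_examined;
}

/* Length computed once and saved in a local variable. */
unsigned long cost_hoisted(const char *s)
{
    size_t i, len;

    chars_examined = 0;
    len = counting_strlen(s);
    for (i = 0; i < len; i++)
        ;
    return chars_examined;
}
```

For a 10-character string, cost_in_loop reports 110 characters examined (the condition runs 11 times over 10 characters) against 10 for cost_hoisted. The original code also calls strlen in the loop body, so it does roughly twice the work of cost_in_loop.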
 

Thomas Stegen

Default said:
Ok, good point. Looking at n689, it specifies what strlen() does, but
doesn't say that it is forbidden to change the string. It would be a
pretty perverse implementation with a poor QOS.

A standard compliant strlen is not allowed to change the string.
If a function can do whatever it wants as long as the standard
does not specify that it cannot then it is impossible to write a
strictly conforming program that uses any standard functions.

In fact the standard only specifies what operators do, not what
they cannot do! It would in fact be impossible to write any strictly
conforming program at all with the possible exception of

int main(void)
{
return 0;
}

But then again, the standard does not specify completely what return
cannot do.

Relating to the point above. The compiler can optimise out calls to
strlen because it knows the implementation. In general it cannot do this
with functions unless it can deduce that no side effects are taking
place and that is impossible by just looking at the prototype. It is
possible (sometimes) if the compiler has access to the implementation of
a function.
 

Ravi Uday

Arthur J. O'Dwyer said:
Not in standard C. IIRC, attempting to redefine a standard library
function invokes undefined behavior. (And AFAIK it's quite possible that
the above optimization is the /only/ reason such redefinitions were made
undefined.)

OK, so if I want to use my own 'strlen' (my_strlen) instead of the standard
one (or any function defined by the standard), can I not do that?

Something like
#ifdef strlen
#undef strlen
#define strlen my_strlen

I can't afford to change all the places where I have called strlen!

- Ravi
 
Nils O. Selåsdal

Thomas said:
Relating to the point above. The compiler can optimise out calls to
strlen because it knows the implementation. In general it cannot do this
with functions unless it can deduce that no side effects are taking
place and that is impossible by just looking at the prototype. It is
possible (sometimes) if the compiler has access to the implementation of
a function.

It would also have to apply some heuristics to determine that strlen
could be called fewer times than the source asks for. This can
quickly become nontrivial.
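
One reason the analysis is nontrivial: the loop body may change the string, in which case moving strlen out of the loop would change the program's behavior. A minimal sketch with a made-up function, cut_at_space:

```c
#include <string.h>

/* Hypothetical example: cuts s at its first space and returns the
   number of loop iterations that were executed. */
size_t cut_at_space(char *s)
{
    size_t i;

    for (i = 0; i < strlen(s); i++) {
        if (s[i] == ' ')
            s[i] = '\0';    /* shortens the string, so strlen(s) changes */
    }
    return i;
}
```

On "hello world" the loop stops after 6 iterations, because the string has become "hello". With the length saved before the loop it would run 11 times, so the compiler has to prove that nothing in the loop writes to the string before it can drop the repeated calls.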
 

pete

Ravi Uday wrote:
OK, so if I want to use my own 'strlen'
(my_strlen) instead of the standard
one (or any function defined by the standard), can I not do that?

Something like
#ifdef strlen
#undef strlen
#define strlen my_strlen

I can't afford to change all the places where I have called strlen!

That doesn't do anything if strlen isn't implemented as a macro.
It doesn't have to be and I think that strlen usually isn't.

I don't know if redefining standard macros yields undefined behavior.
It seems like a bad idea.
 

Michael Mair

Hiho,

Umh, the standard says strlen is a function so you cannot do this.
Apart from that: If you "hide" this nice trick of yours in a
general header file and other people are not aware of that then
they expect strlen (read: my_strlen) to behave like strlen (read:
strlen). If you really want to do something along these lines,
use a macro with a name like STRLEN or even a function pointer
called, for example, StrLenPtr. Thus, your intent becomes clearer.
Do not use names starting with str as these are reserved for
future library extensions.
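
A minimal sketch of the function pointer variant, under the assumption that call sites can be changed once to go through the pointer. StrLenPtr is the name suggested above; my_strlen and use_my_strlen are made-up helpers for the sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical replacement implementation. */
size_t my_strlen(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* Call sites use StrLenPtr(...); it defaults to the library strlen. */
size_t (*StrLenPtr)(const char *) = strlen;

/* Switches every call made through StrLenPtr over to my_strlen. */
void use_my_strlen(void)
{
    StrLenPtr = my_strlen;
}
```

A reader who sees StrLenPtr("test") knows at once that it may not be the library function, which is the clarity of intent being argued for here.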

That doesn't do anything if strlen isn't implemented as a macro.

Unfortunately, if you do it "right", it does:

#include <string.h>

#ifdef strlen
#undef strlen
#endif
#define strlen(STR) my_strlen(STR)

size_t my_strlen(const char *s)
{
return 0;
}

int main (void)
{
strlen("test");

return 0;
}

compiles and a look at the preprocessor output shows that
strlen("test");
becomes
my_strlen("test");

You can have very funny effects if you use my_strlen() in very
rare cases as wrapper for strlen() but #define strlen before
you define my_strlen(). Then, everything works most of the time
but in some cases you have either recursive calls to my_strlen
without end or seemingly arbitrary results or whatever.
Nice trap.

It doesn't have to be and I think that strlen usually isn't.
Yep.

I don't know if redefining standard macros yields undefined behavior.
It seems like a bad idea.

Definitely! I'd even go as far and declare it a Bad Idea...


Cheers
Michael
 

Michael Wojcik

What do you mean? It knows its signature:

size_t strlen(const char *s)

Therefore it knows that it doesn't change s.

The special case of standard functions aside, and the case of casting
away constness aside, this still isn't sufficient to optimize away
the call to a function. The function could have side effects besides
altering its parameters.

The *only* reason why an implementation might be able to optimize
away calls to standard functions is because it knows exactly what
they do. That's the only sufficient guarantee. (Note that this
also requires the restriction against redefining them.)

Of course, an implementation could introduce an extension - such as
a pragma - to denote "pure" functions which have no side effects and
make no use of external state and so could be similarly optimized
away when called repeatedly with the same parameters.[1] However,
standard C has no such facility.


1. Note that a function with no side effects must be a function only
of its input. Maintaining any kind of state between calls would be
a side effect. The restriction against using external state is
necessary to exclude things like clock(), which has no side effects
(except indirectly the one of taking non-zero execution time) but
should not be optimized away, as its return value changes over time.
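
One existing extension of this kind is GCC's "pure" function attribute, spelled as an attribute instead of a pragma. The sketch below is only an illustration: PURE_FN, text_length, and count_spaces are made-up names, and the attribute is a hint that permits, but does not force, merging repeated calls. GCC's "pure" still lets the function read memory through its pointer argument, so it is slightly looser than the definition in the footnote.

```c
#include <stddef.h>

/* Non-standard extension: expands to nothing on other compilers. */
#if defined(__GNUC__)
#define PURE_FN __attribute__((pure))
#else
#define PURE_FN
#endif

/* Declared pure: no side effects, result depends only on what it reads. */
PURE_FN size_t text_length(const char *s);

size_t text_length(const char *s)
{
    size_t n = 0;

    while (s[n] != '\0')
        n++;
    return n;
}

/* Counts the spaces in s. The loop never writes to s, so a compiler
   that honors the attribute may evaluate text_length(s) only once. */
size_t count_spaces(const char *s)
{
    size_t i, spaces = 0;

    for (i = 0; i < text_length(s); i++) {
        if (s[i] == ' ')
            spaces++;
    }
    return spaces;
}
```

Whether the call is actually hoisted depends on the compiler and the optimization level, so saving the length in a local variable remains the portable way to get the guarantee.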

--
Michael Wojcik (e-mail address removed)

The surface of the word "profession" is hard and rough, the inside mixed with
poison. It's this that prevents me crossing over. And what is there on the
other side? Only what people longingly refer to as "the other side".
-- Tawada Yoko (trans. Margaret Mitsutani)
 

Randy Howard

Default User said:
Are some compilers sophisticated enough to be able to detect this
condition and optimize away that strlen() call?

Not any of the majors, including gcc, lcc-win32 and MSVC. See a recent
thread on loop invariants in comp.programming for empirical data based
upon this same question.
 

Randy Howard

Not any of the majors, including gcc, lcc-win32 and MSVC. See a recent
thread on loop invariants in comp.programming for empirical data based
upon this same question.

Correction, I didn't mean to imply that lcc is a "major" compiler, it
just happened to come up near the end of the thread and I included it
as a result.
 
