strlen in a for loop with malloc-ed char*

Ravi Uday

Unfortunately, if you do it "right", it does:

#include <string.h>

#ifdef strlen
#undef strlen
#endif
#define strlen(STR) my_strlen(STR)

But you just said that redefining any standard function is UB!
So is the 'right' code you posted actually correct?

size_t my_strlen(const char *s)
{
return 0;
}

int main (void)
{
strlen("test");

return 0;
}

compiles and a look at the preprocessor output shows that
strlen("test");
becomes
my_strlen("test");


Definitely! I'd even go as far and declare it a Bad Idea...

My requirement is simple.
I have been asked to replace all calls to 'strlen' with a user-defined
function, 'my_strlen'.
One way is to replace them all manually, which is very tedious
(more than 500 occurrences!).
The other way is to undefine strlen at the start of the .c/.h file and
redefine it as my_strlen.
Is there a way to accomplish this?

Thanks,
- Ravi
 
Michael Mair

Hiho,

Ravi said:
But you just said that redefining any standard function is UB!
So is the 'right' code you posted actually correct?

Umh, sorry, sometimes sarcasm and the like do not come across
as well as one would hope.
- I do not expect the
#ifdef strlen
#undef strlen
#endif
sequence to do anything, as strlen() usually is not #define'd
(you snipped that part of my and pete's answer). However, I wanted
to point out to pete and you, how it could be done, _even_ if I
think it terribly _wrong_.
I showed you how to do what you should _never_ do, for the mentioned
reasons (which you snipped as well).
- My code works in this example but does not work for the cases I
pointed out. Take for example the function body
return strlen(s);
inside my_strlen(): the macro rewrites it to my_strlen(s), which
leads to infinite recursion.
Just read once again what I have written.
My requirement is simple.
I have been asked to replace all calls to 'strlen' with a user-defined
function, 'my_strlen'.
One way is to replace them all manually, which is very tedious
(more than 500 occurrences!).
The other way is to undefine strlen at the start of the .c/.h file and
redefine it as my_strlen.
Is there a way to accomplish this?

Well, if you want my opinion: Find and replace all occurrences of
strlen which are a word (or have no a-zA-Z0-9_ at the beginning or end
of them) using sed or the editor of your choice.
Replace them either by a macro name or by my_strlen.
Nearly all good programming editors will permit you to use regular
expressions and handle all opened files or even all files of a type
in a directory in that way. This is a matter of five minutes:
four to try it out in one test document for all cases you can think of,
one to apply it to your source code and save it -- and these are
very generous time estimates.
<OT>
If you use nedit, TextPad or have a UNIX like environment, I will
happily provide you with an easy way to do it. Write me an e-mail.
Otherwise: Editors have manuals and most of them online help.
</OT>
If you work with an editor which cannot do that: Switch to something
useful.
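
For readers who want to see the word-boundary rule spelled out, here is a minimal sketch in C rather than sed. The function name replace_word() and its interface are made up for this illustration; it only shows the "no a-zA-Z0-9_ directly before or after" test and is not a substitute for a proper editor or sed run. It does not treat string literals or comments specially.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True if c can be part of a C identifier. */
static int is_ident_char(int c)
{
    return c == '_' || isalnum((unsigned char)c);
}

/* Hypothetical helper: copy src into dst (capacity dstsize), replacing
 * every whole-word occurrence of `from` with `to`.
 * Returns the number of replacements, or -1 if dst is too small. */
int replace_word(char *dst, size_t dstsize, const char *src,
                 const char *from, const char *to)
{
    size_t flen, tlen, out = 0;
    const char *p;
    int count = 0;

    assert(dst != NULL && src != NULL && from != NULL && to != NULL);
    flen = strlen(from);
    tlen = strlen(to);
    assert(flen > 0);

    for (p = src; *p != '\0'; ) {
        /* A match only counts if it is not glued to an identifier
         * character on either side. */
        int at_start = (p == src) || !is_ident_char(p[-1]);

        if (at_start && strncmp(p, from, flen) == 0
            && !is_ident_char(p[flen])) {
            if (out + tlen >= dstsize)
                return -1;
            memcpy(dst + out, to, tlen);
            out += tlen;
            p += flen;
            count++;
        } else {
            if (out + 1 >= dstsize)
                return -1;
            dst[out++] = *p++;
        }
    }
    if (out >= dstsize)
        return -1;
    dst[out] = '\0';
    return count;
}
```

Because "my_strlen" has an underscore directly before "strlen", an existing my_strlen is left alone, so running the replacement twice does no harm.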


Cheers
Michael
 
pete

Michael said:
Hiho,


Definitely! I'd even go as far and declare it a Bad Idea...

I think that it's undefined behavior regardless of whether or not
strlen is implemented as a macro.

N869
7.1.3 Reserved identifiers
[#1] Each header declares or defines all identifiers listed
in its associated subclause, and optionally declares or
defines identifiers listed in its associated future library
directions subclause and identifiers which are always
reserved either for any use or for use as file scope
identifiers.

-- Each macro name in any of the following subclauses
(including the future library directions) is reserved
for use as specified if any of its associated headers
is included; unless explicitly stated otherwise (see
7.1.4).
-- All identifiers with external linkage in any of the
following subclauses (including the future library
directions) are always reserved for use as identifiers
with external linkage.
 
Michael Mair

pete said:
Michael said:
Definitely! I'd even go as far and declare it a Bad Idea...

I think that it's undefined behavior regardless of whether or not
strlen is implemented as a macro.

N869
7.1.3 Reserved identifiers
[#1] Each header declares or defines all identifiers listed
in its associated subclause, and optionally declares or
defines identifiers listed in its associated future library
directions subclause and identifiers which are always
reserved either for any use or for use as file scope
identifiers.

-- Each macro name in any of the following subclauses
(including the future library directions) is reserved
for use as specified if any of its associated headers
is included; unless explicitly stated otherwise (see
7.1.4).
-- All identifiers with external linkage in any of the
following subclauses (including the future library
directions) are always reserved for use as identifiers
with external linkage.

Thank you; moreover, the standard has five items in the list
the last of which is IMO more "useful" in this case:

-- Each identifier with file scope listed in any of the following
subclauses (including the future library directions) is
reserved for use as a macro name and as an identifier with
file scope in the same name space if any of its associated
headers is included.


Cheers,
Michael
 
Ravi Uday

Well, if you want my opinion: Find and replace all occurrences of
strlen which are a word (or have no a-zA-Z0-9_ at the beginning or end
of them) using sed or the editor of your choice.
Replace them either by a macro name or by my_strlen.
Nearly all good programming editors will permit you to use regular
expressions and handle all opened files or even all files of a type
in a directory in that way. This is a matter of five minutes:
four to try it out in one test document for all cases you can think of,
one to apply it to your source code and save it -- and these are
very generous time estimates.
<OT>
If you use nedit, TextPad or have a UNIX like environment, I will
happily provide you with an easy way to do it. Write me an e-mail.
Otherwise: Editors have manuals and most of them online help.
</OT>

What's your e-mail address? I work on UNIX; you could mail me at the
address in my message.
 
Erik Trulsson

Yes, some compilers can do that optimization.
The compiler doesn't necessarily know what your strlen() does; after all
you can write it yourself!

If you have done an '#include <string.h>' previously the compiler is
allowed to assume that it is the standard strlen() which is used, and
the compiler does know what it does.
 
Chris Torek

Assuming all the ">" marks survived all the followups...

Yes, some compilers can do that optimization.

It is at least somewhat rare though. There are certainly some
popular compilers (such as gcc) that "know all about strlen", but
to pull a strlen() call out of a loop, they must also prove to
themselves that the memory area being strlen()-ed does not change
during the execution of that loop.
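
To make the hoisting concrete, here is a minimal sketch of what "pulling strlen() out of the loop" means when done by hand. The function names count_char_naive() and count_char_hoisted() are invented for this example. The hoisted version is equivalent only because the loop body never modifies the string, which is exactly the property a compiler would have to prove.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* strlen() sits in the loop condition, so unless the compiler can prove
 * the string is unchanged it may be evaluated on every iteration. */
size_t count_char_naive(const char *s, char c)
{
    size_t i, n = 0;

    assert(s != NULL);
    for (i = 0; i < strlen(s); i++)
        if (s[i] == c)
            n++;
    return n;
}

/* Same result, but the length is computed once before the loop. */
size_t count_char_hoisted(const char *s, char c)
{
    size_t i, n = 0, len;

    assert(s != NULL);
    len = strlen(s);
    for (i = 0; i < len; i++)
        if (s[i] == c)
            n++;
    return n;
}
```
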
If you have done an '#include <string.h>' previously the compiler is
allowed to assume that it is the standard strlen() which is used, and
the compiler does know what it does.

Indeed, the compiler is allowed (but not required) to know this
even if you have *not* included <string.h>, as long as you have not
also overridden the name. Programmers are not allowed to reuse that
name as an external-linkage identifier.

One of the interesting things that happens with gcc is that strlen()
is expanded inline on the x86, so that if you *try* to write your
own strlen(), it may never get called:

% cat t.c
#include <string.h>
size_t f(const char *s) { return strlen(s); }
% cat u.c
#include <stdlib.h>
#include <unistd.h> /* POSIX ONLY! */
size_t strlen(const char *s) {
const char *t;
write(1, "strlen called\n", 14); /* POSIX */
for (t = s; *t; t++)
continue;
return t - s;
}
% cat main.c
#include <stdio.h>
extern size_t f(const char *);
int main(void) {
size_t t = f("hello");
printf("f returned %d\n", (int)t);
return 0;
}
% cc -O t.c u.c main.c -W -Wall
% ./a.out
f returned 5
%

What happened to the write() call? A peek at the assembly code
for t.c gives the answer: there is no call to strlen(); instead,
there is just a "repnz scasb" instruction sequence.

What happens if you take out the "#include <string.h>"? Well, you
get an error on "size_t", but if you replace it with some other
header that defines size_t, gcc still replaces the strlen() with
a "repnz scasb". Is gcc in error? No: it is allowed to do this.
On the other hand, if you replace the #include line in t.c with
the entire contents of u.c, making the local strlen() "static":

% cat t.c
#include <stdlib.h>
#include <unistd.h> /* POSIX ONLY! */
static size_t strlen(const char *s) {
const char *t;
write(1, "strlen called\n", 14); /* POSIX */
for (t = s; *t; t++)
continue;
return t - s;
}
size_t f(const char *s) { return strlen(s); }
%

then compile main.c and this new t.c:

% cc -O t.c main.c -W -Wall
% ./a.out
strlen called
f returned 5
%

*now* your function gets called. (In fact, gcc will call it even
if you do not make it static, but then the behavior is undefined
-- it works only by luck. Whether this is "good luck" or "bad
luck" is not immediately obvious. :) )
 
Ravi Uday

One of the interesting things that happens with gcc is that strlen()
is expanded inline on the x86, so that if you *try* to write your
own strlen(), it may never get called:

% cat t.c
#include <string.h>
size_t f(const char *s) { return strlen(s); }
% cat u.c
#include <stdlib.h>
#include <unistd.h> /* POSIX ONLY! */
size_t strlen(const char *s) {
const char *t;
write(1, "strlen called\n", 14); /* POSIX */
for (t = s; *t; t++)
continue;
return t - s;
}
% cat main.c
#include <stdio.h>
extern size_t f(const char *);
int main(void) {
size_t t = f("hello");
printf("f returned %d\n", (int)t);
return 0;
}
% cc -O t.c u.c main.c -W -Wall
% ./a.out
f returned 5
%

What happened to the write() call? A peek at the assembly code
for t.c gives the answer: there is no call to strlen(); instead,
there is just a "repnz scasb" instruction sequence.

What happens if you take out the "#include <string.h>"? Well, you
get an error on "size_t", but if you replace it with some other
header that defines size_t, gcc still replaces the strlen() with
a "repnz scasb". Is gcc in error? No: it is allowed to do this.
On the other hand, if you replace the #include line in t.c with
the entire contents of u.c, making the local strlen() "static":

% cat t.c
#include <stdlib.h>
#include <unistd.h> /* POSIX ONLY! */
static size_t strlen(const char *s) {
const char *t;
write(1, "strlen called\n", 14); /* POSIX */
for (t = s; *t; t++)
continue;
return t - s;
}
size_t f(const char *s) { return strlen(s); }
%

then compile main.c and this new t.c:

% cc -O t.c main.c -W -Wall
% ./a.out
strlen called
f returned 5
%

*now* your function gets called. (In fact, gcc will call it even
if you do not make it static, but then the behavior is undefined
-- it works only by luck. Whether this is "good luck" or "bad
luck" is not immediately obvious. :) )

Chris,

So if you want to override a standard library function such as
strlen or fprintf, you just write your own function, make it *static*,
and keep the calls in the same file, and then the job is done?
Is that portable?

- Ravi
 
Chris Torek

So if you want to override a standard library function such as
strlen or fprintf, you just write your own function, make it *static*,
and keep the calls in the same file, and then the job is done?
Is that portable?

It is "done and portable" under particular conditions.

As a general rule, it is a bad idea in the first place -- instead
of re-using the "standard" name (like strlen or fprintf), *change
the name* when you change the meaning. A function named "my_strlen"
or "my_fprintf" tells someone reading and maintaining the code that
this is *not* the C Standard Library version of strlen() or fprintf().
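
As a rough sketch of the "change the name" approach: a wrapper with its own name can call the real strlen() without any macro tricks and without touching a reserved identifier. The call counter and my_strlen_call_count() are hypothetical additions, included only so the wrapper visibly does something extra.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counter, here only to show that the wrapper adds behaviour. */
static unsigned long my_strlen_calls;

/* A wrapper with its own name: it can call the standard strlen()
 * directly, and nothing from <string.h> is redefined. */
size_t my_strlen(const char *s)
{
    assert(s != NULL);
    my_strlen_calls++;
    return strlen(s);
}

/* Hypothetical accessor for the counter above. */
unsigned long my_strlen_call_count(void)
{
    return my_strlen_calls;
}
```

Callers then write my_strlen("test") explicitly, so anyone reading the code can see that this is not the library function.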
 
Flash Gordon

It is "done and portable" under particular conditions.

As a general rule, it is a bad idea in the first place -- instead
of re-using the "standard" name (like strlen or fprintf), *change
the name* when you change the meaning. A function named "my_strlen"
or "my_fprintf" tells someone reading and maintaining the code that
this is *not* the C Standard Library version of strlen() or fprintf().

To the OP (I'm sure Chris knows this), something more meaningful and less
likely to clash with other code than my_ would be better. For instance,
where I work we have a library where everything (not just alternatives
to standard library functions) is prefixed by "ff", which is short for
ForFront, the company which originally wrote the stuff. This includes
wrapper functions ffmalloc, ffcalloc, ffrealloc, fffree and lots of
others. They might conflict with a library by Mr Fred Furnackapan, but
not with everyone who writes their own stuff and uses my_.
 
Ravi

It is "done and portable" under particular conditions.

As a general rule, it is a bad idea in the first place -- instead
of re-using the "standard" name (like strlen or fprintf), *change
the name* when you change the meaning. A function named "my_strlen"
or "my_fprintf" tells someone reading and maintaining the code that
this is *not* the C Standard Library version of strlen() or fprintf().


To the OP (I'm sure Chris knows this), something more meaningful and less
likely to clash with other code than my_ would be better. For instance,
where I work we have a library where everything (not just alternatives
to standard library functions) is prefixed by "ff", which is short for
ForFront, the company which originally wrote the stuff. This includes
wrapper functions ffmalloc, ffcalloc, ffrealloc, fffree and lots of
others. They might conflict with a library by Mr Fred Furnackapan, but
not with everyone who writes their own stuff and uses my_.

Yes, correct, but my problem was this: some time back I was given the
task of removing all the fprintf calls in some 5 files and replacing
them with 'bug_printf' because of some tty problem during initialization.
So I was wondering whether we could just redefine the standard 'fprintf'
(rather than doing the textual replacement most of you have suggested)
in every file that needs the change, and make it
static (as Chris suggested) in each of them.

Something like this:
File a.c: L20-

static int fprintf(FILE *stream, const char *format, ...)
{

if (format)
{
/* do stuff ..*/
/* get the variable list of parameters and pass to bug_printf */
return bug_printf (param1, param2, ...);
}
return 0;
}

File a.c: L200 -

//caller would continue to call:
fprintf (stderr, "Error generated %s\n", strerror (errno));

Will this work !

- Ravi
 
Michael Mair

Ravi said:
Yes, correct, but my problem was this: some time back I was given the
task of removing all the fprintf calls in some 5 files and replacing
them with 'bug_printf' because of some tty problem during initialization.
So I was wondering whether we could just redefine the standard 'fprintf'
(rather than doing the textual replacement most of you have suggested)
in every file that needs the change, and make it
static (as Chris suggested) in each of them.

Something like this:
File a.c: L20-

static int fprintf(FILE *stream, const char *format, ...)
{

if (format)
{
/* do stuff ..*/
/* get the variable list of parameters and pass to bug_printf */
return bug_printf (param1, param2, ...);

This will not work. Get the va_list via va_start() and pass it to some
"v" version of bug_printf().

}
return 0;
}

File a.c: L200 -

//caller would continue to call:
fprintf (stderr, "Error generated %s\n", strerror (errno));

Will this work !
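
The va_list forwarding mentioned in the inline comment can be sketched roughly as follows. The real bug_printf() is not shown in the thread, so everything here is an assumption: bug_vprintf() stands in for the suggested "v" version and simply formats into a buffer, and the wrapper is called bug_fprintf() rather than fprintf so that no standard name is reused.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical sink: stands in for whatever bug_printf() really does.
 * Here the message is just formatted into a buffer. */
static char bug_buffer[256];

/* Hypothetical "v" variant: takes a va_list that the caller has
 * already started with va_start(). */
int bug_vprintf(const char *format, va_list ap)
{
    assert(format != NULL);
    return vsnprintf(bug_buffer, sizeof bug_buffer, format, ap);
}

/* Wrapper with the same parameter list as fprintf() but its own name.
 * The stream argument is ignored in this sketch. */
int bug_fprintf(FILE *stream, const char *format, ...)
{
    va_list ap;
    int n;

    (void)stream;
    if (format == NULL)
        return 0;
    va_start(ap, format);       /* collect the variable arguments */
    n = bug_vprintf(format, ap); /* forward them as a va_list */
    va_end(ap);
    return n;
}

/* Hypothetical accessor so the last formatted message can be inspected. */
const char *bug_last_message(void)
{
    return bug_buffer;
}
```

A variadic function cannot pass its "..." on to another variadic function directly, which is why the va_list-taking variant is needed.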

Well, we answered you when you first asked this or an equivalent
question and told you why things like that are Bad Ideas. I even
sent you something via PM to help with the replacing the function
calls in question in the source code and got no real feedback.

So, why exactly are you warming up this stuff again if you already
know that it is considered dangerous and will break at the most
inconvenient time?

For the information of the rest: Have a look at the thread where
this
http://groups.google.de/groups?as_q=replace&as_ugroup=comp.lang.c&as_uauthors=ravi&as_qdr=m3
is found. There, it was strlen().


-Michael
 
Ravi

Well, we answered you when you first asked this or an equivalent
question and told you why things like that are Bad Ideas. I even
sent you something via PM to help with the replacing the function
calls in question in the source code and got no real feedback.
True, but your thing didn't work, as I use csh and not bash!
So, why exactly are you warming up this stuff again if you already
know that it is considered dangerous and will break at the most
inconvenient time?

For the information of the rest: Have a look at the thread where
this
http://groups.google.de/groups?as_q=replace&as_ugroup=comp.lang.c&as_uauthors=ravi&as_qdr=m3

is found. There, it was strlen().
Mmm.. ok.. thanks.
 
Michael Mair

Ravi said:
True, but your thing didn't work, as I use csh and not bash!

... and I wrote to you to just start bash if you do not normally use
it, or to try sh. Apart from that: the difference is only in the
handling of the variables, not in the sed regexp, which is the core
of the problem.

To find out whether you have bash or sh, type "which bash" (or "which sh")
at the prompt. Start it by typing "bash" or "sh". Try my
approach. If it still does not work, write me an e-mail.
And of course you can change the thing to work on something
other than strlen.


-Michael
 
