my search and replace function

P

pembed2003

Hi all,
I need to write a function to search and replace part of a char*
passed in to the function. I came up with the following:

char* search_and_replace(char* source,char search,char* replace){
char* result;
size_t l = strlen(source), r = strlen(replace), i;
int number_of_replaces = 0;
for(i = 0; i < l; i++){
if(source == search)
number_of_replaces++;
}
result = malloc((l - number_of_replaces) + number_of_replaces *
r + 1);
i = 0;
while(i < l){
if(source == search){
int j;
for(j = 0; j < r; j++){
result = replace[j];
i++;
}
}else{
result = source;
i++;
}
}
result = 0;
return result;
}

later I can then use it like:

char source[] = "1234?5678?9";
char search = '?';
char* replace = "***";

char* result = search_and_replace(source,search,replace);
free(result);

and I will end up result being "1234***5678***9"

Other than the lack of some error checkings, the above function seems
to work ok but looks like it's doing alot of work. Can you think of
anyway to improve the speed of the function?

Thanks!
 
E

Eric Sosman

pembed2003 said:
Hi all,
I need to write a function to search and replace part of a char*
passed in to the function. I came up with the following:
[code snipped; see up-thread]

later I can then use it like:

char source[] = "1234?5678?9";
char search = '?';
char* replace = "***";

char* result = search_and_replace(source,search,replace);
free(result);

and I will end up result being "1234***5678***9"

You will? That's not the result I'd expect to get.
Have you actually tried running this?
Other than the lack of some error checkings, the above function seems
to work ok but looks like it's doing alot of work. Can you think of
anyway to improve the speed of the function?

It does not "work ok." Full stop. Any attempt to
improve the speed simply arrives at the wrong answer sooner.
If that's your goal, the function can be made *much* faster.
 
P

pembed2003

Eric Sosman said:
pembed2003 said:
Hi all,
I need to write a function to search and replace part of a char*
passed in to the function. I came up with the following:
[code snipped; see up-thread]

later I can then use it like:

char source[] = "1234?5678?9";
char search = '?';
char* replace = "***";

char* result = search_and_replace(source,search,replace);
free(result);

and I will end up result being "1234***5678***9"

You will? That's not the result I'd expect to get.
Have you actually tried running this?
Other than the lack of some error checkings, the above function seems
to work ok but looks like it's doing alot of work. Can you think of
anyway to improve the speed of the function?

It does not "work ok." Full stop. Any attempt to
improve the speed simply arrives at the wrong answer sooner.
If that's your goal, the function can be made *much* faster.

Hi Eric,
Thanks for your comment and yes it does NOT work. Sorry about the
false statement. I have made some change to it and the function now
become:

char* search_and_replace(char* source,char search,char* replace){

char* result;

size_t l = strlen(source), r = strlen(replace), i, k;

int number_of_replaces = 0;

for(i = 0; i < l; i++){
if(source == search)
number_of_replaces++;
}

result = malloc((l - number_of_replaces) +
number_of_replaces * r + 1);
i = 0; k = 0;

while(k < l){
if(source[k] == search){
int j;
for(j = 0; j < r; j++){
result[i++] = replace[j];
}
}else{
result[i++] = source[k];
}
k++;
}

result = 0;
return result;
}

Can you tell me how I can improve the function in terms of speed?

Thanks!
 
E

Eric Sosman

pembed2003 said:
Eric Sosman said:
pembed2003 said:
Hi all,
I need to write a function to search and replace part of a char*
passed in to the function. I came up with the following:
[code snipped; see up-thread]

later I can then use it like:

char source[] = "1234?5678?9";
char search = '?';
char* replace = "***";

char* result = search_and_replace(source,search,replace);
free(result);

and I will end up result being "1234***5678***9"

You will? That's not the result I'd expect to get.
Have you actually tried running this?

Other than the lack of some error checkings, the above function seems
to work ok but looks like it's doing alot of work. Can you think of
anyway to improve the speed of the function?

It does not "work ok." Full stop. Any attempt to
improve the speed simply arrives at the wrong answer sooner.
If that's your goal, the function can be made *much* faster.


Hi Eric,
Thanks for your comment and yes it does NOT work. Sorry about the
false statement. I have made some change to it and the function now
become:

I haven't tested it, but it looks like you've corrected
the obvious error in the first version.

The C language Standard says nothing about speed, and
different implementations on different machines will behave --
well, differently. The suggestions below are not guaranteed
to improve the speed, but might be worth testing. There are
also a few stylistic and "good practices" suggestions not
related to possible speed improvements.
char* search_and_replace(char* source,char search,char* replace){

The function does not modify either of the strings pointed
to by the first and third arguments, so I'd suggest declaring
them as `const char *source' and `const char *replace'.
char* result;

size_t l = strlen(source), r = strlen(replace), i, k;

int number_of_replaces = 0;

for(i = 0; i < l; i++){
if(source == search)
number_of_replaces++;
}


Another way to search for the replacement character
would be to use the strchr() function in a loop, more
or less like this:

const char *p;
for (p = source; (p = strchr(p, search)) != NULL; ++p)
++number_of_replaces;

This might or might not be faster. At a guess, the
greatest speed advantage (if any) would probably occur
with very long `source' strings containing very few
`search' characters.
result = malloc((l - number_of_replaces) +
number_of_replaces * r + 1);

malloc() can return NULL if it fails to find enough
memory, and you should check for this possibility. As
the code stands, you've got a bug waiting to happen.
i = 0; k = 0;

while(k < l){
if(source[k] == search){
int j;
for(j = 0; j < r; j++){
result[i++] = replace[j];
}

Instead of copying the `replace' string one character
at a time, it might be faster to use memcpy():

memcpy(result + i, replace, r);
i += r;

At a guess, this would probably not improve matters
for a short `replace' string but might be faster if `replace'
is fairly long.
}else{
result[i++] = source[k];
}
k++;
}

The outer loop could be rewritten to use strchr(), as
above but with a little extra bookkeeping to remember how
many non-`search' characters need copying. And the copying
itself could be done with memcpy() instead of one character
at a time. If there's any speed advantage to doing this,
I'd expect it to occur under much the same conditions as
before: a long `source' with few appearances of `search'.
result = 0;
return result;
}

Can you tell me how I can improve the function in terms of speed?


Before you go running off and start coding I'd like to
draw your attention to a lesson that this experience should
already have taught you and to another lesson long experience
has taught others.

First, consider whether your "need for speed" is a Good
Thing or a Bad Thing. The salient fact here is that the
first version of your code was flat-out wrong, R-O-N-G, wrong.
Yet your enthusiasm for greater speed completely blinded you
to this little matter ... On balance, one would have to say
that your devotion to what Henry Spencer calls The Little Tin
God is *not* a Good Thing, at least not at this point in your
development. I recommend some self-examination.

Second, it has been observed -- it has even been tested
under laboratory conditions -- that even very good programmers
are startlingly bad at predicting which parts of their programs
will consume the most time. Decades of experience on thousands
of programming projects have come up with an outline of how to
get the best performance:

0. Make sure you're writing the right program. For
example, do you really want a function that replaces
single characters with strings, or would it be more
useful to replace strings with strings?

1. Choose appropriate algorithms. If you must search a
table of five or six entries, just loop through an
array -- but if the table has five or six thousand
entries you'd better use another approach entirely.

2. Write the program, implementing the chosen algorithms.
The focus here should be on simplicity, correctness,
and robustness. If the word "speed" so much as pops
into your head, go take a nap.

3. Debug the program. I've been programming professionally
for three decades, yet I nearly always have at least
one bug in any program longer than about a thousand
lines of new-minted code. If you're ten times as good
as I am, you will still write bugs and need to fix them.
Fix them.

4. *Measure* the performance of your program. If the
performance is adequate, *stop*. "If it ain't broke,
don't fix it." You have better things to do than gild
lilies.


5. *If* the performance is inadequate, use a profiler or
other tools to *measure* where the time is going. You
may study a piece of code and find that optimization
could eliminate ninety percent of its running time --
but if it only consumes ten microseconds per run of the
program, who cares? When you've identified a piece of
slow code, ask yourself this: "If I could eliminate *all*
the time consumed by this fragment, how fast would the
program run?" If the answer is "Still not fast enough,"
you need to look elsewhere.

6. Optimize the code sections whose unacceptable performance
you have *measured*, and then *measure* the improvement.
You may be surprised: Sometimes an "optimization" makes
things worse -- but if you fail to measure both before
and after, you'll never know.

7. Immediately after optimizing, go all the way back to
Step Three. Herein fail not. I have seen plenty of
working programs broken by "optimization;" I have
broken more than a few myself.

Donald Knuth wrote that "Premature optimization is the
root of all evil." Michael Jackson expressed a similar idea
in the form of a pair of Laws:

First Law Of Program Optimization:
Don't do it.

Second Law Of Program Optimization (for experts only):
Don't do it yet.

Ponder the message, and ignore it at your peril.
 
P

pembed2003

[snip]
Donald Knuth wrote that "Premature optimization is the
root of all evil." Michael Jackson expressed a similar idea
in the form of a pair of Laws:

First Law Of Program Optimization:
Don't do it.

Second Law Of Program Optimization (for experts only):
Don't do it yet.

Ponder the message, and ignore it at your peril.

Hi Eric,
Thanks for your time to educate me and teach me a few lessons.
Obviously, you have a lot more professional programming experience
(three decades) than I have (not even 3 years yet) so I value your
suggestions. I think one of the thing I forget to mention (which I
should have) in my previous replies is that I am trying to learn the
algorithm. Sometimes when I encounter an API from the language, I
always wonder how the function is implemented. I will ask myself the
question: "If I were to implement the same function, how would I go
about doing it?" So I tried to come up with my own implementation as
an exercise but since I have so little professional programming
experience (compare to yours for example), I questioned my solution
and tried to seek out what the other (hopefully better) programmer to
see what they might do. Thanks!
 
K

kal

... the function now become:

... the function now becomes:
char* search_and_replace(char* source,char search,char* replace){

char *search_and_replace(const char *source, const char search,
const char *replace){
char* result;

char *result, *p, *q, *s, c;
size_t l = strlen(source), r = strlen(replace), i, k;
int number_of_replaces = 0;

size_t l = 1, r = strlen(replace);
for(i = 0; i < l; i++){
if(source == search)
number_of_replaces++;
}


for(p=source; c=*(p++); ){
if (c==search) l += r; else l++;
}
result = malloc((l - number_of_replaces) +
number_of_replaces * r + 1);

if ((result = malloc(l)) == NULL)
return result;
i = 0; k = 0;
while(k < l){
if(source[k] == search){
int j;
for(j = 0; j < r; j++){
result[i++] = replace[j];
}
}else{
result[i++] = source[k];
}
k++;
}

p=source, q=result;
while (c = *(p++)) {
if (c == search)
for (s=replace; *q = *(s++); q++);
else *(q++) = c;
}
*q = 0;
result = 0;
return result;


return result;

}
Can you tell me how I can improve the function in terms of speed?

Run it on a faster computer or on a computer that is
at a higher elevation than you are?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,772
Messages
2,569,593
Members
45,111
Latest member
VetaMcRae
Top