How to extract a string starting with 'abc' and ending with 'xyz'?

Umesh

I modified the program in this way for my understanding. It works, but it
prints "(null)" for every line in which it fails to find abc*xyz. What
should I do to stop that?
// find a string starting with abc and ending with xyz

#define SIZE 1000
#include <stdio.h>
#include <string.h>
int main(void)
{
int status;
FILE *infp, *outfp;
char buf[SIZE+1];
char *abc, *xyz;
infp = fopen("c:/1.txt", "r");

outfp = fopen("c:/2.txt", "w");

while (fgets(buf,SIZE,infp))
{
abc = strstr(buf, "abc");
if (abc != NULL)
xyz = strstr(abc + 3, "xyz");
if (xyz != NULL)
fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);



}


return 0;

}
 
Eric Sosman

Umesh wrote on 05/29/07 17:45:
I modified the program in this way for my understanding. It works but
displays "(null)" in every line it fails to find out abc*xyz. What
should i do to stop that?
// find a string starting with abc and ending with xyz

#define SIZE 1000
#include <stdio.h>
#include <string.h>
int main(void)
{
int status;
FILE *infp, *outfp;
char buf[SIZE+1];
char *abc, *xyz;
infp = fopen("c:/1.txt", "r");

outfp = fopen("c:/2.txt", "w");

while (fgets(buf,SIZE,infp))
{
abc = strstr(buf, "abc");
if (abc != NULL)
xyz = strstr(abc + 3, "xyz");
if (xyz != NULL)
fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);



}


return 0;

}

What you should do is learn what { and } do, and
how they control what an `if' governs.

You might also want to brush up on what fgets()
does: What you have is all right, but suggests that
you don't know why.

Also, there will be trouble if either of the fopen()
calls should fail. Obey the Sixth Commandment!

http://www.lysator.liu.se/c/ten-commandments.html
 
Keith Thompson

Umesh said:
I modified the program in this way for my understanding. It works but
displays "(null)" in every line it fails to find out abc*xyz. What
should i do to stop that?
// find a string starting with abc and ending with xyz

#define SIZE 1000
#include <stdio.h>
#include <string.h>
int main(void)
{
int status;
FILE *infp, *outfp;
char buf[SIZE+1];
char *abc, *xyz;
infp = fopen("c:/1.txt", "r");

outfp = fopen("c:/2.txt", "w");

while (fgets(buf,SIZE,infp))
{
abc = strstr(buf, "abc");
if (abc != NULL)
xyz = strstr(abc + 3, "xyz");
if (xyz != NULL)
fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);



}


return 0;

}

You posted the same article twice. I think Google Groups is having
some sort of problem that causes this kind of error. Please complain
to them.

The compiler ignores indentation; it's used only to make the code
clearer to a human reader, but if the indentation doesn't match the
actual structure of the code, it just causes confusion.

Consider the statements within the body of the while loop:

    abc = strstr(buf, "abc");
    if (abc != NULL)
        xyz = strstr(abc + 3, "xyz");
        if (xyz != NULL)
            fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);

As far as the compiler is concerned, this is exactly equivalent to this:

abc = strstr(buf, "abc");
if (abc != NULL)
xyz = strstr(abc + 3, "xyz");
if (xyz != NULL)
fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);

which, if it's indented *properly*, looks like this:

    abc = strstr(buf, "abc");
    if (abc != NULL)
        xyz = strstr(abc + 3, "xyz");
    if (xyz != NULL)
        fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);

With proper indentation, you can see that if abc is equal to NULL, and
xyz is not equal to NULL, you execute the fprintf statement. Passing
a null pointer to fprintf for a "%s" format invokes undefined
behavior; in your implementation, it happens to print "(null)", but it
could do anything.

What you *probably* want is something like this:

    abc = strstr(buf, "abc");
    if (abc != NULL)
    {
        xyz = strstr(abc + 3, "xyz");
        if (xyz != NULL)
            fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);
    }

but I haven't studied your program's logic closely enough to be sure.

To avoid problems like this, you should consider using something that
formats and indents your code for you. The "indent" tool, if you have
it, is pretty good; I don't use it much myself, but the "-kr" option
gives reasonable output. There are also editors that will format your
code for you as you type it.

Also, I recommend *always* using braces for control structures (if,
while, for), even when they just control a single statement. For
example, rather than this:

    if (abc != NULL)
        xyz = strstr(abc + 3, "xyz");

I'd write this:

    if (abc != NULL) {
        xyz = strstr(abc + 3, "xyz");
    }

(I picked up this habit from Perl, which requires the braces; C
doesn't, but I still find it helpful, especially if I want to add a
second statement.)

Plenty of knowledgeable people are going to disagree with this advice;
you'll have to decide for yourself whether to follow it. But you need
to do *something* to make sure that your indentation matches the
actual logic of your program.
 
Umesh

So the simplified program, for my understanding, is as follows.
But it returns words/strings containing '.' (full stop), ',' (comma),
etc., which is not intended at all. How can I prevent it from doing so?

// find a string starting with abc and ending with xyz WORKING

#define SIZE 1000
#include <stdio.h>
#include <string.h>
int main(void)
{
    int status;
    FILE *infp, *outfp;
    char buf[SIZE+1];
    char *abc, *xyz;
    infp = fopen("c:/1.txt", "r");

    outfp = fopen("c:/2.txt", "w");

    while (fgets(buf,SIZE,infp))
    {
        abc = strstr(buf, "abc");
        if (abc != NULL)
        {
            xyz = strstr(abc + 3, "xyz");
            if (xyz != NULL)
                fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);
        }
    }

    return 0;
}
 
Umesh

To be honest, I didn't understand the program logic, especially the
meaning of this line:

    if (xyz != NULL)
        fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);

 
Steve Thompson

I actually want to find words starting with a and ending with b in a
text file and put the output in a file. So there will be no spaces
between the words.

All of these text searches you have asked about can be solved
by a simple technique called a "state machine".

set state to 0
while (inchar = getc()) != EOF { /* beginning of a line */
if machinestate is 0 { [snip]
All is vanity. -- Ecclesiastes

That's some mighty fine help you've got there. Unfortunately, a state
machine in this instance is a little more like a solution looking for a
problem. I'd suggest you have your man advance the ignition timing about
.3 metric smidgeons before he makes another post like this.

If the OP is using something UNIX-like, a better solution uses strstr() or
strchr() (or index()) twice. Of course there are special cases where a
custom algorithm is more appropriate, but the apparent sophistication of
the original questioner suggests that the generic C library solution is
best.


Regards,

Steve
 
Steve Thompson

Yes. But the OP still doesn't understand something as fundamental as
detecting EOF.

That's so understandable: EOF is such a subjective thing. Who among us
is so wise as to be able to say where a file begins and where it ends?

When you factor in the dynamic nature of 'files' in a complex multi-user
system, it is often difficult to be assured -- as a programmer -- that the
file one is reading and writing is actually a static object. In my
experience, the file must be considered something of a moving target and
the software written to account for this fact.
This despite the fact that he has been posting since December 2006.

Yet after all this time you still listen and then post replies.


Regards,

Steve
 
Umesh

How to modify it so that it stops searching for strings containing
'uvw'?
// find a string starting with abc and ending with xyz WORKING

#define SIZE 1000
#include <stdio.h>
#include <string.h>
int main(void)
{
    int status;
    FILE *infp, *outfp;
    char buf[SIZE+1];
    char *abc, *xyz;
    infp = fopen("c:/1.txt", "r");

    outfp = fopen("c:/2.txt", "w");

    while (fgets(buf,SIZE,infp))
    {
        abc = strstr(buf, "abc");
        if (abc != NULL)
        {
            xyz = strstr(abc + 3, "xyz");
            if (xyz != NULL)
                fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);
        }
    }

    return 0;
}
 
Barry Schwarz

How to modify it so that it stops searching for strings containing
'uvw'?

snip 150 lines

While not wishing to respond to the troll, I think it appropriate to
warn new students of the language not to use his code as an example of
working code or good programming practice or even usenet etiquette.


Remove del for email
 
Walter Roberson

// find a string starting with abc and ending with xyz WORKING
    abc = strstr(buf, "abc");
    if (abc != NULL)
    {
        xyz = strstr(abc + 3, "xyz");
        if (xyz != NULL)
            fprintf (outfp,"%.*s\n",(int)(xyz + 3 - abc), abc);
    }

That doesn't find strings beginning with abc and ending with xyz:
that finds strings beginning with abc and having xyz in them anywhere,
and prints out the portion from the abc to the xyz.

If the goal is to extract from abc up to the first xyz, then the
phrase to use is "find substrings starting with abc and ending with xyz".
When you say that you want to "find strings [...] ending with xyz", then
the implication is that you are examining a complete string and the
last thing in that string is xyz.
 
