segmentation fault in strcmp()

mohangupta13

Hello all,

I am writing a program to parse HTML code. It reads words (separated
by white space) one at a time and implements a state machine
driven by the tags it encounters.

After reading about 12,000 words the program crashes, and gdb reports a
segmentation fault in strcmp() somewhere in the code.

What could cause strcmp() to produce a segmentation fault? Or is the
fault perhaps somewhere else and being erroneously reported as strcmp()
by gdb? (It's very difficult for me to debug, as it occurs only after
about 12,000 iterations.)

I have used strcmp() about 10 times in about 1,000 lines of code, and
I am not even able to identify the vulnerable portion, so I can't even
post the entire code...


All help is appreciated.

Mohan Gupta
 
Ben Pfaff

mohangupta13 said:
I am writing a program to parse HTML code, it reads words (separated
by white spaces) one at a time and implements a state machine
depending upon the occurrence of various tags.

Now after reading about 12,000 words the program crashes and reports a
segmentation fault in strcmp() function some where in the code.

Almost certainly you are writing beyond the boundaries of a
section of memory. Try a memory debugger such as Valgrind or
Electric Fence, if one is available for your platform.
 
Beej Jorgensen

mohangupta13 said:
Now after reading about 12,000 words the program crashes and reports a
segmentation fault in strcmp() function some where in the code.

This really should be enough to figure out exactly which strcmp() is
crashing and why. I don't know how hard it is to repro the crash, but
if you can, gdb can tell you what's going on. You didn't say if you
were using vanilla gdb or some other front end, but here's what you can
do from the command line:

(Forgive me if you've already done some or all of this.)

Make sure you've compiled with -g (assuming gcc--use whatever it takes
to compile in symbolic debugging info for your particular compiler.)

Start gdb with:

gdb foobar

(where foobar is your binary name)

At the gdb prompt type "r" to run the program until it crashes. (Put
any necessary command line arguments after the "r".)

At the gdb prompt type "backtrace". This will tell you exactly where
strcmp() was called from.

-OR-

If a core file was generated by a previous crash, it becomes even
easier. Start gdb with:

gdb -c core foobar

and type "backtrace".

(If you're not getting core dumps and you want to (because maybe the
crashes are really infrequent and you can't always run the code in a
debugger), make sure you're not limiting core file size to 0. Try
"ulimit -c unlimited" at the shell prompt before you run--but the exact
command probably varies by shell.)



At this point, you might be curious why it crashed, but you're down in
strcmp()'s stack frame. Here's an example that shows how to check out
what was passed in. (I'm using strcpy() instead of strcmp() because it
was easier for me to coax a crash out of it for this demo.)

Program received signal SIGSEGV, Segmentation fault.
0xb7ea7247 in memcpy () from /lib/libc.so.6
(gdb) backtrace
#0 0xb7ea7247 in memcpy () from /lib/libc.so.6
#1 0xb7fc3e29 in memcpy () from /lib/libsafe.so.2
#2 0x080483ae in main () at foo.c:8
(gdb)

Looks like it crashed in main() in foo.c line 8. But I'm down in
memcpy(). I can use the command "up" to back up through stack frames,
like so:

(gdb) up
#1 0xb7fc3e29 in memcpy () from /lib/libsafe.so.2
(gdb) up
#2 0x080483ae in main () at foo.c:8
8 strcpy(p, "hello");
(gdb)

At this point, normal variable printing and stuff will work because
we're looking at the right frame:

(gdb) print p
$1 = 0x0

In my case p was NULL and that was causing badness.

(Note that "up" (and "down") don't perform any execution--they just
allow you to navigate the stack.)

Finally, if you're running gdb from the command line, don't forget that
the -tui switch, which brings up the terminal UI, is relatively awesome.

mohangupta13 said:
so what may be the causes for strcmp to cause segmentation fault

Non-terminated strings or invalid pointers passed in are the only things
I can think of (discounting some kind of freaktacular memory corruption
or something.)

Did you compile with the highest warning level set (-Wall for gcc)?
Sometimes that can catch things that cause trouble like this.

-Beej
 
James Dow Allen

... the program crashes and reports a
segmentation fault in strcmp() function

Assuming 10-character passwords and 26 different characters,
it may take up to 26^10 guesses to guess a password.
Using strcmp() segmentation fault, VMS hackers reduced
guess maximum to 26*10. (Details left as an exercise,
along with the trivial fix needed for the VMS password checker.)

Oh! You were getting strcmp() seg-faults you *didn't want*.
Never mind.

James Dow Allen

PS: I'm posting from Save the Dodoes Foundation today,
instead of Google Groups because the wonderful Google
programmers seem to have added a New Wonderful Feature:
It now seems *impossible* to switch Google accounts without
rebooting! (I suppose eating all the cookies would also
work, but I need to save room for wife's cooking.)
 
James Kuyper

mohangupta13 said:
hello all,

I am writing a program to parse HTML code, it reads words (separated
by white spaces) one at a time and implements a state machine
depending upon the occurrence of various tags.

Now after reading about 12,000 words the program crashes and reports a
segmentation fault in strcmp() function some where in the code.

so what may be the causes for strcmp to cause segmentation fault , or
what do you think, is it really strcmp or some place else and is
erroneously being reported as strcmp by gdb??? (its very difficult
for me to debug it, as it occurs after about 12000 iterations)

The problem is almost certainly occurring in strcmp(), but it's almost
equally certain that the problem is occurring because of a defect in
your code, not in the implementation of strcmp(). There are two main ways
this can happen: the simplest is that your code passes an invalid
address to strcmp(). The second is that it passes a valid address, but
that the block of memory that it points at doesn't contain a null
character, in which case strcmp() may continue reading past the end of
the block, and wander into memory your process doesn't have permission to
look at.
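
To illustrate the second failure mode: a buffer filled by strncpy() is left without a terminator whenever the source is at least as long as the buffer, and a later strcmp() on it can run off the end. Below is a minimal sketch of a copy helper that always terminates, assuming words are stored in fixed-size buffers. The name copy_word and its return convention are hypothetical, not taken from the original program.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the thread): copy src into dst, which has
 * room for dst_size bytes, and always leave dst null-terminated.
 * Returns 1 if the whole string fit, 0 if it was truncated or an
 * argument was unusable. */
int copy_word(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || dst_size == 0)
        return 0;
    if (src == NULL) {
        dst[0] = '\0';              /* still hand back a valid empty string */
        return 0;
    }
    size_t len = strlen(src);
    size_t n = len < dst_size - 1 ? len : dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';                  /* the step strncpy() skips on overflow */
    return len < dst_size;
}
```

A caller would write something like copy_word(buf, sizeof buf, token) and treat a 0 result as "word too long" instead of comparing a possibly unterminated buffer.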

Note: the reason why your code is misusing strcmp() might not be clear
from examining the part of your code that performs the actual calls to
strcmp(). The defect that renders the pointer invalid or the terminating
null character missing might be in some entirely different part of your
code; the defect might consist of the fact that code you should have
written is missing, in which case it can often be impossible to say
which part of your program it is missing from.

mohangupta13 said:
i have used strcmp about 10 times in the code of about 1000 lines and
I am not even able to recognize the vulnerable portion so can't even
post the entire code...

Without the entire program, we can only deal in broad generalities, of
very limited usefulness. While a debugger or a leak checker may be very
useful for diagnosing a problem like this, you may end up having to take
a brute-force approach, using a copy of the code that's failing (don't
do this to your only copy of it!).

1. Remove some portion of the program that you don't think is relevant
to the problem; modify associated code as needed to make the removal clean.
2. Test to determine whether you can still reproduce the problem.
3. If you cannot reproduce the problem, reverse the changes made in step
1 (this means you should keep a copy of the code from before the
removal), and choose a different part of the program to remove.
Otherwise, keep those changes and choose a new part of the program to remove.

Repeat these steps until either a) you can't find anything more to remove
or b) you figure out what the problem is. You'll be surprised at how
often option b) comes up. If, however, you end up with option a), then
your program will probably be a lot shorter than it originally was, and
there should be no problem posting the complete, unaltered code to this
newsgroup, and we'll be glad to take a look at it and try to figure out
what the problem is.
 
BartC

mohangupta13 said:
hello all,

I am writing a program to parse HTML code, it reads words (separated
by white spaces) one at a time and implements a state machine
depending upon the occurrence of various tags.

Now after reading about 12,000 words the program crashes and reports a
segmentation fault in strcmp() function some where in the code.

so what may be the causes for strcmp to cause segmentation fault , or
what do you think, is it really strcmp or some place else and is
erroneously being reported as strcmp by gdb??? (its very difficult
for me to debug it, as it occurs after about 12000 iterations)

Put in printf statements so that you know whereabouts in the code it might
be going wrong (sometimes putting in debug statements will make the error
disappear, but you might be lucky).

12000 lots of output isn't a lot on a fast scrolling display, but you can
count iterations and only printf when you've reached 12000.

(I don't know gdb but I guess you might also be able to use this iteration
count as a breakpoint trigger.)
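
The counting idea could be sketched roughly as follows. Everything here is hypothetical: trace_word is an invented name, and TRACE_FROM is an assumed threshold set a little below the iteration where the crash was seen. Output goes to stderr, which is normally unbuffered, so the last line should still appear when the program dies.

```c
#include <stdio.h>

/* Assumed threshold: start reporting shortly before the crash point. */
#define TRACE_FROM 11990UL

static unsigned long word_count = 0;

/* Call once per word read. Stays silent until the count reaches
 * TRACE_FROM, then reports every word. Returns 1 if it printed,
 * 0 otherwise. */
int trace_word(const char *word)
{
    word_count++;
    if (word_count < TRACE_FROM)
        return 0;
    fprintf(stderr, "word %lu: \"%s\"\n", word_count,
            word != NULL ? word : "(null)");
    return 1;
}
```

Calling trace_word(word) at the top of the main reading loop shows the last few words handled before the fault, which narrows down which state of the state machine was active.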

For strcmp to crash would be unusual, unless there was something wrong with
the inputs: null inputs or pointing to something not a string (so continues
reading beyond valid memory). If this is a possibility, use a wrapper
function around strcmp (say, debugstrcmp()), and call that instead. The
wrapper can check the inputs (not null and in expected range) and also check
the string lengths are reasonable (say not more than 100 characters), and if
not report an error. You might also add an extra parameter to identify where
it's called from (sorry this is from someone who's never used a debugger so
perhaps it can already do all this...)

It's also possible something else is corrupting the code for strcmp
(although that could be trapped on your hardware)... that's why pinpointing
this sort of error is a lot of fun.
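
A rough sketch of such a wrapper, under stated assumptions: the names debug_strcmp and looks_like_string are invented here, the 100-character limit is the figure suggested above, and aborting on a bad argument is one possible choice (it leaves a core file or stops in the debugger at the offending call). The length scan is only a heuristic, since a truly dangling pointer can fault during the scan itself.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed upper bound on a sane word length. */
#define MAX_WORD_LEN 100

/* Returns 1 if s is non-NULL and a terminator is found within the first
 * MAX_WORD_LEN + 1 bytes, 0 otherwise. Stops at the first '\0'. */
static int looks_like_string(const char *s)
{
    if (s == NULL)
        return 0;
    for (size_t i = 0; i <= MAX_WORD_LEN; i++)
        if (s[i] == '\0')
            return 1;
    return 0;
}

/* Drop-in replacement for strcmp() with an extra argument naming the
 * call site. Reports and aborts if either argument looks wrong. */
int debug_strcmp(const char *a, const char *b, const char *where)
{
    if (!looks_like_string(a) || !looks_like_string(b)) {
        fprintf(stderr, "debug_strcmp: suspicious argument at %s "
                        "(a=%p, b=%p)\n", where, (void *)a, (void *)b);
        abort();
    }
    return strcmp(a, b);
}
```

Each of the ten or so call sites would then pass its own label, for example debug_strcmp(word, "<body>", "tag dispatch").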
 
Doug Miller

mohangupta13 said:



In KNode, do it like this:

Select Article/Post to Newsgroup...
Type the basics of your question. Then, when it gets to the point
where you need to post the entire code:

For each source or header file in the project, type the name of the
file as a clue to your readers, then do this:
Attach/Insert File...

Ummmm..... most newsservers will strip attachments from posts to non-binary
groups....
 
Joachim Schmitz

Lorenzo Beretta wrote:
2) use tools to help you

As for point #2, I'm surprised that nobody hinted at valgrind - sure,

Wrong, Ben did. He was the first to respond too.
 
mohangupta13


Thanks for a lot of great advice.
I must confess that I really messed up my program a bit. I built it
incrementally over about two weeks without much planning, so it's a bit
of a mess.
When I used backtrace in gdb, I located the source of the problem and
tinkered with it, and now a different problem arises.

Now I get an error like ****** free:invalid next size
(fast) .....********
at a different place. I am quite sure that the object being
freed at the point where this error occurs was allocated by
malloc and has not been freed twice, as I can use gdb to print its
various fields through the pointer in question (after the crash, using
backtrace).

Now can anyone please clear up a few doubts of mine:

1. What is the meaning of an error like invalid next size(fast) /
invalid next size(normal), etc.?

2. Is it really occurring because of the call to free that I see
in the backtrace, or might the actual cause lie much earlier,
somewhere else, and only show its side effects here?


I did google a lot, but nowhere did I find a good explanation.

Thanks a lot
Mohan Gupta
 
James Kuyper

mohangupta13 wrote:
....
now i get an error like ****** free:invalid next size
(fast) .....********
at some different place. Though I am quite sure that the object being
freed at the point where this error occurs is surely allocated by
malloc and is not doubly freed ....as i can use gdb to print the
various fields using the pointer in question..(.after the crash using
backtrace)

Now can anyone please clear few doubts of mine:

1. What is the meaning of such an error like invalid next size(fast) /
invalid next size(normal) etc etc..

It means that the free() is trying to figure out the size of the next
block of memory in the heap; information it needs in order to complete
the process of freeing the memory you've asked it to release.
Unfortunately, the piece of memory where it is looking for that
information contains an invalid value. Since the place where it looks is
determined in part by the pointer that you pass to free(), one
possibility is that you're passing the wrong pointer to free(); another
possibility is that the memory where that information was stored has
become corrupted. There are several ways in which this can happen:

If char* p=malloc(N), executing an expression like "p[i] = expression"
could cause such a problem if i<0 || i>=N.

If p is not a pointer returned by a call to malloc() (or calloc() or
realloc()), then free(p) could cause this problem.

If p used to point into a block of memory allocated by malloc(),
calloc(), or realloc(), but that memory has since been free()d, then
free(p) could cause this problem. So could "p[i] = expression",
regardless of the value of i.

mohangupta13 wrote:
2. Is it really occurring because of the call to free which i get
using backtrace or the actual cause may have been long bypassed
somewhere else and it ends up showing its side effects here.

While the problem is being detected in your call to free(), the actual
defect that caused the problem may have occurred long before the call to
free(), in some completely unrelated part of your program. This is what
makes problems like this so hard to track down.
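
One very common instance of a defect that surfaces only later, shown as a sketch and not as a diagnosis of the original program: allocating strlen() bytes for a string copy instead of strlen() + 1. The helper name copy_string is hypothetical. The code below is the correct version; the comment marks where the off-by-one variant would go wrong.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return a heap copy of word, or NULL on failure.
 * The caller is responsible for free()ing the result. */
char *copy_string(const char *word)
{
    if (word == NULL)
        return NULL;
    size_t len = strlen(word);
    /* Buggy variant: malloc(len). The copy below would then write the
     * terminating '\0' one byte past the block, which may overwrite the
     * allocator's bookkeeping for the neighbouring block. Nothing fails
     * at that moment; a later free() may be what notices the damage. */
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, word, len + 1);    /* len characters plus the '\0' */
    return copy;
}
```

Reviewing every malloc() whose size comes from strlen() is a cheap first pass when free() starts reporting heap damage.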
 
Moi

In addition to other suggestions you might also replace strcmp with a
my_strcmp that is instrumented to check on the input addresses and the
input lengths.



Richard Harter, (e-mail address removed)
http://home.tiac.net/~cri, http://www.varinoma.com If I do not see as
far as others, it is because I stand in the footprints of giants.

Exactly.
And maybe make it a macro, which would give you access to __FILE__ and
__LINE__ , which would make printf-debugging more usable.


HTH,
AvK
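
A minimal sketch of the macro idea, with invented names (CHECKED_STRCMP and checked_strcmp_at). It assumes that reporting a NULL argument and carrying on, with NULL ordered before any string, is acceptable during debugging; aborting instead would be an equally reasonable choice.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical checker, meant to be called only through the macro below,
 * which supplies the caller's file name and line number. */
static int checked_strcmp_at(const char *a, const char *b,
                             const char *file, int line)
{
    if (a == NULL || b == NULL) {
        fprintf(stderr, "%s:%d: strcmp called with NULL (a=%p, b=%p)\n",
                file, line, (void *)a, (void *)b);
        /* Assumed policy: NULL equals NULL and sorts before any string. */
        return a == b ? 0 : (a == NULL ? -1 : 1);
    }
    return strcmp(a, b);
}

/* __FILE__ and __LINE__ expand at each call site, so every report
 * identifies which of the strcmp() calls was involved. */
#define CHECKED_STRCMP(a, b) \
    checked_strcmp_at((a), (b), __FILE__, __LINE__)
```

Replacing strcmp(x, y) with CHECKED_STRCMP(x, y) throughout the source needs no other change at the call sites.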
 
mohangupta13

Well, that's really a good idea. For now the problem with strcmp() is
partly solved, but I am stuck with the problem with free().

Moi said:
Exactly.
And maybe make it a macro, which would give you access to __FILE__ and
__LINE__ , which would make printf-debugging more usable.

HTH,
AvK

I believe the real problem might be that somewhere during the run the
heap is being corrupted, maybe by a buffer overwrite (possibly by one
index beyond the space allocated to a string).
Is there a way (using gdb or something else) to stop the program
exactly where it makes such an error?
I used MALLOC_CHECK_=3 on Linux, but it still stopped only at the call to
free().

Thanks a lot!
Mohan Gupta
 
luserXtrog

I believe the real problem might be that some where in the run the
heap is being corrupted
by may be a buffer over write (possibly by one index beyond the
allocated space to a string).
Is there a way (using gdb or something else) to stop the program
exactly where it makes such an error.
I used MALLOC_CHECK_=3 in linux but it still stopped at the call to
free() only .

Perhaps others know tricks (and I hope they do), but in my experience
you really have to tear through the code yourself, manually,
symbol-by-symbol. It's like a mystery. Things to pay even closer attention to
are index calculations, and pointer logic. Knowing where the failure
occurs helps. You can start with the offending call to free and
skim backwards. Stop at every [ ] or * and really look at it.
Does it make sense? Does it really mean what you think it means?
Does it really do as it ought?

This is where consistent style pays off. If you're "in" the style,
you can begin to see "through" the punctuation; consequently the
scanning part is relatively easy. The hard part is applying the
proper scrutiny. Your brow should be furrowed, your eyes a little
squinty. You should feel a sort of tension in your teeth (but
don't grind them!). Invert the colors if you can (black->white &&
vice versa). If you use syntax highlighting, turn it off.
If you don't, turn it on. Crank some Shostakovich or Prokofiev
and pierce the depths baby!
 
Nick Keighley

Perhaps other know tricks (and I hope they do), but in my experience
you really have to tear through the code yourself, manually, symbol-by
-symbol.

You can wrap malloc() and free() so that a little bit of space is
allocated before and after a block and you fill that with a known
pattern. You can then run a consistency check on your heap.
Even better, people have already written such things. For instance,
Microsoft provides debug versions of malloc() and free().
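
A minimal sketch of the guard-byte technique, with all names (guard_malloc, guard_check, guard_free, guard_header) and the fill pattern being assumptions made for illustration. It does not track blocks in a list, so the consistency check is run per pointer. A large underflow could also damage the stored size itself, which this sketch does not detect.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_BYTE 0xA5   /* assumed fill pattern */

/* Block layout: [header][front guard][user bytes][back guard].
 * The union pads the header to the strictest common alignment, so the
 * pointer handed to the caller stays suitably aligned. */
typedef union {
    size_t size;          /* number of user bytes requested */
    long double ld;
    long long ll;
    void *ptr;
} guard_header;

#define GUARD_SIZE sizeof(guard_header)   /* guard bytes on each side */

void *guard_malloc(size_t n)
{
    size_t overhead = sizeof(guard_header) + 2 * GUARD_SIZE;
    if (n > SIZE_MAX - overhead)
        return NULL;
    unsigned char *raw = malloc(overhead + n);
    if (raw == NULL)
        return NULL;
    ((guard_header *)raw)->size = n;
    unsigned char *user = raw + sizeof(guard_header) + GUARD_SIZE;
    memset(user - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);
    memset(user + n, GUARD_BYTE, GUARD_SIZE);
    return user;
}

/* Returns 1 if both guards are intact, 0 if either was overwritten. */
int guard_check(const void *p)
{
    const unsigned char *user = p;
    const unsigned char *front = user - GUARD_SIZE;
    const guard_header *h =
        (const guard_header *)(front - sizeof(guard_header));
    const unsigned char *back = user + h->size;
    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (front[i] != GUARD_BYTE || back[i] != GUARD_BYTE)
            return 0;
    return 1;
}

/* Frees the block and returns the result of a final guard check. */
int guard_free(void *p)
{
    if (p == NULL)
        return 1;
    int ok = guard_check(p);
    free((unsigned char *)p - GUARD_SIZE - sizeof(guard_header));
    return ok;
}
```

Sprinkling guard_check(ptr) calls between suspect statements narrows down where the overwrite happens, since the first failing check comes shortly after the offending write.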

Tools like valgrind may help. Copious assert()s may help to
spot the OOB or deref of a null pointer.
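
As a small illustration of assert() catching an out-of-bounds write at the point where it would happen, here is a sketch of a word buffer with its invariants stated in code. The struct, the function name, and the capacity WORD_MAX are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define WORD_MAX 64   /* assumed capacity of the word buffer */

struct word_buf {
    char text[WORD_MAX];
    size_t len;          /* characters stored, excluding the '\0' */
};

/* Append one character, keeping text null-terminated. If an invariant
 * is violated, assert() aborts here and names this file and line,
 * before any memory outside the buffer is touched. */
void word_append(struct word_buf *w, char c)
{
    assert(w != NULL);
    assert(w->len < WORD_MAX - 1);   /* room for c and the terminator */
    w->text[w->len++] = c;
    w->text[w->len] = '\0';
}
```

Compiling with NDEBUG defined removes the checks, so they cost nothing in a release build.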

It's like a mystery. Things to pay even closer attention to
are index calculations, and pointer logic. Knowing where the failure
occurs helps. You can start with the offending call to free and
skim backwards. Stop at every [ ] or * and really look at it.

or use assert()s to make the machine do the work
Does it make sense? Does it really mean what you think it means?
Does it really do as it ought?

This is where consistent style pays off. If you're "in" the style,
you can begin to see "through" the punctuation; consequently the
scanning part is relatively easy. The hard part is applying the
proper scrutiny. Your brow should be furrowed, your eyes a little
squinty. You should feel a sort of tension in your teeth (but
don't grind them!). Invert the colors if you can (black->white &&
vice versa). If you use syntax highlighting, turn it off.
If you don't, turn it on. Crank some Shostakovich or Prokofiev
and pierce the depths baby!

automate automate automate

Machines should work, people should think
 
Old Wolf

Almost certainly you are writing beyond the boundaries of a
section of memory.  Try a memory debugger such as Valgrind or
Electric Fence, if one is available for your platform.

Is there any addon for GCC to do bounds checking?

For example, when I am using the Borland C compiler,
I can enable an option that will monitor all allocated
memory blocks and all array bounds etc. as the program
runs, and report an attempted overflow before it happens.
 
Ben Bacarisse

mohangupta13 said:
I believe the real problem might be that some where in the run the
heap is being corrupted
by may be a buffer over write (possibly by one index beyond the
allocated space to a string).
Is there a way (using gdb or something else) to stop the program
exactly where it makes such an error.

Are any of Ben Pfaff's suggestions available to you? valgrind,
electric fence (and I'll add the mudflap library) are all excellent
tools.
 
Nate Eldredge

Old Wolf said:
Is there any addon for GCC to do bounds checking?

For example, when I am using the Borland C compiler,
I can enable an option that will monitor all allocated
memory blocks and all array bounds etc. as the program
runs, and report an attempted overflow before it happens.

AFAIK, not in GCC per se, though Valgrind does essentially this.

GCC has some bounds checking machinery built in to its backend, but it
is apparently only supported for Java and Fortran.
 
Ben Bacarisse

Tor Rustad said:
http://gcc.gnu.org/gcc-4.3/changes.html

There is also the -fbounds-check, but that didn't seem to work for C
under gcc 4.2.4:

And from the 4.3.3 man page:

-fbounds-check
For front-ends that support it, generate additional code to check
that indices used to access arrays are within the declared range.
This is currently only supported by the Java and Fortran front-
ends, where this option defaults to true and false respectively.

so, no, not currently an option for C.
 
