Segmentation fault when using strtok

lancer6238

Hi all,

I'm trying to write a program that will read in a text file and break
each line into 2 parts.

The file is in the following format:

123.123.123.1,12.12.12.1
23.23.23.23,34.34.34.34
....

Each line contains 2 IPv4 addresses separated by a comma, so I'm
trying to use strtok to extract the 2 IP addresses in each line.

Here's my code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
    FILE *fin;
    char tbuf[40], *src, *dst;
    int i = 1;

    src = malloc(20);
    if (!src)
    {
        printf("Cannot malloc src!\n");
        exit(1);
    }

    dst = malloc(20);
    if (!dst)
    {
        printf("Cannot malloc dst!\n");
        exit(1);
    }

    fin = fopen("text.txt", "r");
    if (!fin)
    {
        printf("Cannot open file!\n");
        exit(1);
    }

    while (fgets(tbuf, 40, fin) != NULL)
    {
        printf("%d fgets %d = %s-\n", i++, strlen(tbuf), tbuf);
        src = strtok(tbuf, ",");
        dst = strtok(NULL, ",");
        printf("d%d %s-", strlen(dst), dst); <-- @@@
        memset(tbuf, 0, 40);
    }
    fclose(fin);
    return 0;
}

The printf statements are to help me check if the src and dst strings
are correct.

I get the output (there are 768 lines of IP addresses in the text
file, I'm just showing output from the last few lines):

-759 fgets 27 = 56.56.567.56,56.56.56.568
-
d14 56.56.56.568
-760 fgets 26 = 12.12.123.12,12.12.12.12
-
d13 12.12.12.12
14.123.14.14,-=12.12.123.13,23.23.23.234
14.123.14.14-762 fgets 14 = 34.34.34.345
-
Segmentation fault

So it appears that strlen(dst) also includes \n and \0. Line 761 gets
partially "eaten" (12.12.123.13,23.23.23.234 is the correct 761st
line); 14.123.14.14 is the first address in Line 762, and 34.34.34.345
is the second address in Line 762.

I ran valgrind on the program, and got the error

Invalid read of size 1
at 0x4007F6: main (create_filter.c:40)
Address 0x0 is not stack'd, malloc'd or (recently) free'd

Process terminating with default action of signal 11 (SIGSEGV)
Access not within mapped region at address 0x0
at 0x4007F6: main (create_filter.c:40)

Line 40 is the line denoted by @@@ above. I don't understand the error
as I did malloc dst. Why am I getting this error?

This problem so far has only happened on one input file. On other
files, the correct output is obtained, and strlen(dst) only includes
\n.

I also cannot free src and dst: doing so gives a segmentation fault,
even for files that did not cause the seg fault error above. Why is that?

Thank you.
 
Keith Thompson

Rayne said:
There are only 2 arguments, the first "d" is to be printed as "d".

But "%d" requires an int argument, and you're giving it a size_t.
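
As a minimal sketch of the fix (assuming a C99 or later compiler), the
size_t returned by strlen() can be printed with "%zu". The helper name
format_len is hypothetical and exists only for this illustration:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: builds "d<len> <text>-" as the original printf
   intended, but uses %zu, the C99 conversion for size_t. */
int format_len(char *out, size_t outsize, const char *text)
{
    return snprintf(out, outsize, "d%zu %s-", strlen(text), text);
}
```

On a pre-C99 implementation, casting works instead:
printf("%lu", (unsigned long)strlen(text)).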
 
Beej Jorgensen

123.123.123.1,12.12.12.1
23.23.23.23,34.34.34.34

while (fgets(tbuf, 40, fin) != NULL)
{
    printf("%d fgets %d = %s-\n", i++, strlen(tbuf), tbuf);
    src = strtok(tbuf, ",");
    dst = strtok(NULL, ",");
    printf("d%d %s-", strlen(dst), dst); <-- @@@
    memset(tbuf, 0, 40);
}

-759 fgets 27 = 56.56.567.56,56.56.56.568
-
d14 56.56.56.568
-760 fgets 26 = 12.12.123.12,12.12.12.12
-
d13 12.12.12.12
14.123.14.14,-=12.12.123.13,23.23.23.234 [[ A ]]
14.123.14.14-762 fgets 14 = 34.34.34.345 [[ B ]]
-
Segmentation fault

Phew--what a mess. It's difficult to see how the above loop could even
make that output, but I think I figured it out.

Your strlen()s are returning one more than you think because the strings
contain a "\r\n" at the end. Except line 761, which just has a '\r' at
the end. The bare return is causing some of your output to be
overwritten--try redirecting it to a file and viewing the output with an
editor.
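
To make the effect of the trailing "\r\n" on strlen() concrete, here is
a small sketch that cuts a string at its first '\r' or '\n'. The name
chomp is hypothetical, not something from the original program:

```c
#include <string.h>

/* Hypothetical helper: truncates s at the first '\r' or '\n', so
   "a,b\r\n", "a,b\n" and "a,b\r" all become "a,b".
   Returns the new length. */
size_t chomp(char *s)
{
    size_t n = strcspn(s, "\r\n");  /* index of first CR or LF, or strlen(s) */
    s[n] = '\0';
    return n;
}
```

For "12.12.12.12\r\n", strlen() is 13 before the call and 11 after it.
Note that a bare '\r' in the middle of a buffer would cut off everything
after it, so this only tidies the end of a well-formed line.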

The output I think we're seeing on the lines I've marked A and B is
(I've put xx and yy for the lengths because I'm lazy):

761 fgets xx = 12.12.123.13,23.23.23.234\r14.123.14.14,-\n
dyy 23.23.23.234\r14.123.14.14-762 fgets 14 = 34.34.34.345\n

As for program flow:

Line 760:
tbuf = "12.12.123.12,12.12.12.12\r\n";
src = "12.12.123.12";
dst = "12.12.12.12\r\n";

Line 761:
tbuf = "12.12.123.13,23.23.23.234\r14.123.14.14,"; // 40 char max!
src = "12.12.123.13";
dst = "23.23.23.234\r14.123.14.14";

Line 762:
tbuf = "34.34.34.345\r\n"; // rest of line 762
src = "34.34.34.345\r\n";
dst = NULL;
strlen(dst) segfault
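
The walk-through above can be guarded against with a NULL check on each
strtok() result. This is only a sketch: split_pair is a hypothetical
helper, and the delimiter set is a parameter so it can be tried with ","
or with a set that also contains '\r' and '\n':

```c
#include <string.h>

/* Hypothetical helper: splits line in place into two tokens.
   Returns 0 if both tokens were found, -1 otherwise, so the caller
   never passes a NULL pointer to strlen() or printf("%s"). */
int split_pair(char *line, const char *delims, char **src, char **dst)
{
    *src = strtok(line, delims);
    *dst = (*src != NULL) ? strtok(NULL, delims) : NULL;
    return (*src != NULL && *dst != NULL) ? 0 : -1;
}
```

With "," as the delimiter, the buffer "34.34.34.345\r\n" yields a NULL
second token, and the function returns -1 instead of crashing later.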

That's my guess! Go to line 761 and make sure there's a proper newline
at the end of it.

For the future, I'd say 1) redirect your output, because you can't
necessarily trust what you see on the terminal. 2) Add \r\n to your
strtok delimiter string so they get stripped out before you print the
tokens. 3) Use an editor that shows newlines in various ways.

And in this program, another option is to code it so it can handle
various newline types, or to preprocess the input so that it's sane. It
might even be a requirement, depending on how the input is made.
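
One possible way to code for various newline types is to read the line
character by character instead of relying on fgets(). The sketch below
assumes that "\n", "\r\n" and a bare "\r" should all end a line;
read_line_any is a hypothetical name:

```c
#include <stdio.h>

/* Hypothetical helper: reads one line from fp into buf (size must be
   at least 1), accepting "\n", "\r\n" or a bare "\r" as terminator.
   The terminator is not stored; characters that do not fit are dropped.
   Returns 1 if a line was read, 0 at end of file. */
int read_line_any(FILE *fp, char *buf, size_t size)
{
    size_t n = 0;
    int c, got = 0;

    while ((c = fgetc(fp)) != EOF) {
        got = 1;
        if (c == '\n')
            break;
        if (c == '\r') {
            int next = fgetc(fp);       /* swallow the '\n' of "\r\n" */
            if (next != '\n' && next != EOF)
                ungetc(next, fp);       /* bare '\r': keep next char */
            break;
        }
        if (n + 1 < size)
            buf[n++] = (char)c;
    }
    buf[n] = '\0';
    return got;
}
```

With a reader like this, a line that ends in a lone '\r' should come
back as its own record rather than being merged with the next line.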

-Beej
 
