Weird Issue

ncf

Ok, I've been trying to resolve this issue for some time now (~1 day
which is way longer than normal for me) to no avail.

I am reading a file into a list in memory, using a "%" delimited file
format (which allows for comments on lines beginning with "#"), and
some of the messages are not correctly copied.

I believe the problem may be somewhere around the strcpy() call, but am
not sure at all.

I hope somebody out there is able to help me, no matter how cruddy my
code may be. :\
-Wes



==================== MessagesFile.c ====================
#include <stdio.h>
#include <string.h>

#define BUFFLEN 1024

#define same(x,y) strcmp(x,y)==0

int num_messages = 0;
char messages[][BUFFLEN] = {};

char buffer[BUFFLEN];

void ProcessLine(char *s) {
printf("ProcessLine(\"%s\")\n",s);

if (!strlen(s)) { /* Empty line */
} else if (s[0]=='#') { /* Comments begin with '#' */
} else if (same(s,"%")) { /* '%' is a message divider */
if (!strlen(buffer)) {
return;
}
/* Add the message */
strcpy(messages[num_messages++], buffer);

/* Clear the buffer */
int i;
for (i=0; i<BUFFLEN; i++) {
buffer[i] = '\0';
}
} else { /* Else, message */
int true_len;
if (strlen(buffer)) {
true_len = (int)snprintf(buffer, sizeof(buffer), "%s\n%s", buffer,
s);
} else {
true_len = (int)snprintf(buffer, sizeof(buffer), "%s", s);
}
if (true_len > BUFFLEN) {
printf("Message %d truncated from %d bytes to %d bytes.",
num_messages+1, true_len, BUFFLEN);
}
}
}

int main()
{
FILE *myfile;

if ((myfile = fopen("messages.txt","r"))==NULL) {
printf("Sorry, but I failed to open the file for reading.");
return 1;
}

int pos = 0;
char c;
char line[BUFFLEN] = "";
while ( (c=fgetc(myfile)) != EOF) {
if (c=='\n' || c=='\r') {
ProcessLine(line);
for (pos=0; pos<BUFFLEN; pos++)
line[pos] = '\0';
pos = 0;
} else {
line[pos++] = c;
}
}
if (strlen(line)) {
ProcessLine(line); /* first we dump whatever remaining buffer we have
into the process function */
}
ProcessLine("%"); /* Then, for goodness sake, we make sure that
whatever buffer remaining is processed. */

fclose(myfile);

printf("\n\n========\nMessages Dump\n========\n");
int i;
for (i=0; i<num_messages; i++) {
printf("Message[%d]=%s\n", i, messages[i]);
}

printf("Last Char Val=%d/%c",EOF,EOF);

return(0);
}



==================== messages.txt (my test file) ====================
# comment!!!
meow mix
%
hi?
%
woot!
%
meow
%
Next message consists of exactly 1023 characters (leaving 1 for the
ending \0 :p)
%
aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabb
%
And this message is
a multi-line message.
Woot!
%
A simple message with no end delimiter



==================== Output of ``gcc MessageFile.c -Wall -o MessageFile
&& time ./MessageFile && echo $? ====================
ProcessLine("# comment!!!")
ProcessLine("meow mix")
ProcessLine("%")
ProcessLine("hi?")
ProcessLine("%")
ProcessLine("woot!")
ProcessLine("%")
ProcessLine("meow")
ProcessLine("%")
ProcessLine("Next message consists of exactly 1023 characters (leaving
1 for the ending \0 :p)")
ProcessLine("%")
ProcessLine("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabb")
ProcessLine("%")
ProcessLine("And this message is")
ProcessLine("a multi-line message.")
ProcessLine("Woot!")
ProcessLine("%")
ProcessLine("A simple message with no end delimiter")
ProcessLine("%")


========
Messages Dump
========
Message[0]=meow mix
Message[1]=
Message[2]=woot!
Message[3]=meow
Message[4]=Next message consists of exactly 1023 characters (leaving 1
for the ending \0 :p)
Message[5]=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaabb
Message[6]=
Woot!
Message[7]=A simple message with no end delimiter
Last Char Val=-1/ÿ
real 0m0.004s
user 0m0.000s
sys 0m0.000s
0
 
Jack Klein

I haven't spent a lot of time looking at your code, but it does
produce undefined behavior in several places.
==================== MessagesFile.c ====================
#include <stdio.h>
#include <string.h>

#define BUFFLEN 1024

#define same(x,y) strcmp(x,y)==0

int num_messages = 0;

Do you know that since file scope objects have static storage
duration, they are automatically zero initialized unless you provide a
specific initializer? The '= 0' is superfluous.
char messages[][BUFFLEN] = {};

The line above is not a legal definition with initialization. If you
changed '{}' to '{0}', it would be. It would define 'messages' as an
array of one array of BUFFLEN characters. I have no idea what your
compiler's non-standard behavior does with this due to the illegal
initializer, but I seriously doubt that it is anything much different
from:

char messages[1][BUFFLEN];
char buffer[BUFFLEN];

void ProcessLine(char *s) {
printf("ProcessLine(\"%s\")\n",s);

if (!strlen(s)) { /* Empty line */
} else if (s[0]=='#') { /* Comments begin with '#' */
} else if (same(s,"%")) { /* '%' is a message divider */
if (!strlen(buffer)) {
return;
}
/* Add the message */
strcpy(messages[num_messages++], buffer);

The second time you call this function, you are writing past the end
of 'messages'. Undefined behavior.
/* Clear the buffer */
int i;
for (i=0; i<BUFFLEN; i++) {
buffer[i] = '\0';
}


memset() would make a lot more sense here.
} else { /* Else, message */
int true_len;
if (strlen(buffer)) {
true_len = (int)snprintf(buffer, sizeof(buffer), "%s\n%s", buffer,
s);

You are passing 'buffer' as both the destination and a source string
to snprintf(), more undefined behavior.
} else {
true_len = (int)snprintf(buffer, sizeof(buffer), "%s", s);
}
if (true_len > BUFFLEN) {
printf("Message %d truncated from %d bytes to %d bytes.",
num_messages+1, true_len, BUFFLEN);
}
}
}

int main()
{
FILE *myfile;

if ((myfile = fopen("messages.txt","r"))==NULL) {
printf("Sorry, but I failed to open the file for reading.");
return 1;
}

int pos = 0;
char c;
char line[BUFFLEN] = "";

You define line as an array of BUFFLEN characters, all '\0'.
while ( (c=fgetc(myfile)) != EOF) {
if (c=='\n' || c=='\r') {
ProcessLine(line);

If a line in your input file is too long, you will overflow 'line' and
produce more undefined behavior. But let's ignore that possibility
for the moment.

Let's say the first three lines in the file are:

abc
abcdefg
123

The first time you call 'ProcessLine', you will pass it "abc" followed
by BUFFLEN - 3 '\0' characters. The second time, you will pass it
"abcdefg" followed by BUFFLEN - 7 '\0' characters. In fact as long as
each new line is as long or longer than the one before it, but not too
long to overflow the array, you are passing what you think you are
passing.

But when you read the third line, you will call your function with a
pointer to "123defg" followed by BUFFLEN - 7 '\0' characters.
for (pos=0; pos<BUFFLEN; pos++)
line[pos] = '\0';
pos = 0;
} else {
line[pos++] = c;
}
}
if (strlen(line)) {
ProcessLine(line); /* first we dump whatever remaining buffer we have
into the process function */
}
ProcessLine("%"); /* Then, for goodness sake, we make sure that
whatever buffer remaining is processed. */

fclose(myfile);

printf("\n\n========\nMessages Dump\n========\n");
int i;
for (i=0; i<num_messages; i++) {
printf("Message[%d]=%s\n", i, messages[i]);
}

printf("Last Char Val=%d/%c",EOF,EOF);

return(0);
}


There seem to be quite a few things here that need fixing. Perhaps
your problem will go away when you do.
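
On the snprintf() point above: one way to append to 'buffer' without handing it to snprintf() as both destination and source is to start writing at the existing terminator. This is only a minimal sketch. The helper name append_line and its return convention are assumptions made for the illustration, not part of the original program.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append `s` to the string already in `dst`
   (capacity `cap` bytes, terminator included), putting a '\n' in
   front of it if `dst` is not empty.  `s` must not point into `dst`,
   so the snprintf() call never reads the region it is writing.
   Returns the length the result would have had without truncation;
   a return value >= cap means the text was cut short. */
size_t append_line(char *dst, size_t cap, const char *s)
{
    assert(dst != NULL && s != NULL && cap > 0);

    size_t used = strlen(dst);              /* must be < cap on entry */
    const char *sep = used ? "\n" : "";
    int n = snprintf(dst + used, cap - used, "%s%s", sep, s);

    return n < 0 ? used : used + (size_t)n;
}
```

With something like this, the two snprintf() branches in ProcessLine() could collapse into a single call such as append_line(buffer, sizeof buffer, s), and the clearing loop could become memset(buffer, 0, sizeof buffer).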
 
Keith Thompson

Jack Klein said:
On 23 Sep 2005 17:08:00 -0700, "ncf" wrote in comp.lang.c: [...]
int num_messages = 0;

Do you know that since file scope objects have static storage
duration, they are automatically zero initialized unless you provide a
specific initializer? The '= 0' is superfluous.

Strictly speaking, yes it is, but it should be harmless. Any decent
compiler should be smart enough to recognize a zero initialization and
generate the same code as if it were left implicit.

Personally, I think it's good style to make the initialization
explicit if later code is going to depend on it. Consider what
happens if you later decide to move the declaration into a function.
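
A small sketch of the difference being described here, assuming nothing beyond standard C. The names file_counter and count_call are made up for the illustration.

```c
#include <assert.h>
#include <limits.h>

int file_counter;            /* static storage duration: starts at 0 */

/* Hypothetical function: returns how many times it has been called. */
int count_call(void)
{
    static int calls;        /* static storage as well: starts at 0 */
    int step = 1;            /* automatic storage: the value would be
                                indeterminate without the "= 1" */

    assert(calls < INT_MAX); /* guard against signed overflow */
    calls += step;
    file_counter = calls;
    return calls;
}
```

If the declaration of file_counter were later moved inside a function without 'static', it would need an explicit initializer, which is the situation the explicit '= 0' protects against.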
 
ncf

Err, yea, Sorry for all the errors, I'm rather new to C. :roll:

Thanks for the help and hopefully all this will go away.

Have a GREAT day
-Wes
 
ncf

Err, yea, myself being new to C, I had such a problem in my main()
function that reads from the file. I was getting segfaults right and
left before I finally inserted a bunch of variable dumping lines to
find out that my counter was initialized at some random value. :p

Nonetheless, thanks for the clarification as to why it's not needed
in the global scope but init is almost mandatory in the function.

-Wes
 
ncf

Ok, I've been working through reading your suggestions and trying to
alter my code appropriately in hopes that it will start working. Being
relatively new to C, I have been left with a few questions.

""" The first time you call 'ProcessLine', you will pass it "abc"
followed by BUFFLEN - 3 '\0' characters. The second time, you will
pass it "abcdefg" followed by BUFFLEN - 7 '\0' characters. In fact as
long as each new line is as long or longer than the one before it, but
not too long to overflow the array, you are passing what you think you
are passing. """

However, I did do one of the memset-style things (now converted to a
memset) after the ProcessLine() call, wouldn't that avoid the problem
you were mentioning?

"""You are passing 'buffer' as both the destination and a source string
to snprintf(), more undefined behavior. """

So...how would I go about appending to a string then? :confused:

""" char messages[][BUFFLEN] = {0}; """
After I applied that change, I received these warnings from gcc:
MessageFile.c:15: warning: missing braces around initializer
MessageFile.c:15: warning: (near initialization for `messages[0]')




Thanks for your assistance with this rudimentary code.
 
Mark McIntyre

On 24 Sep 2005 12:03:26 -0700, in comp.lang.c, "ncf" wrote:
""" char messages[][BUFFLEN] = {0}; """
After I applied that change, I received these warnings from gcc:
MessageFile.c:15: warning: missing braces around initializer
MessageFile.c:15: warning: (near initialization for `messages[0]')

Two things - a) the first dimension of an array object should not be
left empty, except in a function parameter or when the initializer
fixes its size (with {0} that is a single row), and b) you may need a
second set of braces round the zero {{0}}, as some compilers seem to
prefer one set of braces for each dimension.
 
ncf

Now this is getting confusing. So which would be more correct?
*confused*
char messages[][BUFFLEN] = {};

The line above is not a legal definition with initialization. If you
changed '{}' to '{0}', it would be. It would define 'messages' as an
array of one array of BUFFLEN characters. I have no idea what your
compiler's non-standard behavior does with this due to the illegal
initializer, but I seriously doubt that it is anything much different
from:

char messages[1][BUFFLEN];

Also, any way I define `char messages[][BUFFLEN]`, I am getting an issue
with the element I'm trying to drop in at [1]. Adding lines to the
beginning of the file only changes which element is located at that
index and, consequently, dropped.

The only way I have found to stop the dropped element problem at index
[1] is to define the variable as `char messages[100][BUFFLEN]` or such,
wherein the array isn't as flexible as it once was.

Can anyone explain to me which is right and which is wrong (beginning
of this message) and why the messages array is acting this way? :\

-Wes
 
Mark McIntyre

char messages[1][BUFFLEN];

Also, any way I define `char messages[][BUFFLEN]`, I am getting an issue
with the element I'm trying to drop in at [1].

The above array doesn't have an element 1 - it has one element,
number zero. This may explain your problem.

strcpy(messages[1], "foo") ; // can't do this.
The only way I have found to stop the dropped element problem at index
[1] is to define the variable as `char messages[100][BUFFLEN]` or such,
wherein the array isn't as flexible as it once was.

However it /is/ a legal C construct, unlike the other use!
Can anyone explain to me which is right and which is wrong (beginning
of this message) and why the messages array is acting this way? :\

If you want to read an unknown number of messages into an array of
buffers, you'll have to use malloc and realloc to create and resize
your arrays.
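
A rough sketch of what that could look like, assuming each message is stored as its own heap-allocated string. The type msg_list and the functions msg_list_add and msg_list_free are hypothetical names chosen for the example, and the growth policy (start at 8 slots, then double) is just one common choice.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable list of heap-allocated strings. */
struct msg_list {
    char  **items;   /* items[0] .. items[count-1] are valid */
    size_t  count;   /* messages stored so far */
    size_t  cap;     /* slots currently allocated */
};

/* Append a copy of `text`.  Returns 1 on success, 0 if memory ran
   out (in which case no message is added). */
int msg_list_add(struct msg_list *list, const char *text)
{
    assert(list != NULL && text != NULL);

    if (list->count == list->cap) {
        size_t new_cap = list->cap ? list->cap * 2 : 8;
        /* Use a temporary so the old block is not lost on failure. */
        char **tmp = realloc(list->items, new_cap * sizeof *tmp);
        if (tmp == NULL)
            return 0;
        list->items = tmp;
        list->cap = new_cap;
    }

    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, text);
    list->items[list->count++] = copy;
    return 1;
}

/* Release every string and the pointer array itself. */
void msg_list_free(struct msg_list *list)
{
    for (size_t i = 0; i < list->count; i++)
        free(list->items[i]);
    free(list->items);
    list->items = NULL;
    list->count = list->cap = 0;
}
```

Starting from struct msg_list list = {0};, ProcessLine() could call msg_list_add(&list, buffer) where it currently calls strcpy(), and main() would call msg_list_free(&list) before returning.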
 
Old Wolf

ncf said:
Also, any way I define `char messages[][BUFFLEN]`, I am getting an issue
with the element I'm trying to drop in at [1]. Adding lines to the
beginning of the file only changes which element is located at that
index and, consequently, dropped.

The only way I have found to stop the dropped element problem at index
[1] is to define the variable as `char messages[100][BUFFLEN]` or such,
wherein the array isn't as flexible as it once was.

You seem to think that "char messages[][BUFFLEN]" is some sort of
automatically-resizing structure. It isn't. It is effectively the
same as:

char messages[1][BUFFLEN]

ie. an array with only one row. Even less flexible than an array
with 100 rows.
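
The row count can be checked with sizeof, as in the sketch below. It also shows one way to keep a fixed-size array from being overrun. MAX_MESSAGES and the helper store_message are assumptions made for the example; only BUFFLEN comes from the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUFFLEN 1024
#define MAX_MESSAGES 100     /* assumed limit, chosen for the example */

char one_row[][BUFFLEN] = { {0} };      /* size taken from the initializer: 1 row */
char messages[MAX_MESSAGES][BUFFLEN];   /* 100 rows, still a fixed size */
size_t num_messages;

/* Row counts, computed by the compiler from the array types. */
size_t rows_one(void) { return sizeof one_row / sizeof one_row[0]; }
size_t rows_max(void) { return sizeof messages / sizeof messages[0]; }

/* Hypothetical helper: copy `text` into the next free row.
   Returns 1 on success, or 0 if the array is full or the text does
   not fit in a row, instead of writing past the end. */
int store_message(const char *text)
{
    assert(text != NULL);
    if (num_messages >= rows_max() || strlen(text) >= BUFFLEN)
        return 0;
    strcpy(messages[num_messages++], text);
    return 1;
}
```

Here rows_one() reports 1 and rows_max() reports 100. Neither array grows at run time.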
 
ncf

Mark & Old Wolf, thanks. Yea, coming from python, the idea that an
array is 100% fixed is sort of odd to hear at first. But ok, I guess it
makes sense now.

Mark, I believe you misread my post, but that's no prob as I now see
what I'm doing wrong. (In the section where you mentioned I couldn't
strcpy() onto [1], I had mentioned that I'm defining it as the string I
quoted from the source)

Either way, I'm slightly confused about the malloc/realloc thing, but
hopefully I can find some tutorial online somewhere.

Thanks for all your help
-Wes
 
