what's wrong with this code from a book?

B

ben

this programme from sedgewick's 'algorithms in c parts 1-4' "is a
sample client program that ... uses a symbol table to find the distinct
values in a sequence of keys (randomly generated or read from standard
input), then prints them out in sorted order."


#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

void main(int argc, char *argv[])
{
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
Key v;
Item item;
stInit(maxn);
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
}


it crashes -- Bus Error -- when there's no input on the command line.
when some arguments are given it reliably gives incorrent, rubbish
results but that's probably a problem in another part of the code not
shown.

from the code shown, it's not at all surprising that it goes wrong when
no arguments are given is it? i'm not sure if i should be changing the
code as the above code is exactly how it is in the book, but it should
be checking if argv[1] exists before using it shouldn't it?, and if it
doesn't exist then the random part should be called? or is the problem
something else, maybe not in the above code? i don't know. how does
anyone think it should be?

thanks, ben.
 
D

dandelion

ben said:
void main(int argc, char *argv[])
{
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
it crashes -- Bus Error -- when there's no input on the command line.

That does not surprize me. The code presented does not check wether or not
the arguments are actually present. In the case you mention any reference to
argv[1] or argv[2] (as above) would dereference an element of the argv[]
array that does not exist. Hence undefined behavior is the result.

Furthermore,

void main(whatever)

is incorrect. Main should return an integer.

It's easily remedied by

<code>
int main(argc, char* argv[])
{
if(argc < 3) /* argv[0] --> program, argv[1]--> first argument ... */
{
fprintf(stderr, "Insufficient arguments\n");
exit(1);
}

... etc ...

return 0;
}

from the code shown, it's not at all surprising that it goes wrong when
no arguments are given is it?

Nope. I would be surprized if it worked.

i'm not sure if i should be changing the
code as the above code is exactly how it is in the book, but it should
be checking if argv[1] exists before using it shouldn't it?

Check argc instead, which will give you the number of items in argv[].
argv[0] (by convention?POSIX) should give you the command that started your
program.
thanks, ben.

My pleasure.
 
B

ben

Check argc instead, which will give you the number of items in argv[].
argv[0] (by convention?POSIX) should give you the command that started your
program.
thanks, ben.

My pleasure.

great. i kind of knew all that but wasn't completely sure -- needed
confirming -- so thanks very much for the clarification.

still confused on part of it though -- this part in the book about data
used by the below code: "... (randomly generated or read from standard
input) ...". what would the command line call look like that makes the
below programme randomly generate the data do you think? (in fact i'm
not even sure about the command when feeding it data.
../a.out 4 2 4 6 8
seems a reasonable guess, the first number saying how many numbers
follow it, but i get useless result from that but as i said previously
i think that maybe a problem in the other code modules so that command
line might be right)

so, all that in short, how do you think this programme is supposed to
be called (to supply data, and make it randomly generate it)?

(the type of data that the support files are using are ints, but as i
say there's possibly something wrong with those files but before i
tackle those i want to make sure this part is as it should be)


#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3)
fprintf(stderr, "Insufficient arguments\n"), exit(1);
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
Key v;
Item item;
stInit(maxn);
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}


thanks, ben.
 
D

dandelion

ben said:
Check argc instead, which will give you the number of items in argv[].
argv[0] (by convention?POSIX) should give you the command that started your
program.
thanks, ben.

My pleasure.

great. i kind of knew all that but wasn't completely sure -- needed
confirming -- so thanks very much for the clarification.

still confused on part of it though -- this part in the book about data
used by the below code: "... (randomly generated or read from standard
input) ...". what would the command line call look like that makes the
below programme randomly generate the data do you think? (in fact i'm
not even sure about the command when feeding it data.
./a.out 4 2 4 6 8
seems a reasonable guess, the first number saying how many numbers
follow it, but i get useless result from that but as i said previously
i think that maybe a problem in the other code modules so that command
line might be right)

so, all that in short, how do you think this programme is supposed to
be called (to supply data, and make it randomly generate it)?

(the type of data that the support files are using are ints, but as i
say there's possibly something wrong with those files but before i
tackle those i want to make sure this part is as it should be)


#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3) {
fprintf(stderr, "Insufficient arguments\n"); exit(1);
}

The curly-braces *do* belong there and a semicolon ';' marks the end of
a statement, not a comma.
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
Key v;
Item item;

The declarations go *before* the first statement.
stInit(maxn);
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}

So that would become

int main(int argc, char* argv[])
{
int n, maxn;
Key v;
Item item;

if(argc < 3)
{
/* no command line args given */
fprintf(stderr, "Insufficient arguments, using random values\n");
maxn = random() * WHATEVER_MAXN/MAX_RAND;
sw = random() * WHATEVER_SW/MAX_RAND;
}
else
{
/* command line args given */
maxn = atoi(argv[1]);
sw = atoi(argv[2]);
}

/* at this point maxn and sw ae defined either way */
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}

The above will *still* break if non-numeric arguments are provided, btw. If
you want to handle that aswell, use sscanf instead of atoi and check the
results.
 
B

ben

dandelion said:
#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3) {
fprintf(stderr, "Insufficient arguments\n"); exit(1); }
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
Key v;
Item item;
stInit(maxn);
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}
So that would become

int main(int argc, char* argv[])
{
int n, maxn;
Key v;
Item item;

if(argc < 3)
{
/* no command line args given */
fprintf(stderr, "Insufficient arguments, using random values\n");
maxn = random() * WHATEVER_MAXN/MAX_RAND;
sw = random() * WHATEVER_SW/MAX_RAND;
}
else
{
/* command line args given */
maxn = atoi(argv[1]);
sw = atoi(argv[2]);
}

/* at this point maxn and sw ae defined either way */
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}


i'm pretty sure you're not expected to write a new random generation
part. the random generation part comes from this line i think: 'v =
itemRand();'. that function certainly creates a random 'item' which is
in this case is just an int.

i know there are bits of code missing from the code in the book, but i
don't think there's quite such main chunks missing like that -- i could
be wrong though

thing is, if new but unecessary bits are bolted on, when in actual fact
there's already stuff there that does that but just isn't quite working
for whatever reason, i'm going to end up in even more of a mess with
this than i am now.

thanks very much for your answer but i don't think new random parts
should be written. do you not think it's the 'v = itemRand();' line?

thanks very much, ben
 
D

dandelion

ben said:
dandelion said:
#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3) {
fprintf(stderr, "Insufficient arguments\n"); exit(1); }
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
Key v;
Item item;
stInit(maxn);
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}
So that would become

int main(int argc, char* argv[])
{
int n, maxn;
Key v;
Item item;

if(argc < 3)
{
/* no command line args given */
fprintf(stderr, "Insufficient arguments, using random values\n");
maxn = random() * WHATEVER_MAXN/MAX_RAND;
sw = random() * WHATEVER_SW/MAX_RAND;
}
else
{
/* command line args given */
maxn = atoi(argv[1]);
sw = atoi(argv[2]);
}

/* at this point maxn and sw ae defined either way */
for( n = 0; n < maxn; n++ ) {
if( sw )
v = itemRand();
else if( itemScan(&v) == EOF )
break;
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}


i'm pretty sure you're not expected to write a new random generation
part. the random generation part comes from this line i think: 'v =
itemRand();'. that function certainly creates a random 'item' which is
in this case is just an int.

Right, i guess. Eventhough I have not seen the definition of itemRand(). But
I do not replace it. The code I supplied merely replaced the *missing*
arguments with random ones.
i know there are bits of code missing from the code in the book, but i
don't think there's quite such main chunks missing like that -- i could
be wrong though

The code you presented originally only works if there are two command-line
arguments. Without these arguments, the program would not work. Your
question, unless I misunderstood, was:
"How to make it work". I changed your code a bit, but that was certainly no
"major brain surgury".

thing is, if new but unecessary bits are bolted on, when in actual fact
there's already stuff there that does that but just isn't quite working
for whatever reason, i'm going to end up in even more of a mess with
this than i am now.

Sorry to hear that. Take the code I wrote as a suggestion. It's certainly
not intended
as more than that.
thanks very much for your answer but i don't think new random parts
should be written. do you not think it's the 'v = itemRand();' line?

That supplies the random content. It does not supply any defaults for the
case no command line arguments are present. Besides, I left *that* line
untouched. if you worry about

<quote meself>
maxn = random() * WHATEVER_MAXN/RAND_MAX;
sw = random() * WHATEVER_SW/RAND_MAX;
</quote>

You can just replace "random() * blablabla..." by any integer constant you
like and get more predictable behavior.
thanks very much, ben

My pleasure.

PS. MAX_RAND (in previous post) should read RAND_MAX (errare humanum est).
 
K

Keith Thompson

ben said:
#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3)
fprintf(stderr, "Insufficient arguments\n"), exit(1);
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
[...]

You've done something very odd here, probably unintentionally. As
dandelion mentioned, statements are separated (actually terminated) by
semicolons, not commas, and if you want multiple statements to be
controlled by an if(), you need to enclose them in curly braces.
(Personally, I use curly braces even if there's a single statement,
but that's not mandatory.)

What you've done here is use the comma operator to combine the
fprintf() call and the exit() call into a single expression, which
then becomes a single statement when you add the final semicolon.
There are rules about the value yielded by an expression with a comma
operator, but they don't matter here because you're not using the
result.

If you don't understand that, don't worry about it; it suffices to say
that you've accidentally done something that happens to work, but it
would be horribly bad style if you did it deliberately. (You can't
depend on the compiler to find all your errors.)

What you want is:

if (argc < 3) {
fprintf(stderr, "Insufficient arguments\n");
exit(EXIT_FAILURE);
}

exit(EXIT_FAILURE) is more portable than exit(1).

Also, within a block (the stuff enclsed in '{' and '}') you normalliy
need to have all the declarations first, followed by all the
statements. C++ and C99 allow you to mix them, and some pre-C99
compilers might allow it as an extension, but it's a bad idea to
depend on it. You can either move the statements (and the variable
initializations) down after the declarations:

int main(int argc, char *argv[])
{
int n, maxn, sw;

if(argc < 3)
fprintf(stderr, "Insufficient arguments\n");
exit(1);
}

maxn = atoi(argv[1]);
sw = atoi(argv[2]);
...
}

or introduce a new block:

int main(int argc, char *argv[])
{
if(argc < 3)
fprintf(stderr, "Insufficient arguments\n");
exit(1);
}
{
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
...
}
...
}
 
J

josh

ben said:
so, all that in short, how do you think this programme is supposed to
be called (to supply data, and make it randomly generate it)?

Guessing from function names...
#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3)
fprintf(stderr, "Insufficient arguments\n"), exit(1);
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);

maxn is the first argument,
sw is the second
Key v;
Item item;
stInit(maxn);
for( n = 0; n < maxn; n++ ) {

The first argument is the maximum number of numbers that will be processed.
if( sw )
v = itemRand();

If the second command-line argument is non-zero, it will use a random
number for v. (note that anything that isn't a number becomes zero,
because of atoi)
else if( itemScan(&v) == EOF )

Otherwise it pulls a number from standard input. If the input stream
ends before the maximum number of numbers has been read,

it stops.
if( stSearch(v) != NULL )
continue;
key(item) = v;
stInsert(item);
}
stSort(itemShow);
printf("\n%d keys %d distinct keys\n", n, stCount() );
return 0;
}

-josh
 
B

ben

The code you presented originally only works if there are two command-line
arguments. Without these arguments, the program would not work. Your
question, unless I misunderstood, was:
"How to make it work". I changed your code a bit, but that was certainly no
"major brain surgury".

yes, absolutely -- thanks. i'd missed out info. sorry. i'm just trying
to ascertain how the example code is intended to be.

the book gives example code and it has an interchangeable nature to it
(using abstract data types) -- certain parts stay the same for an array
of examples so i wanted to change it as little as possible to not break
and use that intended mechanism.

thanks for help. think i've got it now nearly.

thanks, ben.
 
B

ben

josh said:
ben said:
so, all that in short, how do you think this programme is supposed to
be called (to supply data, and make it randomly generate it)?

Guessing from function names...
#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3)
fprintf(stderr, "Insufficient arguments\n"), exit(1);
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);

maxn is the first argument,
sw is the second
Key v;
Item item;
stInit(maxn);
for( n = 0; n < maxn; n++ ) {

The first argument is the maximum number of numbers that will be processed.
if( sw )
v = itemRand();

If the second command-line argument is non-zero, it will use a random
number for v. (note that anything that isn't a number becomes zero,
because of atoi)
else if( itemScan(&v) == EOF )

Otherwise it pulls a number from standard input. If the input stream
ends before the maximum number of numbers has been read,


yes i've nearly sussed this now -- i made a very stupid mistake -- for
some reason i got mixed up between feeding the programme with data via
arguments from the command line, and standard input :/ so in actual
fact the input via standard input (and the programme's output) is
working now in that respect. the random aspect isn't working yet but i
think i can see what's wrong -- it's just a silly little thing i think.

phew! i don't know why that caused me such a problem -- the idea of
having to second guess / fill in made it seem much harder than
orignating it from scratch almost.

thanks, ben.
 
B

ben

Keith Thompson said:
ben said:
#include <stdio.h>
#include <stdlib.h>
#include "item.h"
#include "st.h"

int main(int argc, char *argv[])
{
if(argc < 3)
fprintf(stderr, "Insufficient arguments\n"), exit(1);
int n, maxn = atoi(argv[1]), sw = atoi(argv[2]);
[...]

You've done something very odd here, probably unintentionally. As
dandelion mentioned, statements are separated (actually terminated) by
semicolons, not commas, and if you want multiple statements to be
controlled by an if(), you need to enclose them in curly braces.
(Personally, I use curly braces even if there's a single statement,
but that's not mandatory.)

What you've done here is use the comma operator to combine the
fprintf() call and the exit() call into a single expression, which
then becomes a single statement when you add the final semicolon.
There are rules about the value yielded by an expression with a comma
operator, but they don't matter here because you're not using the
result.

If you don't understand that, don't worry about it; it suffices to say
that you've accidentally done something that happens to work, but it
would be horribly bad style if you did it deliberately. (You can't
depend on the compiler to find all your errors.)

it may or may not be a good idea, but it's not an error though i don't
think. from the the k&r book (page 62-63):

comma operators should be used sparingly. the most suitable uses are
for constructs strongly related to each other, as in the 'for' loop
..... and in macros where a multistep computation has to be a single
expression. a comma expression might also be appropriate for the
exchange of elements in 'reverse', where the exchange can be thought of
as a single operation:

for( i = 0, j = strlen(s)-1; i < j; i++, j-- )
c = s, s = s[j], s[j] = c;


thanks, ben.
 
C

Charlie Gordon

comma operators should be used sparingly. the most suitable uses are
for constructs strongly related to each other, as in the 'for' loop
.... and in macros where a multistep computation has to be a single
expression. a comma expression might also be appropriate for the
exchange of elements in 'reverse', where the exchange can be thought of
as a single operation:

for( i = 0, j = strlen(s)-1; i < j; i++, j-- )
c = s, s = s[j], s[j] = c;


This code works only for i and j signed
If you define i or j as size_t, it will crash for s = ""
Horrible style by all counts !

Chqrlie
 
K

Keith Thompson

ben said:
it may or may not be a good idea, but it's not an error though i don't
think. from the the k&r book (page 62-63):
[snip]

Perhaps "errors" was a poor choice of words on my part.

It's not an "error" in the sense of something that a compiler is
expected to complain about. It is an error in the more generic sense
of being a bad idea.

Stringing expressions together with comma operators in a context were
it would make sense to make them distinct expression statements
surrounded by curly braces is a bad idea.
for( i = 0, j = strlen(s)-1; i < j; i++, j-- )
c = s, s = s[j], s[j] = c;


That's equivalent to:

for( i = 0, j = strlen(s)-1; i < j; i++, j-- ) {
c = s;
s = s[j];
s[j] = c;
}

Or, if you insist on putting the assignments on one line:

for( i = 0, j = strlen(s)-1; i < j; i++, j-- ) {
c = s; s = s[j]; s[j] = c;
}
 
M

Michael Wojcik

dandelion said:
Check argc instead, which will give you the number of items in argv[].
argv[0] (by convention?POSIX) should give you the command that started your
program.

C requires that if argc > 0, argv[0] "represents the program name"
(C90 5.1.2.3). However, this actually constitutes the standard's
definition of "the program name" - the program's name is whatever's
in argv[0] - so it has little normative force, except to describe a
convention and imply a quality-of-implementation issue.

POSIX makes no promises about the contents of argv[0], and in fact
provides the means for it to be anything the program's environment
(specifically the parent process, though that's outside the scope of
C) wants.

argv[0] can also be NULL for a program running on a POSIX-compliant
implementation.

Various POSIX-specified programs run other programs, and when they do
argv[0] should be the pathname of the executable. And POSIX implemen-
tations should set argv[0] to the pathname of the executable when a
program is started with the standard C mechanism (system). But aside
from that, at the POSIX API level it's just a convention. Real C
programs running on POSIX systems shouldn't assume that argv[0] is
non-NULL or otherwise reliable; that's a bug (and for privileged
processes a potential security hole).
 
D

dandelion

Michael Wojcik said:
dandelion said:
Check argc instead, which will give you the number of items in argv[].
argv[0] (by convention?POSIX) should give you the command that started your
program.

C requires that if argc > 0, argv[0] "represents the program name"
(C90 5.1.2.3). However, this actually constitutes the standard's
definition of "the program name" - the program's name is whatever's
in argv[0] - so it has little normative force, except to describe a
convention and imply a quality-of-implementation issue.

POSIX makes no promises about the contents of argv[0], and in fact
provides the means for it to be anything the program's environment
(specifically the parent process, though that's outside the scope of
C) wants.

argv[0] can also be NULL for a program running on a POSIX-compliant
implementation.

Various POSIX-specified programs run other programs, and when they do
argv[0] should be the pathname of the executable. And POSIX implemen-
tations should set argv[0] to the pathname of the executable when a
program is started with the standard C mechanism (system). But aside
from that, at the POSIX API level it's just a convention. Real C
programs running on POSIX systems shouldn't assume that argv[0] is
non-NULL or otherwise reliable; that's a bug (and for privileged
processes a potential security hole).

Agreed.

However, I made no claims about anything being guaranteed and as you
indicate yourself, it should (in the everyday sense of the word) be there,
whether or not it is informative, is not exactly the point. The point was
merely to indicate why command line arguments start at argv[1] (C arrays
being zero-based and all).

regards,

dandelion.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,787
Messages
2,569,631
Members
45,338
Latest member
41Pearline46

Latest Threads

Top