Request for source code review

U

Udyant Wig

I was following Steve Summit's C programming notes at
http://www.eskimo.com/~scs/cclass/notes/top.html
and got inspired by the dice-rolling project. I have written a naive
Gaussian distribution program that simulates throwing a specific number
of n-sided dice a given number of times. The output is the expected
bell curve along with the numerical value of each sum's occurences.

For example, the output for 10000 rolls of three six-sided dice is:

3: 51
4: 148
5: 260
6: 454
7: 686
8: 969
9: 1146
10: 1266
11: 1239
12: 1149
13: 1013
14: 713
15: 433
16: 280
17: 152
18: 41

3:
4: =
5: ==
6: ====
7: ======
8: =========
9: ===========
10: ============
11: ============
12: ===========
13: ==========
14: =======
15: ====
16: ==
17: =
18:


Being a novice, I request the readers of this newsgroup to look over the
source to suggest improvements regarding readability, layout, style,
correctness, idioms, etc. I would appreciate it very much.

I have placed the source files on Pastebin at this URL
http://pastebin.com/1PTVJrTN

In case anyone needs the Makefile, this is what I used:


CC = clang
FLAGS = -Wall -Wextra -std=c89 -pedantic

gauss: gauss.o main.o utils.o
$(CC) $(FLAGS) -o gauss gauss.o main.o utils.o
gauss.o: gauss.c
$(CC) $(FLAGS) -c gauss.c
main.o: main.c
$(CC) $(FLAGS) -c main.c
utils.o: utils.c
$(CC) $(FLAGS) -c utils.c

clean:
rm *.o gauss


Thank you.
 
J

Jorgen Grahn

I was following Steve Summit's C programming notes at
http://www.eskimo.com/~scs/cclass/notes/top.html
and got inspired by the dice-rolling project. I have written a naive
Gaussian distribution program that simulates throwing a specific number ....
I have placed the source files on Pastebin at this URL
http://pastebin.com/1PTVJrTN

In case anyone needs the Makefile, this is what I used: ....
FLAGS = -Wall -Wextra -std=c89 -pedantic ....
utils.o: utils.c
$(CC) $(FLAGS) -c utils.c

Off-topic hint: if you call the flags CFLAGS instead of FLAGS, you can
skip the object file targets. Gnu Make has builtin rules for such
things. See the manual.

/Jorgen
 
B

Ben Bacarisse

Udyant Wig said:
I was following Steve Summit's C programming notes at
http://www.eskimo.com/~scs/cclass/notes/top.html
and got inspired by the dice-rolling project. I have written a naive
Gaussian distribution program that simulates throwing a specific number
of n-sided dice a given number of times. The output is the expected
bell curve along with the numerical value of each sum's occurences.
Being a novice, I request the readers of this newsgroup to look over the
source to suggest improvements regarding readability, layout, style,
correctness, idioms, etc. I would appreciate it very much.

I have placed the source files on Pastebin at this URL
http://pastebin.com/1PTVJrTN

I think it's good.

By the way, it's easier to comment on posted code though I know you were
probably worried about posting a long listing here. It is longish, but
I think it's probably within the length where people are more likely to
read it when posted directly than when it's referenced on a sharing site.

Anyway, I have only a few of remarks:

* Why have the function 'd' in file on its own? If it's going to be used
a lot other programs, it needs a much better name than 'd', and if it
isn't, why not just put it where it is used? (Either way, 'd' is not
really a good name for it -- too sort whatever the use).

* It's widely regarded a good style to put parentheses round all
macro bodies that are anything more the a constant or an identifier.
One day you'll forget that you wrote

#define LIMIT (MAXSUM) + 1

and you'll write LIMIT * 2.

* I'd just initialise my array where it is declared:

int sums[LIMIT] = {0};

rather than have an initsums function.

* Finally, I prefer to avoid having extra variables. Rather than:

int sum; /* Sum of the dice. */
int i;

for (i = 0; i < maxrolls; i++) {
sum = sumdice (NDICE);
++sums [sum];
}

I'd write:

for (i = 0; i < maxrolls; i++)
++sums[ sumdice(NDICE) ];

For one thing, after a while, you run out of good names for them.
That's already happening here: 'sumdice' is a better name than 'sum'
i.e. the function name says more than the extra variable does.

Anyway, it's good. Don't let my nit-picking get you down.
 
K

Keith Thompson

Udyant Wig said:
I have placed the source files on Pastebin at this URL
http://pastebin.com/1PTVJrTN

In case anyone needs the Makefile, this is what I used:


CC = clang
FLAGS = -Wall -Wextra -std=c89 -pedantic

gauss: gauss.o main.o utils.o
$(CC) $(FLAGS) -o gauss gauss.o main.o utils.o
gauss.o: gauss.c
$(CC) $(FLAGS) -c gauss.c
main.o: main.c
$(CC) $(FLAGS) -c main.c
utils.o: utils.c
$(CC) $(FLAGS) -c utils.c

clean:
rm *.o gauss

This isn't directly a C issue, but your Makefile doesn't reflect any
dependencies on the *.h files. For example, if you edit gauss.h and
type "make", it won't rebuild anything.

And you might as well use "-std=c99", or even "-std=c11" if you have a
new enough version of clang.
 
U

Udyant Wig

| I think it's good.
Thank you.

| By the way, it's easier to comment on posted code though I know you were
| probably worried about posting a long listing here. It is longish, but
| I think it's probably within the length where people are more likely to
| read it when posted directly than when it's referenced on a sharing site.
I had originally wanted to post the entire source here, but I had no
idea about the acceptable source post length.

I will keep your advice in mind, though.

| Anyway, I have only a few of remarks:
| * Why have the function 'd' in file on its own? If it's going to be used
| a lot other programs, it needs a much better name than 'd', and if it
| isn't, why not just put it where it is used? (Either way, 'd' is not
| really a good name for it -- too sort whatever the use).
Makes sense.

| * It's widely regarded a good style to put parentheses round all
| macro bodies that are anything more the a constant or an identifier.
| One day you'll forget that you wrote
| #define LIMIT (MAXSUM) + 1
|
| and you'll write LIMIT * 2.
I can see now how my current scheme could lead to nasty macroexpansion
bugs when used carelessly.

| * I'd just initialise my array where it is declared:
|
| int sums[LIMIT] = {0};
|
| rather than have an initsums function.
I had forgotten about zero-initialization for arrays. That function of
mine seems highly redundant now.

| * Finally, I prefer to avoid having extra variables. Rather than:
|
| int sum; /* Sum of the dice. */
| int i;
|
| for (i = 0; i < maxrolls; i++) {
| sum = sumdice (NDICE);
| ++sums [sum];
| }
|
| I'd write:
|
| for (i = 0; i < maxrolls; i++)
| ++sums[ sumdice(NDICE) ];
|
| For one thing, after a while, you run out of good names for them.
| That's already happening here: 'sumdice' is a better name than 'sum'
| i.e. the function name says more than the extra variable does.
Sounds good.

| Anyway, it's good. Don't let my nit-picking get you down.
I think your remarks have made the code clearer to me.
 
U

Udyant Wig

| This isn't directly a C issue, but your Makefile doesn't reflect any
| dependencies on the *.h files. For example, if you edit gauss.h and
| type "make", it won't rebuild anything.
I will have to learn to do that, then.

| And you might as well use "-std=c99", or even "-std=c11" if you have a
| new enough version of clang.
I have been focusing on ANSI C90, the language as described in K&R 2nd
ed. Sadly, there won't be new editions of K&R to expound the later C
standards.

I understand that some of the useful new features of C99 include:
* variable length arrays
* a Boolean type
* C++ comments
* `for' loop variable declaration
* struct initialization
* the long long type
* complex numbers

Does there exist learning material for C99 as there does for C90?
Or would I be better off learning C99 as a `patch' for C90?
 
I

Ian Collins

This isn't directly a C issue, but your Makefile doesn't reflect any
dependencies on the *.h files. For example, if you edit gauss.h and
type "make", it won't rebuild anything.

A decent make should provide a means for automating this. Otherwise
where do you stop recursing up the include paths?
 
J

James Kuyper

On 08/30/12 01:55 PM, Keith Thompson wrote: ....

A decent make should provide a means for automating this. Otherwise
where do you stop recursing up the include paths?

I've used a "decent" make that was part of a configuration management
system called ClearCase, which fully automated the tracking of
dependencies, but by rather drastic means. Whenever clearmake executed
the build script for a given target, it would monitor the processes that
were executed, keeping track of ALL files that were opened during the
build. If they were versioned files managed by ClearCase, it identified
which version of the file was currently visible; for other files it just
looked at the file date. The next time that the same target was built,
it would check to see whether any of the files that had been opened the
last time had been changed, and if so, it assumed that the build script
had to be re-executed.

One of the key things that made this work was that the make facility was
tightly integrated with a special device driver that managed access to
disk space; only the use of files stored on disk volumes managed by that
driver could be detected.

After paying the rather hefty license fees for such sophisticated
software, guess how it was used: whenever we delivered new code, CM
would build it using the option that forced a complete re-build,
regardless of dependencies.
 
M

Malcolm McLean

בת×ריך ×™×•× ×¨×‘×™×¢×™, 29 ב×וגוסט 2012 20:19:06 UTC+1, מ×ת Udyant Wig:
I was following Steve Summit's C programming notes at

Being a novice, I request the readers of this newsgroup to look over the
source to suggest improvements regarding readability, layout, style,
correctness, idioms, etc. I would appreciate it very much.
It's a lot better than most newbie code. But you need to think a bit
more about what makes a good function.

d() is a good function, but a bad nmae. It rolls a ide and returns the
result. It's intuitive, it's resuable, it represents a natural module.

However some of the printing functions seem to exist purely to get code out
of main(). When main() becomes so long that it's unreadable then it's
necessary to break it down. But when the program is quite simple, it's better
not to hide things in functions.


This function for example is really just a wrapper for a printing loop

/* Print the number of occurrences of the various sums. */


void printsums (int sums [], int limit) {

int i;


for (i = NDICE; i < limit; i++)
printf ("%2d: %d\n", i, sums );

}


then it's got another problem with it, which is that it's taking one
limit as an argument and the other as a compile time constant. So you're
creating confusion. Don't be afraid to pass a compile time constant as
a parameter. Often that makes functions more understandable and more
general.


histsums() however is more substantial, and drawing a histogram is a natural
unit of work. So the case for this being in a function is a lot stronger.
You could also write a top-level fucntion printoutput(), if you want a tidy
main().
 
B

Ben Bacarisse

Udyant Wig said:
| This isn't directly a C issue, but your Makefile doesn't reflect any
| dependencies on the *.h files. For example, if you edit gauss.h and
| type "make", it won't rebuild anything.
I will have to learn to do that, then.

| And you might as well use "-std=c99", or even "-std=c11" if you have a
| new enough version of clang.
I have been focusing on ANSI C90, the language as described in K&R 2nd
ed. Sadly, there won't be new editions of K&R to expound the later C
standards.

I understand that some of the useful new features of C99 include:
* variable length arrays
* a Boolean type
* C++ comments
* `for' loop variable declaration
* struct initialization

They are called "compound literals". "Old" C has struct initialisation,
so that's not a good term for it. What C99 adds is the ability to
"denote" an unnamed struct object:

plot((struct point){3, 4});
* the long long type
* complex numbers

Does there exist learning material for C99 as there does for C90?
Or would I be better off learning C99 as a `patch' for C90?

I'd do it that way. If you find K&R's style suits you, there is
really no better way to start and C99's extras are small.

VLAs are the most involved, I think, and they have been made optional in
C11. That's a crying shame to my mind, but that's how things have
panned out. No one who's implemented VLAs will remove the support, but
it makes them a second-class feature of the language.
 
U

Udyant Wig

| Off-topic hint: if you call the flags CFLAGS instead of FLAGS, you can
| skip the object file targets. Gnu Make has builtin rules for such
| things. See the manual.
That worked.

I will have to study the manual at some point.

| /Jorgen
 
U

Udyant Wig

| It's a lot better than most newbie code. But you need to think a bit
| more about what makes a good function.
|
| d() is a good function, but a bad nmae. It rolls a ide and returns the
| result. It's intuitive, it's resuable, it represents a natural module.
I can think of:
* die
* roll_die
* n_sided_die

| However some of the printing functions seem to exist purely to get code out
| of main(). When main() becomes so long that it's unreadable then it's
| necessary to break it down. But when the program is quite simple, it's better
| not to hide things in functions.
|
| This function for example is really just a wrapper for a printing loop
|
|
| /* Print the number of occurrences of the various sums. */
|
| void printsums (int sums [], int limit) {
|
| int i;
|
| for (i = NDICE; i < limit; i++)
| printf ("%2d: %d\n", i, sums );
| }
|
|
| then it's got another problem with it, which is that it's taking one
| limit as an argument and the other as a compile time constant. So you're
| creating confusion. Don't be afraid to pass a compile time constant as
| a parameter. Often that makes functions more understandable and more
| general.
I have used this convention in two other functions (histogram &
histsums). I should redesign them too.

| histsums() however is more substantial, and drawing a histogram is a natural
| unit of work. So the case for this being in a function is a lot stronger.
| You could also write a top-level fucntion printoutput(), if you want a tidy
| main().
My aim was to make clear the flow of the program in main. Anything
that did some specific work, I moved to its own function.

But, I get your point.

Incidentally, when faced with an unfamiliar codebase, where do you
begin reading?
 
B

Ben Bacarisse

Udyant Wig said:
| It's a lot better than most newbie code. But you need to think a bit
| more about what makes a good function.
|
| d() is a good function, but a bad nmae. It rolls a ide and returns the
| result. It's intuitive, it's resuable, it represents a natural module.
I can think of:
* die
* roll_die
* n_sided_die

It's not really anything about dice. It picks a number in [1,n] with
uniform probability. I'd probably go with something like uniform_int_in
and then make the lower bound a parameter too:

int uniform_int_in(int min, int max);
| However some of the printing functions seem to exist purely to get code out
| of main(). When main() becomes so long that it's unreadable then it's
| necessary to break it down. But when the program is quite simple, it's better
| not to hide things in functions.
|
| This function for example is really just a wrapper for a printing loop
|
|
| /* Print the number of occurrences of the various sums. */
|
| void printsums (int sums [], int limit) {
|
| int i;
|
| for (i = NDICE; i < limit; i++)
| printf ("%2d: %d\n", i, sums );
| }
|
|
| then it's got another problem with it, which is that it's taking one
| limit as an argument and the other as a compile time constant. So you're
| creating confusion. Don't be afraid to pass a compile time constant as
| a parameter. Often that makes functions more understandable and more
| general.
I have used this convention in two other functions (histogram &
histsums). I should redesign them too.

| histsums() however is more substantial, and drawing a histogram is a natural
| unit of work. So the case for this being in a function is a lot stronger.
| You could also write a top-level fucntion printoutput(), if you want a tidy
| main().
My aim was to make clear the flow of the program in main. Anything
that did some specific work, I moved to its own function.


Personally, I prefer too see too many functions rather than too few.
Over time, laziness (if you at all like me) will cause you to get
a better balance. It's much harder to get into the habit of hiving off
code into functions than it is to get out of it.
But, I get your point.

Incidentally, when faced with an unfamiliar codebase, where do you
begin reading?

If you are lucky, the READ_ME file or the huge explanatory comment at
the top of main.c! Failing that, I start with main() to see if I can
get the fell of what's happening, but I will probably jump around other
modules too just to get familiar with what is where.
 
I

Ian Collins

I've used a "decent" make that was part of a configuration management
system called ClearCase, which fully automated the tracking of
dependencies, but by rather drastic means. Whenever clearmake executed
the build script for a given target, it would monitor the processes that
were executed, keeping track of ALL files that were opened during the
build. If they were versioned files managed by ClearCase, it identified
which version of the file was currently visible; for other files it just
looked at the file date. The next time that the same target was built,
it would check to see whether any of the files that had been opened the
last time had been changed, and if so, it assumed that the build script
had to be re-executed.

One of the key things that made this work was that the make facility was
tightly integrated with a special device driver that managed access to
disk space; only the use of files stored on disk volumes managed by that
driver could be detected.

I've had hours of fun with clearmake...

Sun's (d)make uses the output of the preprocessor phase of compiles to
collate dependency information. Rather more elegant and a lot cheaper!
 
U

Udyant Wig

I thank all readers who responded to my request.

I have revised the source taking into account many of the remarks made.
Follows the new listing.

/* utils.h begins */

#include <stdlib.h>

extern int uniform_int_in (int min, int max);

/* utils.h ends */



/* utils.c begins */

#include "utils.h"

/* Pick a number in [1, n] with uniform probability. */
int
uniform_int_in (int min, int max) {
return min + (rand () / (RAND_MAX / max + 1));
}

/* utils.c ends */



/* gauss.h begins */

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#include "utils.h"


#define NDICE 5 /* The number of dice to be rolled. */
#define SIDES 6 /* Each die has these many sides. */
#define MAXSUM (SIDES * NDICE) /* The maximum possible sum of NDICE
dice.
*/
#define LIMIT ((MAXSUM) + 1) /* This constant is used:
0) to declare the array of sums
1) as a loop limit
*/
#define ROLLS 100000 /* The number of throws of the dice */
#define SCALE 100 /* Used to fit the histogram into the
screen width. A lower value
implies longer bars.
*/

extern int roll_die (int n);
extern int sumdice (int ndice);
extern void fillsums (int sums [], int maxrolls);
extern void printsums (int sums [], int begin_limit, int end_limit);
extern void histogram (int sum, int n, int scale);
extern void histsums (int sums [], int begin_limit, int end_limit);

/* gauss.h ends */



/* gauss.c begins */

#include "gauss.h"

/* Given n, throw an n-sided die. */
int
roll_die (int n) {
return uniform_int_in (1, n);
}


/* Roll the given number of dice and return their sum. */
int
sumdice (int ndice) {
int i;
int sum = 0;

for (i = 0; i < ndice; i++)
sum += roll_die (SIDES);

return sum;
}

/* For the specified number of dice throws: calculate the sum of the
given number of dice, and increment the array slot whose index is
the sum.
*/
void
fillsums (int sums [], int maxrolls) {
int i;

for (i = 0; i < maxrolls; i++)
++sums [sumdice (NDICE)];
}

/* Print the number of occurrences of the various sums. */
void
printsums (int sums [], int begin_limit, int end_limit) {
int i;

for (i = begin_limit; i < end_limit; i++)
printf ("%2d: %d\n", i, sums );
}

/* Print a horizontal bar histogram for a given sum. */
void
histogram (int sum, int n, int scale) {
int i;

printf ("%2d: ", sum);
for (i = 0; i < n / scale; i++)
putchar ('=');
putchar ('\n');
}

/* Print a histogram of the occurrences. */
void
histsums (int sums [], int begin_limit, int end_limit) {
int i;

if ((NDICE > 0) && (SIDES > 1)) {
for (i = begin_limit; i < end_limit; i++)
histogram (i, sums , SCALE);
}
else {
if (NDICE < 1) {
fputs ("NDICE set to insane value\n", stderr);
fputs ("So, no histogram.\n", stderr);
}
if (SIDES < 2) {
fputs ("SIDES set to insane value\n", stderr);
fputs ("So, no histogram.\n", stderr);
}
}
}

/* gauss.c ends */



/* main.c begins */

#include "gauss.h"

int
main (void) {
int sums [LIMIT] = {0}; /* We use only indices
NDICE to LIMIT - 1
for indexing the sums of
the given of dice. We waste
sizeof (int) * NDICE
bytes.
*/

srand (time (NULL)); /* Seed the RNG using the
current time.
*/
fillsums (sums, ROLLS); /* Count the sums. */
printsums (sums, NDICE, LIMIT); /* Print how many of each sum
counted.
*/
putchar ('\n');
histsums (sums, NDICE, LIMIT); /* Print a histogram of
the same.
*/

exit (EXIT_SUCCESS); /* Done. So, exit. */
}

/* main.c ends */
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,040
Latest member
papereejit

Latest Threads

Top