Resources about program design in C

Albert

Hi,

I've recently become *fascinated* by breaking C programs into
functions and by what characteristics make up a good function. K&R 2nd
Ed. has a somewhat limited number of programs that are also briefly
written in pseudocode, and Steve Summit's site has a few notes about the
role of functions. While I'm mostly hooked on function design, I've also
got an interest in the reasoning behind global variables in C programs.
Can anyone point me to any other resources that cover the above
interests?

TIA
Albert
 
vippstar

Read HtDP and SICP, great books although they do not use C.

<http://www.htdp.org/>
<http://mitpress.mit.edu/sicp/>
 
Antoninus Twink

Read HtDP and SICP, great books although they do not use C.

Yeah great, except that the OP explicitly said he's interested in C, not
Lisp or Scheme.
 
Guest

I've recently been *fascinated* about breaking C programs into
functions and what characteristics make up a good function.

a function should do one thing well

K&R 2nd
Ed. have a somewhat limited number of programs also briefly written in
pseudocode and Steve Summit's site has a few notes about the role of
functions. While I'm mostly hooked onto function design, I've also got
an interest in the reasoning behind global variables in C programs.

Don't use 'em! Or rather, use as few as possible. In small filter
(text processing) programs I may have a few global flags like
"verbose" or "debug". In larger programs some sort of error logger
may be all that's really global.

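A minimal sketch of that "few global flags" style, assuming a hypothetical filter that upper-cases text. The names `verbose`, `set_verbose` and `upcase_line` are made up for illustration; the point is only that the data still travels through parameters while the flag affects nothing but diagnostics.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* File-scope flag (hypothetical): set once while parsing options,
 * read wherever diagnostics are printed. */
static int verbose = 0;

void set_verbose(int on)
{
    verbose = on;
}

/* Upper-case src into dst, writing at most dstsize - 1 characters plus
 * the terminator. Returns the number of characters written.
 * Input and output go through the parameters; only the logging
 * depends on the global flag. */
size_t upcase_line(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL);
    if (dstsize == 0)
        return 0;
    while (src[n] != '\0' && n + 1 < dstsize) {
        dst[n] = (char)toupper((unsigned char)src[n]);
        n++;
    }
    dst[n] = '\0';
    if (verbose)
        fprintf(stderr, "upcase_line: wrote %lu chars\n", (unsigned long)n);
    return n;
}
```

Calling `set_verbose(1)` changes what is logged to stderr, not what the function returns, which is roughly why a flag like this is tolerable as a global.
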
Can anyone point me to any other resources that have information about
the above interests?

Program design is largely language independent, and this sort
of low-level design was neglected for a while. You could try
"Refactoring" by Fowler, but it's rather OO-centric. Try to
read a library copy before you buy it.

The refactoring people aim for many small, well-written
"modules" (functions, classes, whatever).

If you can find old books:

I liked this one at the time, but it's regarded as dated now
(I don't entirely agree):
"Structured Design"
www.amazon.com/exec/obidos/ISBN=0138544719/theinternationscA/

Still a great book (but you have to put up with
FORTRAN and PL/I):
"Elements of Programming Style"
 
Chris McDonald

a function should do one thing well
......
don't use 'em! Or rather use as few as possible. With small filter
(text processing) programs I may have a few global flags like
"verbose" or "debug". In larger programs some sort of error logger
may be all that's really global.


Agreed; all beautifully described in chapters of:

The Art of Unix Programming
Eric Steven Raymond
http://www.faqs.org/docs/artu/index.html

still a great book (but you have to put up with
FORTRAN and PL/I)
"Elements of Programming Style"

and
The Practice of Programming
by Brian W. Kernighan and Rob Pike.
http://cm.bell-labs.com/cm/cs/tpop/
 
Keith Thompson

George Peter Staplin said:
I agree with most of what is written here:
http://www.lysator.liu.se/c/pikestyle.html

You can certainly do worse than following that.
[...]

Consistency of style is more important than (almost) any specific
style guideline. The only thing worse than a style I dislike is a
mixture of different styles.

But I disagree fairly strongly with this rule:

    Simple rule: include files should never include include files.
    If instead they state (in comments or implicitly) what files
    they need to have included first, the problem of deciding
    which files to include is pushed to the user (programmer)
    but in a way that's easy to handle and that, by construction,
    avoids multiple inclusions. Multiple inclusions are a bane
    of systems programming. It's not rare to have files included
    five or more times to compile a single C source file. The Unix
    /usr/include/sys stuff is terrible this way.

    There's a little dance involving #ifdef's that can prevent a file
    being read twice, but it's usually done wrong in practice - the
    #ifdef's are in the file itself, not the file that includes it.
    The result is often thousands of needless lines of code passing
    through the lexical analyzer, which is (in good compilers)
    the most expensive phase.

    Just follow the simple rule.

IMHO, an include file should define an interface (possibly implemented
partly via the use of other include files), and a single #include
directive should be sufficient to use that interface.

Suppose "foo.h" depends on "bar.h" today, but a new version depends
instead on "blorg.h", for reasons irrelevant to client code. If foo.h
just includes the headers it needs, I don't need to update my source
code when it changes. As for the "little dance" of #ifdef's, I've
found that it's almost always done *correctly* (except that the macro
names often intrude on the implementation's namespace). And I'm not
at all convinced that the time spent re-scanning headers is a big
problem (I'm sure someone has measured this, but I don't have any
numbers). I seem to recall that some compilers can recognize the
common pattern and can avoid re-scanning the entire file (though this
could theoretically make such a compiler non-conforming in the
presence of some really ugly code).

Having said that, if I worked on a project that imposed this as a
coding standard, I'd certainly follow it.
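
For readers who haven't seen the "little dance", here is a rough single-file simulation of the self-contained-header style argued for above. The headers `point.h` and `rect.h`, their guard macros, and `rect_area` are all hypothetical; the text of `point.h` is pasted twice to stand in for a repeated #include.

```c
/* ---- point.h (hypothetical), first inclusion ---- */
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct point { int x, y; };
#endif /* POINT_H_INCLUDED */

/* ---- rect.h (hypothetical): it needs struct point, so it includes
 *      point.h itself rather than asking the client to do it ---- */
#ifndef RECT_H_INCLUDED
#define RECT_H_INCLUDED

/* In a real source tree this would read: #include "point.h"
 * The pasted copy below is skipped because the guard macro is
 * already defined, so struct point is not defined twice. */
#ifndef POINT_H_INCLUDED
#define POINT_H_INCLUDED
struct point { int x, y; };
#endif /* POINT_H_INCLUDED */

struct rect { struct point min, max; };
int rect_area(struct rect r);

#endif /* RECT_H_INCLUDED */

/* ---- rect.c (hypothetical) ---- */
#include <assert.h>

int rect_area(struct rect r)
{
    /* Precondition: min is the lower-left corner, max the upper-right. */
    assert(r.max.x >= r.min.x && r.max.y >= r.min.y);
    return (r.max.x - r.min.x) * (r.max.y - r.min.y);
}
```

A client only ever writes `#include "rect.h"`; if rect.h later stops depending on point.h, the client source doesn't change. The guard names here avoid leading underscores so they stay out of the implementation's namespace.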
 
Ben Bacarisse

I think that Pike is saying that the error (the incorrect dance) is
any form that puts the guard in the file, rather than the (rarely
seen) form that prevents the include. It may be that you interpreted
him the same way as I did, and your comment is meant to imply that you
disagree with his view of what is incorrect. If so, sorry for the noise,
but maybe the clarification helped others.

A good analogy is the common tendency to write:

#ifdef __cplusplus
extern "C" {
#endif
...
#ifdef __cplusplus
}
#endif


in a C .h file rather than providing another, three-line, C++ include
file that says:

extern "C" {
#include "c_decls.h"
}

<snip>
 
Stephen Sprunk

George said:
Most of the time I design programs by starting with the data I want or need
to store. So, I start with a data structure, and then I add functions and
procedures to create and destroy the data structure, and to manipulate the
actual data itself. I may just write a brief outline for the design of
those functions, or what I want them to do. Then I build them from the
bottom up.

"Show me your flowcharts and conceal your tables, and I shall continue
to be mystified. Show me your tables, and I won’t usually need your
flowcharts; they’ll be obvious."
-- Fred Brooks, _The Mythical Man-Month_


The vast majority of failed software projects I've seen started down
their dark path by violating the simple principle that you design the
data structures first, then write the (obvious) code to manipulate them.
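
A small sketch of "tables first, code second", assuming a made-up calculator with named operations. Everything here (`struct op`, `op_table`, `apply_op`) is hypothetical and only illustrates how the code falls out once the table exists.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The data structure comes first: one row per supported operation. */
struct op {
    const char *name;
    int (*fn)(int, int);
};

static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }

static const struct op op_table[] = {
    { "add", op_add },
    { "sub", op_sub },
    { "mul", op_mul },
};

/* With the table in place, the code is a plain lookup.
 * Returns 1 and stores the result on success, 0 if name is unknown. */
int apply_op(const char *name, int a, int b, int *result)
{
    size_t i;

    assert(name != NULL && result != NULL);
    for (i = 0; i < sizeof op_table / sizeof op_table[0]; i++) {
        if (strcmp(op_table[i].name, name) == 0) {
            *result = op_table[i].fn(a, b);
            return 1;
        }
    }
    return 0;
}
```

Adding an operation means adding a row; `apply_op` itself stays untouched.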

S
 
Keith Thompson

Ben Bacarisse said:
I think that Pike is saying that the error (the incorrect dance) is
any form that puts the guard in the file rather than the form (rarely
seen) that prevents the include. It may be that you interpreted him
the same way as I, and your comment is meant to imply that you
disagree with view of what is incorrect. If so, sorry for the noise
but maybe the clarification helped others.

Yes, I think you're right, and I misinterpreted what Pike said. He's
saying that the #ifdef should be in the file that includes the header,
not in the header itself. And I still disagree with him. Putting the
#ifdef in the includer requires the client to know both the name of
the header file and the name of the macro; the latter, IMHO, shouldn't
be something a client should have to worry about. But then, Pike
doesn't like the #ifdef method anyway; he's just suggesting what he
thinks is a less bad way of using it.

He's advocating adding clutter to code that uses a header to save the
compiler some effort. The compiler works for me; I don't work for it.

(It would be nice if C had some higher-level of module management than
textual inclusion, but that's not going to change any time soon.)
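
To make the contrast concrete, here is a rough single-file sketch of the guard-in-the-includer form being discussed. The header `point.h`, the macro `POINT_H` and the function `point_sum` are hypothetical; the first "inclusion" is simulated by pasting the header text inline.

```c
#include <assert.h>

/* ---- first inclusion of hypothetical point.h, pasted inline ----
 * Under this scheme the header defines its macro but never tests it. */
#define POINT_H
struct point { int x, y; };

/* ---- client code, later in the same translation unit ----
 * The includer tests the macro, so the file is never opened again.
 * Directives inside a skipped conditional group are not processed,
 * which is why this compiles even with no point.h on disk. */
#ifndef POINT_H
#include "point.h"
#endif

int point_sum(const struct point *p)
{
    assert(p != 0);
    return p->x + p->y;
}
```

The cost is visible in the client: it has to know both the file name and the macro name, and every includer repeats the three-line test.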
 
John Bode

No links, but I'll offer my not-so-humble opinions.

Good function design attempts to satisfy the following guidelines:

1. A function should perform a single, well-defined task. This task
may be broken down into smaller subtasks (each with its own
function). For example, sorting an array is a single, well-defined
task; sorting an array and opening a network connection are two
unrelated tasks that don't belong in the same function.

2. A function should be reusable, both within the same program and
among different programs. For example, look at the C standard
library. You can reuse functions like printf() or malloc() to your
heart's content; they are not bound to a specific program.

3. A function should isolate its implementation details from the
larger program. This helps make the function easier to reuse as well
as protecting against introducing bugs in the larger program by
editing the function. In other words, when your main program calls
a function to sort an array, it doesn't care *how* the array gets
sorted, just that it's sorted in the correct order. You should be
able to change the sorting logic without affecting the larger program
at all.

4. Expanding on 3, a function should communicate with its caller
through its parameter list and return value *only*; it should not rely
on global variables to communicate with the caller. This exposes
implementation details to the caller, as well as making the function
hard to reuse.
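
As a sketch of guidelines 3 and 4 together: a sort routine whose only contact with the caller is its parameter list. The names `sort_ints` and `is_sorted` are hypothetical, and the use of qsort() is just one possible implementation that could be swapped without the caller noticing.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback for qsort(). Written with comparisons rather
 * than subtraction so it cannot overflow. */
static int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}

/* Sort a[0..n-1] in ascending order. Everything the function needs
 * arrives through its parameters; it touches no global state. */
void sort_ints(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    if (n > 1)
        qsort(a, n, sizeof a[0], cmp_int);
}

/* Return 1 if a[0..n-1] is in ascending order, 0 otherwise. */
int is_sorted(const int *a, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return 0;
    return 1;
}
```

Because the caller sees only `sort_ints(array, count)`, the body could be replaced by a hand-written insertion sort and nothing else in the program would change.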

Global variables should be used very sparingly. There are some
situations where they're the right solution, but more often than not a
better solution is available. They're typically used to persist some
state information between function calls, where that information is
not being stored in the calling program.

For a very contrived example, imagine a module that implements a stack
using an array for the stack contents and an integer for the stack
pointer. The stack manipulation functions (push and pop) need to
maintain the values of the stack and stack pointer between calls, but
that information is not being saved by the main program (since that's
an implementation detail, and we want to hide implementation details
from the main program). So you can define two file-scope (global)
variables that are visible to the push and pop functions, like so:

/* stack.c */
#define STACK_SIZE ...

/**
 * The static keyword limits the visibility of the variables
 * to this specific source file
 */
static int stack_ptr;
static int stack_data[STACK_SIZE];

void init(void)
{
    stack_ptr = STACK_SIZE;
}

int push(int data)
{
    int ret = 1;
    if (stack_ptr > 0)
        stack_data[--stack_ptr] = data;
    else
        ret = 0; /* stack overflow */
    return ret;
}

int pop(int *data)
{
    int ret = 1;
    if (stack_ptr < STACK_SIZE)
        *data = stack_data[stack_ptr++];
    else
        ret = 0; /* stack underflow */
    return ret;
}

/* stack.h */
extern void init(void);
extern int push(int data);
extern int pop(int *data);

Now, there are easily a hundred better ways to implement a stack that
don't rely on global variables and still isolate implementation
details from the calling program (and allow you to create more than
one stack per program): don't take the above code as an example of a
*good* stack design. This is simply an example off the top of my head
of how global variables can be used to maintain state information.
 
Nelu

John Bode wrote:
Global variables should be used very sparingly. There are some
situations where they're the right solution, but more often than not a
better solution is available. They're typically used to persist some
state information between function calls, where that information is not
being stored in the calling program.

For a very contrived example, imagine a module that implements a stack
using an array for the stack contents and an integer for the stack
pointer.
[snip]

For a much less contrived example of where globals make sense, consider
localization of your program.

That example usually works for a fall-back default behavior.
Just how are you going to change the
language of every part of your human interface without using some global
variable? You aren't, because this is a global change to your program.

In Java, for example, methods whose execution depends on a locale usually
allow you to specify the locale as a parameter to make it more general.
There are also the alternatives that don't require the locale as a
parameter because they use the default.
 
Kaz Kylheku

There are alternatives. One is to have a pointer to an
environment struct that is passed through function calling
sequences. Another, perhaps friendlier, is to have a routine
that returns a pointer to the environment struct. An extension
of this is to have an environment id of some kind; the routine
returns a pointer to the selected environment struct. This makes
it easier to have multiple locales, etc, at the same time.

Trouble is that many functions don't know anything about this environment,
but you may need to smuggle it through these functions.

So what do you do? Add an environment argument to functions all
over the program which do nothing but pass it down to other functions?

Or do you stick the environment pointer into a global variable?

:)
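
One way the "routine that returns a pointer to the environment struct" idea might look, including the environment-id extension. All names (`struct env`, `env_select`, `env_get`, `current_decimal_point`) and the table contents are assumptions for illustration. Note that the selection is still file-scope state, just hidden behind functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical environment: only two settings, to keep the sketch short. */
struct env {
    const char *lang;
    char decimal_point;
};

enum { ENV_DEFAULT, ENV_GERMAN, ENV_COUNT };

static const struct env env_table[ENV_COUNT] = {
    { "en", '.' },
    { "de", ',' },
};

/* Which table entry env_get() hands out; private to this file. */
static int env_current = ENV_DEFAULT;

/* Select an environment by id. Returns 1 on success, 0 for a bad id. */
int env_select(int id)
{
    if (id < 0 || id >= ENV_COUNT)
        return 0;
    env_current = id;
    return 1;
}

/* Return the currently selected environment. */
const struct env *env_get(void)
{
    assert(env_current >= 0 && env_current < ENV_COUNT);
    return &env_table[env_current];
}

/* A leaf function fetches the environment itself, so the functions
 * between it and main() need no extra parameter. */
char current_decimal_point(void)
{
    return env_get()->decimal_point;
}
```

This avoids smuggling a pointer through every intermediate function, at the price of the same hidden coupling a plain global would have.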
 
Guest

The problem that your rule quote is addressing:
is that the number of files that a system can have open at once,
is usually a small number, like 2 digits small;
When compiling a project of many files,
the number of files #included by other files,
even with ordinary header guards,
can get real high, real fast, real easy.

When compiling a project of less than many files,
it's nothing to worry about.

I have never encountered this problem. And I've seen projects
with 1000's of files.
 
Nate Eldredge

pete said:
The problem that your rule quote is addressing:
is that the number of files that a system can have open at once,
is usually a small number, like 2 digits small;
When compiling a project of many files,
the number of files #included by other files,
even with ordinary header guards,
can get real high, real fast, real easy.

2 digits? Are we living in 1988? (If so, set FILES=100 in your
CONFIG.SYS.)

My system allows over 11000 files open at once. 1024 is another common
limit. Those with only 255 per process are considered pretty
restrictive these days.

If any halfway reasonable structuring of source files runs you into your
system's open file limit, then you need a better development system, not
a new source file scheme.
 
John Bode

I appreciate that you are giving an illustrative example that
isn't meant to be used.  That said, an alternative to static
state variables is to use a struct that holds the state
variables.  Thus

#include <stdlib.h>

struct stack_s {
    int stack_ptr;      /* index of the next free slot */
    int *stack_data;    /* 0 while the stack is closed */
};

void stack_open(struct stack_s *stack)
{
    stack->stack_ptr  = 0;
    /* room for STACK_SIZE elements of the pointed-to type */
    stack->stack_data = malloc(STACK_SIZE * sizeof *stack->stack_data);
}

void stack_close(struct stack_s *stack)
{
    free(stack->stack_data);
    stack->stack_data = 0;
}

void stack_push(struct stack_s *stack, int datum)
{
    stack->stack_data[stack->stack_ptr++] = datum;
}

int stack_pop(struct stack_s *stack)
{
    return stack->stack_data[--stack->stack_ptr];
}

Obviously I haven't put in the error checks; I'm just
illustrating the structure.  Some points:  In the caller's
declaration code I would have

    struct stack_s stack = {0};

so that the caller's stack is guaranteed to start as closed,
i.e., the routines can test whether stack_data is 0 or not.
As a more general remark, I feel that fixed size arrays are a bad
idea unless the size is guaranteed by the nature of the problem.
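
A rough sketch of what the struct-based stack might look like with the error checks put in and with a growable array instead of a fixed size. This is an assumption-laden variant, not the code above: the names `struct stack`, `count` and `capacity`, the starting capacity of 8, and the doubling policy are all made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct stack {
    size_t count;     /* number of items currently stored */
    size_t capacity;  /* number of items data can hold */
    int *data;        /* NULL while the stack is closed */
};

/* Release the storage and return the stack to the closed state. */
void stack_close(struct stack *s)
{
    assert(s != NULL);
    free(s->data);
    s->data = NULL;
    s->count = s->capacity = 0;
}

/* Push datum, growing the array when it is full.
 * Returns 1 on success, 0 if memory could not be obtained. */
int stack_push(struct stack *s, int datum)
{
    assert(s != NULL);
    if (s->count == s->capacity) {
        size_t newcap = s->capacity ? s->capacity * 2 : 8;
        /* realloc(NULL, n) behaves like malloc(n), so a
         * zero-initialised stack needs no separate open step. */
        int *p = realloc(s->data, newcap * sizeof *p);
        if (p == NULL)
            return 0;
        s->data = p;
        s->capacity = newcap;
    }
    s->data[s->count++] = datum;
    return 1;
}

/* Pop the top item into *datum.
 * Returns 1 on success, 0 on underflow. */
int stack_pop(struct stack *s, int *datum)
{
    assert(s != NULL && datum != NULL);
    if (s->count == 0)
        return 0;
    *datum = s->data[--s->count];
    return 1;
}
```

A caller would start from `struct stack s = {0};`, and can have as many independent stacks as it likes, which the file-scope version cannot offer.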

YMMV

Richard Harter, http://home.tiac.net/~cri, http://www.varinoma.com
Save the Earth now!!
It's the only planet with chocolate.

Yes, exactly, that's the right way to do it. I just remember that
particular example from an edition of H&S, and contrived as it is, I
couldn't think of a better one.

Shows how much I actually use the damn things.
 
Ian Collins

Yes, exactly, that's the right way to do it. I just remember that
particular example from an edition of H&S, and contrived as it is, I
couldn't think of a better one.

Shows how much I actually use the damn things.

You also forgot how to snip!
 
Phil Carmody

Nate Eldredge said:
2 digits? Are we living in 1988? (If so, set FILES=100 in your
CONFIG.SYS.)

My system allows over 11000 files open at once. 1024 is another common
limit. Those with only 255 per process are considered pretty
restrictive these days.

If any halfway reasonable structuring of source files runs you into your
system's open file limit, then you need a better development system, not
a new source file scheme.

Indeed. Even a 2-digit number of files, say 30, would permit nesting
of include files 30 levels deep, which is vastly different from the
limit of 2 (source and include file, no recursion) being suggested.
Any sane C compiler will close every file handle once it's reached the
end of that file.

Phil
 
Guest

a localisation library.

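void app_code (void)
{
    if (no_more_therbligs)
        /* "%s" keeps a stray % in a translated message from being
           treated as a conversion specification */
        fprintf (stderr, "%s", localise_msg (THERBLIG_ALLOCATION_ERROR));
}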

The localisation library needs to be told what language (etc.)
it needs to use, which it holds as a static variable internal
to the library. You can even change it on the fly. There are libraries
that provide support for this sort of stuff. It can't handle multiple
localisations in the same program without some sort of context
variable, but that sounds more esoteric to me. Normally only
the GUI needs localising, and then each GUI can, potentially,
use a different language.

even exposing the environmental variable seems like
it's exposing a bit too much of the low level machinery.


I wouldn't expose the environmental variable at all.
Ok, I might provide an access function but I wouldn't
expect it to be heavily used.

if (get_lang() == LANG_DE)
{
    enlarge_verb_stack();
}
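
A minimal sketch of such a library, assuming the language lives in a static variable that only the library's own file can name. `localise_msg`, `get_lang`, `LANG_DE` and `THERBLIG_ALLOCATION_ERROR` come from the snippets above; `set_lang`, `LANG_EN`, the enum layout and the message texts are invented for illustration.

```c
#include <assert.h>

enum lang { LANG_EN, LANG_DE, LANG_COUNT };
enum msg_id { THERBLIG_ALLOCATION_ERROR, MSG_COUNT };

/* Private to the library: no other source file can name this. */
static enum lang current_lang = LANG_EN;

/* Hypothetical message catalogue, indexed by language then message. */
static const char *const messages[LANG_COUNT][MSG_COUNT] = {
    { "out of therbligs\n" },
    { "keine Therbligs mehr\n" },
};

/* Change the language on the fly. */
void set_lang(enum lang l)
{
    assert((int)l >= 0 && (int)l < LANG_COUNT);
    current_lang = l;
}

/* Access function; not expected to be heavily used. */
enum lang get_lang(void)
{
    return current_lang;
}

/* Return the text of message id in the current language. */
const char *localise_msg(enum msg_id id)
{
    assert((int)id >= 0 && (int)id < MSG_COUNT);
    return messages[current_lang][id];
}
```

Application code never sees `current_lang`; it calls `set_lang(LANG_DE)` once at start-up (or when the user switches language) and then only asks for messages.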

arg! You don't declare *anything* in main().


encapsulated libraries. Objects if you want to use buzzwords

You have made the assumption that they weren't already in a single struct. It
is always good practice to gather similar things together.


If you aren't declaring it inside main then, just where are you proposing to
declare it?  

in a library. That is a C file other than the one that holds main().
Have you worked with a large program (>100 files)?

Inside a function?  If it isn't to be global you can't declare it
at file scope as would be the case of an included file!


That may depend on the function of the program.  I can see a computation program
needing much different passing than say an interactive game program.

does a "computation program" need localisation?

Recursion could muddy that water a bit, but in general I grant most of the time
of a normal program isn't spent in the calling sequences.


You again make two assumptions.  First that they already aren't in a single
struct and second that the problem being solved is something that lends itself
to being put into a library

why can't you use a library? This isn't an assumption; almost
anything (there's a challenge!) can be abstracted away
to some sort of library.
 
