Thoughts on file organisation

EventHelix.com

Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

Thanks for any input!

DM

The following article should help with defining a strategy for header
file includes.

http://www.eventhelix.com/RealtimeMantra/HeaderFileIncludePatterns.htm

The article is written for C++, but most of the ideas it presents
apply to C as well.
 
Keith Thompson

cr88192 said:
Keith Thompson said:
cr88192 said:
Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

also, be careful with terms like "obviously" as well.
if one is wrong, then it can make them look arrogant and/or stupid...

now, not all non-static functions are part of the "public interface"
either (this is especially true once the complexity of a library moves
much past "trivial").

If a function isn't part of the public interface, why would you not
declare it as static?

for a trivial library, maybe.

but for probably "most" of us, I would think, our libraries are
anywhere from 10 to 50 kloc, and may involve some number of
interdependent source files.

as a result, different parts of the library will depend on each other,
and thus need to be able to see each others' functions, however, the
client for this library, may have no good reason to see any of these
functions (instead being given a specific public API).

as a result, many non-public functions can't be static either...

as such, 'static' usually ends up being restricted to very narrowly
defined and context-specific functions, that we can be pretty sure
will not need to be called from another source file...

You're right. There are different levels of "publicness"; a function
that shouldn't be visible to clients of a library might well need to
be visible to other source files within the same library. The problem
is that C provides only two levels rather than an arbitrary hierarchy
that might better represent an application's actual requirements. In
practice, it's not too hard to deal with this; it mostly requires some
care in the choice of identifiers so that things that aren't supposed
to be visible don't cause collisions.
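
To make the naming point concrete, here is a minimal sketch of the three levels being discussed. It assumes a made-up library prefix; `mylib_open_count`, `mylib_internal_bump`, `clamp_nonneg` and `open_count` are names invented for this illustration and do not come from anyone's code in the thread.

```c
/* Sketch only: every identifier below is hypothetical. */

/* Shared state, private to this source file. */
static int open_count = 0;

/* Level 1: static, so invisible outside this one source file. */
static int clamp_nonneg(int v)
{
    return v < 0 ? 0 : v;
}

/* Level 2: external linkage, because other source files of the same
 * library need to call it. C cannot hide it from clients, so the
 * "mylib_internal_" prefix marks it as off limits and avoids collisions.
 * It would be declared in a private header, not in mylib.h. */
int mylib_internal_bump(int delta)
{
    open_count = clamp_nonneg(open_count + delta);
    return open_count;
}

/* Level 3: public API, the only function a client is meant to call. */
int mylib_open_count(void)
{
    return open_count;
}
```

The prefix is only a convention: the linker still exports `mylib_internal_bump`, so a client that ignores the convention can call it.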
 
Army1987

David said:
Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

There is no point in placing them in mylib.h. If they are only used in
mylib.c, place them in mylib.c. If they're used in other files which are
part of the library implementation, place them in a separate file.

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter. Obviously,
I'll need to #include<stdio.h> at the top of mylib.h to get the FILE
structure defined.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

If there is no reason why every program calling functions from mylib.c
needs the declarations in stdlib.h, there is no point in including it in
the header used for the interface.
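
A rough sketch of that split, squeezed into one listing so it compiles on its own. The two banner comments mark where the hypothetical mylib.h and mylib.c would begin; `mylib_write_repeated` and `repeat_word` are invented names, not functions from the thread.

```c
/* ---- mylib.h (public interface, hypothetical) ---- */
#ifndef MYLIB_H
#define MYLIB_H

#include <stdio.h>   /* needed here: the prototype below mentions FILE */

/* Writes `word` to `out` `times` times. Returns 0 on success, -1 on error. */
int mylib_write_repeated(FILE *out, const char *word, int times);

#endif /* MYLIB_H */

/* ---- mylib.c (implementation, hypothetical) ---- */
/* A real mylib.c would start with: #include "mylib.h" */
#include <stdlib.h>  /* malloc, free: used only by the implementation */
#include <string.h>  /* strlen, memcpy */

/* Private helper: static, declared and defined in the .c file only. */
static char *repeat_word(const char *word, int times)
{
    size_t len = strlen(word);
    char *buf;
    int i;

    if (times < 0)
        return NULL;
    buf = malloc(len * (size_t)times + 1);
    if (buf == NULL)
        return NULL;
    for (i = 0; i < times; i++)
        memcpy(buf + len * (size_t)i, word, len);
    buf[len * (size_t)times] = '\0';
    return buf;
}

int mylib_write_repeated(FILE *out, const char *word, int times)
{
    char *buf = repeat_word(word, times);
    int rc;

    if (buf == NULL)
        return -1;
    rc = (fputs(buf, out) == EOF) ? -1 : 0;
    free(buf);
    return rc;
}
```

A client that includes only mylib.h gets `FILE` through the header's own `#include <stdio.h>`, and never sees `<stdlib.h>` or `repeat_word`.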
 
fnegroni

David said:
And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter. Obviously,
I'll need to #include<stdio.h> at the top of mylib.h to get the FILE
structure defined.

There is a school of thought (see Plan9) that says you should not
include headers within headers. If your header file mylib.h depends
on stdlib.h for its declarations, you should only state that
dependency (in a doc file or a comment at the beginning of the header,
or indeed both) and let the user decide which stdlib.h to include.

To this effect the authors of Plan9 (the original authors of unix +
some), recommend not to use header guards.
 
Al Balmer

There is a school of thought (see Plan9) that says you should not
include headers within headers. If your header file mylib.h depends
on stdlib.h for its declarations, you should only state that
dependency (in a doc file or a comment at the beginning of the header,
or indeed both) and let the user decide which stdlib.h to include.

To this effect the authors of Plan9 (the original authors of unix +
some), recommend not to use header guards.

Really? What's the perceived benefit?
 
Jensen Somers

Al said:
Really? What's the perceived benefit?

I don't see any benefits using this approach.

When you have a header file defining some project-specific types and you
use those types in function declarations, people will expect you to include
that header in the header file containing the function declarations.
Not doing so will make the code a lot harder for most people to work with
and debug.

- Jensen
 
fnegroni

Really? What's the perceived benefit?

Flexibility, standards compliance in hosted implementations,
cleaner headers, fewer maintenance headaches.

Here is an extract from the Plan9 C compiler manual/tutorial:

Every Plan 9 C program begins

#include <u.h>

because all the other installed header files use the typedefs declared
in <u.h>.
In strict ANSI C, include files are grouped to collect related
functions in a single file: one for string functions, one for memory
functions, one for I/O, and none for system calls. Each include file
is protected by an #ifdef to guarantee its contents are seen by the
compiler only once. Plan 9 takes a completely different approach.
Other than a few include files that define external formats such as
archives, the files in /sys/include correspond to libraries. If a
program is using a library, it includes the corresponding header. The
default C library comprises string functions, memory functions, and so
on, largely as in ANSI C, some formatted I/O routines, plus all the
system calls and related functions. To use these functions, one must
#include the file <libc.h>, which in turn must follow <u.h>, to define
their prototypes for the compiler. Here is the complete source to the
traditional first C program:

#include <u.h>
#include <libc.h>

void
main(void)
{
    print("hello world\n");
    exits(0);
}

The print routine and its relatives fprint and sprint resemble the
similarly named functions in Standard I/O but are not attached to a
specific I/O library. In Plan 9 main is not integer valued; it should
call exits, which takes a string (or null; here ANSI C promotes the 0
to a char*) argument. All these functions are, of course, documented
in the Programmer's Manual.
To use printf, <stdio.h> must be included to define the function
prototype for printf:

#include <u.h>
#include <libc.h>
#include <stdio.h>

void
main(int argc, char *argv[])
{
    printf("%s: hello world with %d arguments\n", argv[0], argc-1);
    exits(0);
}
 
fnegroni

The above follows from a paper by Rob Pike about headers and header
guards. A quick look on Google should bring up the paper quickly.
 
Al Balmer

Flexibility, standards compliance in hosted implementations,
cleaner headers, fewer maintenance headaches.

Here is an extract from the Plan9 C compiler manual/tutorial:

The given extract does not recommend against using header guards, nor
does it actually recommend not including headers within headers. It
does give one example of not doing so, but no rationale. Perhaps there
are reasons not to change an existing libc.h.

IAC, this is sure to *add* to maintenance headaches, not reduce them,
and the only flexibility it adds is the opportunity to scratch your
head about which headers are missing. Sorry, I don't see the benefit.
 
christian.bau

My personal rule would be: Any header file should compile fine if it
is included at the top of a source file. And any header file should
compile fine if it is included twice. So

myfile.c:
#include "mylib.h"

and

myfile.c:
#include "mylib.h"
#include "mylib.h"

should both compile without problems. Forward declarations of structs
can help with this, if you don't want to include too many files in
mylib.h. Guards inside the file (#ifndef MYLIB_HEADER_ #define
MYLIB_HEADER_ <real code> #endif) help as well.

I'd hope that any preprocessor nowadays should be clever enough not to
actually read header files with guards twice.
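
A small sketch of both ideas, guards and forward declarations, in one listing. The guarded block is written out twice to imitate what the compiler sees after two `#include "mylib.h"` lines. Only `MYLIB_HEADER_` comes from the post; `struct shape`, `struct canvas`, `shape_area` and `shape_fits` are invented for the example.

```c
/* ---- first inclusion of a hypothetical mylib.h ---- */
#ifndef MYLIB_HEADER_
#define MYLIB_HEADER_

/* Forward declaration: enough to declare pointer parameters, so this
 * header does not have to include whichever header defines the struct. */
struct canvas;

struct shape { int width; int height; };

int shape_area(const struct shape *s);
int shape_fits(const struct shape *s, const struct canvas *c);

#endif /* MYLIB_HEADER_ */

/* ---- second inclusion: the guard makes the preprocessor skip it ---- */
#ifndef MYLIB_HEADER_
#define MYLIB_HEADER_
struct shape { int width; int height; };  /* would be a redefinition error */
#endif /* MYLIB_HEADER_ */

/* ---- implementation side, where the full struct is known ---- */
struct canvas { int width; int height; };

int shape_area(const struct shape *s)
{
    return s->width * s->height;
}

/* Returns 1 if the shape fits inside the canvas, 0 otherwise. */
int shape_fits(const struct shape *s, const struct canvas *c)
{
    return s->width <= c->width && s->height <= c->height;
}
```

Deleting the `#ifndef`/`#endif` pair around the second copy should make the compiler reject the repeated definition of `struct shape`, which is the failure the guard exists to prevent.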
 
Flash Gordon

cr88192 wrote, On 27/01/08 23:03:

and the documentation for mylib says it depends on vector.lib, or rather
myvector.lib since it is in this example provided by the same person.

this kind of thing is annoyingly common, and will often end up
necessitating duplicating many smaller pieces of code from one place to
another.

So you like breeding bugs then? After all, if there is a bug in the
routine you have just copied it...

as a result, we end up with slight variations of cross-product spread
throughout the project...

Leading to yet more bugs and not being able to even copy the bug fix
over without risk of introducing more bugs...

still, some minor duplication is often somewhat preferable to a dependency.

a duplication, is a matter of a small amount of bloat.

No, splitting it out into a proper separate library is better.

a dependency, OTOH, ends up imposing on the client as well (oh crap, can
no longer use libfoo, because it uses a few trivial string functions
from libbar, and a port of libbar does not exist for the current
architecture, ...).

Well, since libbar is code you wrote you just port it. You would have
had to port the code if it was in libfoo anyway.

or, at least, it is annoying to want to use one thing, and have to
"discover" and link in bunches of other things (oh, now WTF lib do I
need for 'function_with_too_damn_many_words_but_no_library_name'...).

This is one of the purposes of documentation.
 
fnegroni

The given extract does not recommend against using header guards, nor
does it actually recommend not including headers within headers. It
does give one example of not doing so, but no rationale. Perhaps there
are reasons not to change an existing libc.h.

IAC, this is sure to *add* to maintenance headaches, not reduce them,
and the only flexibility it adds is the opportunity to scratch your
head about which headers are missing. Sorry, I don't see the benefit.

That's the opinion of Al Balmer.

Let's see what Rob Pike's opinion is, shall we?

Who is Rob Pike? http://herpolhode.com/rob/

Extract from http://www.lysator.liu.se/c/pikestyle.html

Include files

Simple rule: include files should never include include files.
If instead they state (in comments or implicitly) what files they need
to have included first, the problem of deciding which files to include
is pushed to the user (programmer) but in a way that's easy to handle
and that, by construction, avoids multiple inclusions. Multiple
inclusions are a bane of systems programming. It's not rare to have
files included five or more times to compile a single C source file.
The Unix /usr/include/sys stuff is terrible this way.
There's a little dance involving #ifdef's that can prevent a
file being read twice, but it's usually done wrong in practice - the
#ifdef's are in the file itself, not the file that includes it. The
result is often thousands of needless lines of code passing through
the lexical analyzer, which is (in good compilers) the most expensive
phase.

Just follow the simple rule.
 
Flash Gordon

fnegroni wrote, On 28/01/08 20:59:
The given extract does not recommend against using header guards, nor
does it actually recommend not including headers within headers. It
does give one example of not doing so, but no rationale. Perhaps there
are reasons not to change an existing libc.h.

IAC, this is sure to *add* to maintenance headaches, not reduce them,
and the only flexibility it adds is the opportunity to scratch your
head about which headers are missing. Sorry, I don't see the benefit.

That's the opinion of Al Balmer.

It is the opinion of rather more than just Al Balmer.

Let's see what Rob Pike's opinion is, shall we?

Who is Rob Pike? http://herpolhode.com/rob/

Extract from http://www.lysator.liu.se/c/pikestyle.html

Things have moved on a bit since 1989.

Include files

Simple rule: include files should never include include files.
If instead they state (in comments or implicitly) what files they need
to have included first, the problem of deciding which files to include
is pushed to the user (programmer) but in a way that's easy to handle

I.e. giving the programmer more work.

and that, by construction, avoids multiple inclusions. Multiple
inclusions are a bane of systems programming.

Not these days they are not.

It's not rare to have
files included five or more times to compile a single C source file.
The Unix /usr/include/sys stuff is terrible this way.

Well, this tends to suggest that some other well known people disagree
with Pike.

There's a little dance involving #ifdef's that can prevent a
file being read twice, but it's usually done wrong in practice - the
#ifdef's are in the file itself, not the file that includes it. The

Actually that is the correct place for them because then they only have
to be written once. Generally it is better to let the computer do the
monotonous work than to leave it to the programmer.

result is often thousands of needless lines of code passing through
the lexical analyzer, which is (in good compilers) the most expensive
phase.

This is not an issue with modern compilers.

Just follow the simple rule.

A rule that many people disagree with for good reasons.
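
For anyone who has not seen the two guard placements side by side, a sketch. The header name point.h, the macro `POINT_H`, `struct point` and `point_sum` are all hypothetical. The first block is the usual internal guard, written once inside the header. The second is the external test described in the Pike extract, which every including file has to repeat.

```c
/* ---- body of a hypothetical point.h, with an internal guard ---- */
#ifndef POINT_H
#define POINT_H

struct point { int x; int y; };

#endif /* POINT_H */

/* ---- a client using the external style from the extract ----
 * The test sits in the including file, so on a repeat inclusion the
 * header is never even opened. The cost is that every includer must
 * know the macro name and write this test itself. */
#ifndef POINT_H
#include "point.h"   /* skipped in this listing: POINT_H is already defined */
#endif

/* Uses the type to show that exactly one definition is in scope. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

On the speed argument, GCC's preprocessor documentation describes remembering headers that are wrapped entirely in an `#ifndef` guard and not rescanning them, which is one reason the internal form is usually considered good enough now.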
 
pete

David said:
Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately
complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

Obviously.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file,
or to put them at the top of the mylib.c file,
or to create a separate mylib_private.h file?

I declare them at the top of the c file,
and define static functions lower down.

http://www.mindspring.com/~pfilandr/C/lists_and_files/list_lib.c

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.

I have an h file like that.

http://www.mindspring.com/~pfilandr/C/lists_and_files/list_lib.h

Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

Obviously.

But say in the implementation, in mylib.c,
I need to use (for example) malloc.

list_lib.c is like that.

Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

I put those kinds of includes in the c file.

My list_lib.c has
#include <stdlib.h>
#include <string.h>
but my list_lib.h doesn't.

I have a toy library here:
http://www.mindspring.com/~pfilandr/C/library/
In the c files and h files, I list in comments which features from the
included C standard headers are being used.
 
cr88192

Flash Gordon said:
cr88192 wrote, On 27/01/08 23:03:

and the documentation for mylib says it depends on vector.lib, or rather
myvector.lib since it is in this example provided by the same person.


So you like breeding bugs then? After all, if there is a bug in the
routine you have just copied it...

usually, this is a minor problem (rarely has this been a problem in
practice).

Leading to yet more bugs and not being able to even copy the bug fix over
without risk of introducing more bugs...

whatever dude...

it is not like we are a bunch of monkeys bashing on a keyboard.
theoretically we have enough brains to know just what the hell we wrote...

No, splitting it out into a proper separate library is better.

it introduces increased cost in terms of dependency management, which IS a
problem in practice.

of course, a matter of scale is also involved (I am thinking more in the
hundreds of kloc range).

Well, since libbar is code you wrote you just port it. You would have had
to port the code if it was in libfoo anyway.

not often the case as I have seen with 3rd party libraries.

also, note that, when codebases get large, it is a major hassle to drag
around much more than pieces of previous projects.

grab pieces, tack them together, make something that works.

we don't want to drag in 500kloc or 1Mloc of code just to get some simple
app working, it is better just to grab what is needed. this means, keeping
some level of control over dependency issues.

it is also much worse, for example, when a bug or change in one part of the
codebase, due to uncontrolled dependencies, manages to disrupt the operation
of many of the other subsystems (or leaves the pain of having to rewrite a
part of the codebase to break up these dependencies).

This is one of the purposes of documentation.

of course, this only assumes the people have read the documentation.

this is especially a problem with building cross-platform apps and libs (or
trying to get most linux apps to build on windows).

it is best to avoid using anything that may not exist on the other arch.

 
Richard Bos

fnegroni said:
I am not a noob

Then you should learn to quote. If your programming style is similar to
your Usenet style, no wonder you want everything as flat as possible.

Richard
 
cr88192

Flash Gordon said:
fnegroni wrote, On 28/01/08 20:59:


Actually that is the correct place for them because then they only have to
be written once. Generally it is better to let the computer do the
monotonous work than to leave it to the programmer.

yes.

more effort on the part of the programmer, or slightly faster builds?...
whose time is more expensive here?...

This is not an issue with modern compilers.

yes, and it is not the lexer that is bogged down by this either, it is the
preprocessor...

(the lexer is bogged down by more code, yes, but most of this is pruned away
before it is ever even seen).


then again, I remember "back in the day" (tcc, tasm, and dos), where
grinding through all these system headers could be pretty annoyingly slow.

good for me my computer is about 200x faster (and dual core, and with a more
efficient pipeline, ...).

waiting for the preprocessor is not such a big problem anymore (nor are
build times that scarily long...).

Just follow the simple rule.

A rule that many people disagree with for good reasons.
 
fnegroni

Then you should learn to quote.

And you Richard Bos should learn to quote sentences IN FULL to the
full stop mark, instead of extracting the parts that make what you say
look good.

I said I was not a noob of C programming, what has that got to do with
my usenet style? Nothing.

Granted my usenet style is not as great and devoid of mistakes as
yours, but I *will* improve with time.
 
Mark McIntyre

fnegroni said:
And you Richard Bos should learn to quote sentences IN FULL to the
full stop mark,

And you should learn that this is the real world, where people don't
have to quote you in full and your words may be taken out of context.
instead of extracting the parts that make what you say
look good.

Or making you look foolish.
Granted my usenet style is not as great and devoid of mistakes as
yours, but I *will* improve with time.

Sure. Meanwhile, don't get too worried if people seem to be picking on
you. They're not.
 
