Thoughts on file organisation

David Mearsen

Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

Thanks for any input!

DM
 
Willem

David wrote:
) I've recently started programming C after many years using "the other
) language"... I just wanted to find out the common practice for
) organising source files.

What's "the other language" ? Cobol ?

) Specifically, consider a moderately complicated library module, mylib.c.
) Obviously its "public interface" (i.e. non-static function
) declarations, typedefs, any global variables) need to go in mylib.h.
)
) The question is: what about private (i.e. static) functions and struct
) declarations and typedefs only used in the private implementation?
)
) Is it more usual to put these in the mylib.h file, or to put them at the
) top of the mylib.c file, or to create a separate mylib_private.h file?

My personal preference is to put them at the top of the .c file.
*_private.h files are useful when you have several .c files that
form a package/module/library.

Also, if you define static functions before you use them, you don't need
to declare them separately, which removes redundant information. IMO, this
is a good thing, but other opinions may differ.
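
A minimal sketch of that ordering. The names `square` and `mylib_sum_of_squares` are hypothetical and not from the thread; the point is only that a static helper defined above its first use needs no forward declaration.

```c
/* mylib.c (sketch): the helper is defined before its first use,
   so no separate forward declaration is needed. */

/* Private helper: internal linkage, visible only in this file. */
static int square(int x)
{
    return x * x;
}

/* Public function; its prototype would live in mylib.h. */
int mylib_sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

If `square` were instead defined below `mylib_sum_of_squares`, a `static int square(int x);` declaration would be needed near the top of the file.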

) And a similar question for #includes: let's suppose that one of the
) public functions declared in mylib.h takes a FILE* parameter.
) Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
) the FILE structure defined.
)
) But say in the implementation, in mylib.c, I need to use (for example)
) malloc. Then I need to #include<stdlib.h> as well. Should I put the
) #include at the top of mylib.h or at the top of mylib.c?

I'd say, at the top of mylib.c.
There are also people who don't include any system headers in a .h file,
but document that they must be included before mylib.h is included.
I personally think it's bad practice, but I've seen it done.

Bottom line: keep the .h file to a minimum, but make sure it can stand
alone. This is my personal opinion, of course. It might, or might not
be the "industry standard".
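
One way to picture "minimal but able to stand alone" is sketched below. The two files are collapsed into one listing purely for illustration, and `mylib_write_rule` is a hypothetical function, not something from the thread. The header includes <stdio.h> because its prototype mentions FILE; <stdlib.h> appears only in the implementation.

```c
/* ---- mylib.h (sketch) ---- */
#ifndef MYLIB_H
#define MYLIB_H

#include <stdio.h>   /* the prototype below mentions FILE */

/* Writes `count` dashes and a newline to `out`.
   Returns the number of characters written, or -1 on error. */
int mylib_write_rule(FILE *out, int count);

#endif /* MYLIB_H */

/* ---- mylib.c (sketch) ---- */
/* A real two-file layout would start with: #include "mylib.h" */
#include <stdlib.h>  /* malloc/free: implementation detail only */
#include <string.h>  /* memset */

int mylib_write_rule(FILE *out, int count)
{
    char *buf;
    int written;

    if (out == NULL || count < 0)
        return -1;

    buf = malloc((size_t)count + 2);   /* dashes + '\n' + '\0' */
    if (buf == NULL)
        return -1;

    memset(buf, '-', (size_t)count);
    buf[count] = '\n';
    buf[count + 1] = '\0';

    written = (fputs(buf, out) == EOF) ? -1 : count + 1;
    free(buf);
    return written;
}
```

Client code that includes only mylib.h compiles without having to know about <stdio.h>, and it never sees <stdlib.h> at all.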


SaSW, Willem
--
Disclaimer: I am in no way responsible for any of the statements
made in the above text. For all I know I might be
drugged or something..
No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
 
santosh

David said:
Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module,
mylib.c. Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at
the top of the mylib.c file, or to create a separate mylib_private.h
file?

IMHO, the last option is the best one.

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

The latter.
 
Eric Sosman

David said:
Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

"Private parts" don't belong in the public header file,
because they would then cease to be private and would become
part of your published interface. If you can, keep the private
definitions inside the library's .c files. If the library has
several .c files that must share a set of private declarations,
putting them in a mylib_private.h file is about the best you
can do.
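
A rough sketch of the shared-private-header arrangement, with three files collapsed into one listing for illustration. All names here (`mylib_state`, `mylib_priv_note_call`, `mylib_ping`, the file names) are hypothetical.

```c
/* ---- mylib_private.h (sketch): shared by the library's .c files,
        not shipped to clients ---- */
#ifndef MYLIB_PRIVATE_H
#define MYLIB_PRIVATE_H

struct mylib_state {        /* internal layout, hidden from clients */
    int call_count;
};

/* External linkage so other library files can call it,
   but declared only in the private header. */
void mylib_priv_note_call(struct mylib_state *st);

#endif /* MYLIB_PRIVATE_H */

/* ---- mylib_core.c (sketch) ---- */
void mylib_priv_note_call(struct mylib_state *st)
{
    st->call_count++;
}

/* ---- mylib_api.c (sketch) ---- */
static struct mylib_state g_state;   /* file-local, zero-initialized */

/* Public function; only this prototype would appear in mylib.h. */
int mylib_ping(void)
{
    mylib_priv_note_call(&g_state);
    return g_state.call_count;
}
```

Nothing in the language stops a client from declaring `mylib_priv_note_call` itself; the private header only keeps it out of the published interface.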

A somewhat related matter: It is an excellent idea to
#include "mylib.h" in all the library's source files. That
way, the compiler can alert you if the declarations and the
definitions get out of step.

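
A small sketch of why this helps, with header and source shown in one listing for illustration (`mylib_scale` is a hypothetical function). Because the prototype is visible where the function is defined, the compiler can compare the two.

```c
/* ---- mylib.h (sketch) ---- */
#ifndef MYLIB_H
#define MYLIB_H

long mylib_scale(long value, int factor);

#endif /* MYLIB_H */

/* ---- mylib.c (sketch) ---- */
/* A real layout would have: #include "mylib.h" */

/* With the prototype in scope, changing this definition to, say,
   `int mylib_scale(int value, int factor)` is diagnosed as a
   conflicting declaration instead of silently breaking callers. */
long mylib_scale(long value, int factor)
{
    return value * factor;
}
```

Without the #include, the mismatched definition would compile cleanly and the error would surface, if at all, only at run time in client code.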
And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

Opinions differ on this one. I am of the "headers should
#include other headers if they need them" persuasion, but there
is also a "nested #includes are evil" party to which some non-
stupid people belong. Choose your allegiance, and thereafter
do not waver.
 
fnegroni

You will also find that, with regular static linkers, library authors often
end up defining just *one* public function per .c file.
The reason is that linkers on unix, in general, are dumb, in a good way, and
try to keep binary compatibility amongst object formats by linking whole
object files rather than splitting out individual functions.
What this means is that if you have a large library, chances are your
client code will only use a bunch of the functions, and will most
likely appreciate if the static linking only takes place for those
functions that are actually used.
Dynamic linking doesn't impose such restriction, but it is good
practice, in library code development, to place each public function
in separate files anyway to reduce source control and maintenance
headaches.
 
Malcolm McLean

David Mearsen said:
I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Declare everything you want the user to be able to access in "mylib.h". He ought
to be able to get the lot, except maybe FILE *s and the like, by simply
including one file.
Personally I expose most structures (though not in Baby X, my X windows
toolkit) to help debugging or coding round library mistakes, even if
intended to be opaque.
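
For readers unfamiliar with the term, an opaque structure is one whose fields the public header does not reveal. A minimal sketch of that alternative follows, with header and source in one listing and hypothetical names throughout (`mylib_counter` and its functions are not from the thread).

```c
#include <stdlib.h>

/* ---- mylib.h (sketch) ---- */
/* Incomplete type: clients can hold a pointer but cannot see the fields. */
typedef struct mylib_counter mylib_counter;

mylib_counter *mylib_counter_new(void);
int mylib_counter_next(mylib_counter *c);
void mylib_counter_free(mylib_counter *c);

/* ---- mylib.c (sketch) ---- */
struct mylib_counter {     /* full definition stays in the implementation */
    int value;
};

mylib_counter *mylib_counter_new(void)
{
    mylib_counter *c = malloc(sizeof *c);
    if (c != NULL)
        c->value = 0;
    return c;
}

int mylib_counter_next(mylib_counter *c)
{
    return ++c->value;
}

void mylib_counter_free(mylib_counter *c)
{
    free(c);
}
```

Exposing the structure instead means moving the `struct mylib_counter { ... }` definition into mylib.h, which is what makes the fields visible in a debugger session or to client workarounds.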

Unfortunately mylib might well depend on something, for instance the highly
general string functions I suggested in the xmalloc string functions post.
Ideally we would like these to be private to mylib.h to avoid creating
complex dependencies. There might also be functions which are unique to
mylib.h, but not suitable to call directly, and cannot sensibly be declared
static.

The multiple dependency problem is a very real one. A simple cross-product
routine can suck in "vector.h" and "vector.c" which sucks in a load of
matrix algebra, typedefs for floats, a fast sine library, and a special
memory allocator. There's no real answer in standard C. I'd say it is the
number one weakness of the language.
 
Andrey Tarasevich

David said:
I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

If your idea of a "module" is synonymous with a ".c file", then you don't need an
extra file since you can indeed put all internal declarations at the beginning
of the .c file. I don't see the point of creating an extra include file for
this. The rationale for creating an include file is to be able to include it
into several translation units. This obviously doesn't apply here.

However, I personally find that it is more useful not to restrict the concept of a
"module" to a single implementation file. In that case a three-level file
structure might make sense. There's a single "external interface" include file,
like 'mylib_api.h', which declares the external interface of the module. There's
one or more "internal interface" files, like 'mylib1.h', 'mylib2.h' etc, which
declare interfaces between various implementation files within the module.

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

If you don't need any declarations from <stdlib.h> in the .h file, then put it
in .c file. However, obviously, keeping it formally "clean" within this approach
might require a considerable maintenance effort. Let's say eventually you'll
need something declared in <stdlib.h> in your .h file. Then you'll have to add
the declaration to your .h and remove it from all of your .c files. And vice
versa. At least with standard headers, it might prove to be more efficient to
always include them into .h files.

Also, sometimes certain compiler features (like pre-compiled header support)
might dictate their own style of header inclusion.
 
Pierre Asselin

David Mearsen said:
Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.
The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?
Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

In mylib.c, except that if you have shared declarations among
mylib_1.c, mylib_2.c etc. those need to go in a mylib_private.h .

And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

That's my preference. If I #include "mylib.h" I like my code to still
compile. Some programmers prefer to make you include stdio.h yourself,
and document that fact.

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

In mylib.c, because it is not needed to compile client code.
 
cr88192

fnegroni said:
You will also find that in regular linkers, you will find yourself
declaring *one* public function per .c file.
Reason being linkers on unix, in general, are dumb, in a good way, and
try and keep binary compatibility amongst object formats by not
splitting functions themselves.
What this means is that if you have a large library, chances are your
client code will only use a bunch of the functions, and will most
likely appreciate if the static linking only takes place for those
functions that are actually used.
Dynamic linking doesn't impose such restriction, but it is good
practice, in library code development, to place each public function
in separate files anyway to reduce source control and maintenance
headaches.

that is, of course, assuming that the goal is having a small binary, and
that we will not "go introspective" within this program (for example,
ripping out and directly working with the symbol table, as in a
VM's FFI, or dynamically retrieving functions via dlsym, ...).

the alternative, is to make it so that pretty much all files have token
dependencies on them, or, alternatively, to link the whole lib into a single
large object (the '.lo' approach).

so, it depends on goals, and a several MB or more binary may well be
acceptable.

another factor is the nature of the library:
for a lib where pretty much everything is independent (such as the C
runtime, ...) splitting into many files makes sense;
for a lib where pretty much everything is interconnected (such as a
rigid-body physics engine, rendering engine, or a compiler), then splitting
each function into a separate file is frivolous.


my usual practice:
often I don't care that much about exact binary size, so, I use however many
functions is convenient, and split up files "how they make sense" (usually
this means anywhere from 500 loc to 2 kloc per source file).
 
cr88192

David Mearsen said:
Hi,

I've recently started programming C after many years using "the other
language"... I just wanted to find out the common practice for
organising source files.

there are many "other languages"...

a simple heuristic ranking would look something like this:
1. C++
2. Java
3. VB
4. C#
5. Python
6. Perl
....

but, one has no way of knowing which.

also note that it is a common misconception (especially among moderate-use
languages) for the users to somehow think that their language rules the
world (I have seen this attitude especially bad in VB coders).

for high-use languages, such as C, C++, and Java, this view is acceptable,
but after this the drop is sharp (albeit, each needs to acknowledge the
others, and not simply assume that "everybody" uses their language).

for low-use languages (esp in the Scheme, Common Lisp, OCaml, ...
communities), since the illusion of world dominance is unmaintainable, the
more common attitude becomes that of superiority and elitism (because I use
this lang, I am so superior to the ignorant masses who know nothing about
this language, ...).


so, one needs be careful with assumptions.

Specifically, consider a moderately complicated library module, mylib.c.
Obviously its "public interface" (i.e. non-static function
declarations, typedefs, any global variables) need to go in mylib.h.

also, be careful with terms like "obviously" as well.
if one is wrong, then it can make them look arrogant and/or stupid...

now, not all non-static functions are part of the "public interface" either
(this is especially true once the complexity of a library moves much past
"trivial").


as for global variables:
I will personally somewhat recommend against basing any public API on global
variables;
IMO, it is a much better idea to make use of getter and setter functions.

this is especially true if one may need to, for example, attach logic to
some particular variables, or reorganize the library's internals.
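
A minimal sketch of the getter/setter approach, assuming a hypothetical `verbosity` setting that the library wants to keep in the range 0..3 (none of these names come from the thread):

```c
/* ---- mylib.c (sketch) ---- */
static int verbosity = 1;      /* private: not visible outside this file */

int mylib_get_verbosity(void)
{
    return verbosity;
}

/* The setter can attach logic, here clamping to 0..3, which a bare
   global variable could not enforce. */
void mylib_set_verbosity(int level)
{
    if (level < 0)
        level = 0;
    else if (level > 3)
        level = 3;
    verbosity = level;
}
```

Only the two prototypes would go in the public header, so the storage can later move into a struct or change type without touching client code.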

in a few cases, I have seen examples of where the lib had originally used a
global as part of its API, but later replaced it with some ugly macro
wrapping a function call:

#define foo (*(int *)(mylib_getFooPtr()))

though, maybe acceptable, it is not very nice either...

in fact, this is a common practice in implementing 'errno' as well.
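
A sketch of how such a macro can keep old client code compiling. `mylib_getFooPtr` is the name used in the post; its body and the `foo_storage` variable are assumptions for illustration. The cast from the post is dropped here because the function is declared to return `int *` directly.

```c
/* ---- mylib.h (sketch) ---- */
int *mylib_getFooPtr(void);

/* Client code that reads or assigns `foo` keeps compiling:
   the macro expands to an lvalue reached through a function call. */
#define foo (*mylib_getFooPtr())

/* ---- mylib.c (sketch) ---- */
int *mylib_getFooPtr(void)
{
    static int foo_storage = 0;   /* could equally be per-thread storage */
    return &foo_storage;
}
```

The cost is that `foo` is now a macro, so a client can no longer use that identifier for a local variable or a struct member without surprises.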

The question is: what about private (i.e. static) functions and struct
declarations and typedefs only used in the private implementation?

Is it more usual to put these in the mylib.h file, or to put them at the
top of the mylib.c file, or to create a separate mylib_private.h file?

this is a personal preference, but in my case, usually any structs or
typedefs go into the headers.

whether I maintain separate public and private headers (or, use the same
headers and an ifdef to handle private contents), depends on the specifics
of what I am writing.

example:
#include <mylib.h> //includes header, and gets public declarations

but:
#define MYLIB_INTERNAL
#include <mylib.h> //includes header, and gets private declarations as well

in cases where I distinguish them, personally I usually use an '_i' suffix.

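
A sketch of what the single-header variant might look like inside. `MYLIB_INTERNAL` is the macro from the post; the function names are hypothetical, and the header is shown inline rather than through #include so the listing compiles as one file.

```c
/* ---- mylib.c (sketch): the implementation asks for the private part ---- */
#define MYLIB_INTERNAL

/* ---- mylib.h (sketch), shown inline instead of via #include ---- */
#ifndef MYLIB_H
#define MYLIB_H

/* Public section: every includer sees this. */
int mylib_double(int x);

#ifdef MYLIB_INTERNAL
/* Private section: seen only by files that define MYLIB_INTERNAL first. */
int mylib_priv_twice(int x);
#endif

#endif /* MYLIB_H */

/* ---- rest of mylib.c (sketch) ---- */
int mylib_priv_twice(int x)
{
    return x + x;
}

int mylib_double(int x)
{
    return mylib_priv_twice(x);
}
```

One thing to watch with this layout: the macro has to be defined before the first inclusion of the header in a translation unit, because the include guard turns any later inclusion into a no-op.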
And a similar question for #includes: let's suppose that one of the
public functions declared in mylib.h takes a FILE* parameter.
Obviously, I'll need to #include<stdio.h> at the top of mylib.h to get
the FILE structure defined.

be careful with 'obviously'...

it may well be the case that one can require any such headers to be included
prior to the API header...

But say in the implementation, in mylib.c, I need to use (for example)
malloc. Then I need to #include<stdlib.h> as well. Should I put the
#include at the top of mylib.h or at the top of mylib.c?

arguments can be made in both cases, but, generally:
if the headers the library needs internally are not needed by the library's
client, it may not be a good idea to include them from within the public
header (it will slow compilation, ... for no real gain).


however, it may also provide a means to centrally control allowable
dependencies (for example, when the compiler is set to treat missing
prototypes as an error condition).

in this way (disallowing including other headers from within source files),
we can be certain if and where there is any accidental deviation from our
allowed list of dependencies (IMO, as a project scales much, controlling
allowed interdependencies can become an important factor, and it is much
easier to create dependencies, than to eliminate them).


as such, it may make sense to include system headers, for the internal
headers/sections.

#ifdef MYLIB_INTERNAL
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <stdint.h>
....
#endif
 
santosh

cr88192 said:
there are many "other languages"...

a simple heuristic ranking would look something like this:
1. C++
2. Java
3. VB
4. C#
5. Python
6. Perl
...

I think it's fairly clear that he means C++, which is often jokingly
called "That Other Language" by some C programmers.

<snip>
 
Keith Thompson

cr88192 said:
also, be careful with terms like "obviously" as well.
if one is wrong, then it can make them look arrogant and/or stupid...

now, not all non-static functions are part of the "public interface"
either (this is especially true once the complexity of a library moves
much past "trivial").

If a function isn't part of the public interface, why would you not
declare it as static?

[...]
 
Ian Collins

Keith said:
If a function isn't part of the public interface, why would you not
declare it as static?

It might be called from a different compilation unit within the library.
 
santosh

Keith said:
If a function isn't part of the public interface, why would you not
declare it as static?

Presumably it's used by several source files within the same project,
but is not meant to be used by code from outside the project.
 
Malcolm McLean

santosh said:
Presumably it's used by several source files within the same project,
but is not meant to be used by code from outside the project.

The other nuisance is when it would make perfect sense for it to be used by
files outside the project, but isn't part of the core functionality.
For instance a 3D geometry library in a protein manipulation program. You're
going to call functions like crossproduct() and so on. However the purpose
of the package is to handle proteins, not impose particular 3d geometry
routines on everyone.
 
santosh

Malcolm said:
The other nuisance is when it would make perfect sense for it to be
used by files outside the project, but isn't part of the core
functionality. For instance a 3D geometry library in a protein
manipulation program. You're going to call functions like
crossproduct() and so on. However the purpose of the package is to
handle proteins, not impose particular 3d geometry routines on
everyone.

Well then, the obvious solution would be separate it into another
library. Sometimes this might be worth the extra effort and complexity,
sometimes not.

However we are drifting away from C to programming in general.
 
Malcolm McLean

santosh said:
Well then, the obvious solution would be separate it into another
library. Sometimes this might be worth the extra effort and complexity,
sometimes not.

However we are drifting away from C to programming in general.

Then mylib becomes dependent on vector.lib.
 
cr88192

Keith Thompson said:
If a function isn't part of the public interface, why would you not
declare it as static?

for a trivial library, maybe.

but for probably "most" of us, I would think, our libraries are anywhere
from 10 to 50 kloc, and may involve some number of interdependent source
files.

as a result, different parts of the library will depend on each other, and
thus need to be able to see each others' functions, however, the client for
this library, may have no good reason to see any of these functions (instead
being given a specific public API).


as a result, many non-public functions can't be static either...

as such, 'static' usually ends up being restricted to very narrowly defined
and context-specific functions, that we can be pretty sure will not need to
be called from another source file...

[...]

 
cr88192

Malcolm McLean said:
Then mylib becomes dependent on vector.lib.

yes.

this kind of thing is annoyingly common, and will often end up necessitating
duplicating many smaller pieces of code from one place to another.

as a result, we end up with slight variations of cross-product spread
throughout the project...
still, some minor duplication is often somewhat preferable to a dependency.

a duplication, is a matter of a small amount of bloat.


a dependency, OTOH, ends up imposing on the client as well (oh crap, can no
longer use libfoo, because it uses a few trivial string functions from
libbar, and a port of libbar does not exist for the current architecture,
....).

or, at least, it is annoying to want to use one thing, and have to
"discover" and link in bunches of other things (oh, now WTF lib do I need
for 'function_with_too_damn_many_words_but_no_library_name'...).

 
Bartc

I've recently started programming C after many years using "the other
language"

Just out of interest, why did you move to C from that 'other language' (I
guess C++ or something of that ilk)?
 
