Thoughts on file organisation

Malcolm McLean · Jan 29, 2008

cr88192 said:
waiting for the preprocessor is not such a big problem anymore (nor are
build times that scarily long...).

I don't even use makefiles any more. Just gcc *.c -lm.

Al Balmer · Jan 29, 2008

The given extract does not recommend against using header guards, nor
does it actually recommend not including headers within headers. It
does give one example of not doing so, but no rationale. Perhaps there
are reasons not to change an existing libc.h.

IAC, this is sure to *add* to maintenance headaches, not reduce them,
and the only flexibility it adds is the opportunity to scratch your
head about which headers are missing. Sorry, I don't see the benefit.


That's the opinion of Al Balmer.

Let's see what Rob Pike's opinion is, shall we?

Who is Rob Pike? http://herpolhode.com/rob/

I'm well aware of who Rob Pike is

The notes you reference were written in disagreement with practices
espoused by Brian Kernighan and
P. J. Plauger. (I suppose you know who they are?)

While I respect Mr. Pike and the contributions he's made, IMO he's all
wet on this issue. He apparently expects that the programmer read
every header before using it, or at least inspects it for comments
specifying its prerequisites. The only benefit he offers is saving
work for the compiler. Perhaps compilers have gotten better since
1989, but I don't find that a compelling reason to make such a mess.

Richard Tobin · Jan 29, 2008

Al Balmer said:
While I respect Mr. Pike and the contributions he's made, IMO he's all
wet on this issue. He apparently expects that the programmer read
every header before using it, or at least inspects it for comments
specifying its prerequisites.

While I don't use this style myself, it makes slightly more sense if
headers correspond directly to libraries. On most systems you're
going to have to find out what the header/library uses anyway, so that
you can specify it to the linker.

-- Richard

fnegroni · Jan 29, 2008

On Mon, 28 Jan 2008 12:59:17 -0800 (PST), fnegroni
The notes you reference were written in disagreement with practices
espoused by Brian Kernighan and
P. J. Plauger. (I suppose you know who they are?)

So why is it that a standard header never includes another standard
header?

Why should mylib.h, a third party file which I should not have to
modify, include <stdio.h> just because it uses FILE ?
Surely I can supply my own header which defines a compatible but
otherwise different implementation of FILE * that satisfies the
requirements of mylib (and mylib.h). Why should mylib.h override my
definition of FILE ?

Al Balmer · Jan 29, 2008

Why should mylib.h, a third party file which I should not have to
modify, include <stdio.h> just because it uses FILE ?
Surely I can supply my own header which defines a compatible but
otherwise different implementation of FILE * that satisfies the
requirements of mylib (and mylib.h).

If you worked here, I would give you a very practical reason. Your
code would be rejected at review, and you would be told to fix it and
not do something like that again.

Malcolm McLean · Jan 29, 2008

fnegroni said:
So why is it that a standard header never includes another standard
header?

Why should mylib.h, a third party file which I should not have to
modify, include <stdio.h> just because it uses FILE ?
Surely I can supply my own header which defines a compatible but
otherwise different implementation of FILE * that satisfies the
requirements of mylib (and mylib.h). Why should mylib.h override my
definition of FILE ?

Version one of mylib.h does no IO. Someone decides to provide load / saves
for version 2, and the functions take FILE * parameters. So your code could
easily break.
If you start defining your own FILE type I suggest we are very rapidly going
into the woods. However the whole dependency problem is very difficult.

We expect savecomplex() and loadcomplex() to be in complex.h. On the other
hand, we expect the complex routines to be independent of the IO we use.
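
A minimal sketch of the version 2 scenario, assuming the header takes the include-what-you-use route and pulls in <stdio.h> itself. The header and source file are simulated inline in one translation unit so the sketch compiles on its own. savecomplex() and loadcomplex() are the names used above; their signatures, struct complex_num, and the COMPLEX_H guard are assumptions made for illustration.

```c
/* Simulated "complex.h", version 2. It now needs FILE, so it includes
 * <stdio.h> itself; client files that only wrote #include "complex.h"
 * keep compiling without being edited. */
#ifndef COMPLEX_H
#define COMPLEX_H

#include <stdio.h>   /* for FILE */

/* Hypothetical type, not from the thread. */
struct complex_num { double re; double im; };

/* Hypothetical signatures: return 0 on success, -1 on failure. */
int savecomplex(const struct complex_num *z, FILE *fp);
int loadcomplex(struct complex_num *z, FILE *fp);

#endif /* COMPLEX_H */

/* Simulated "complex.c". */
int savecomplex(const struct complex_num *z, FILE *fp)
{
    /* %.17g prints enough digits to round-trip a double. */
    return fprintf(fp, "%.17g %.17g\n", z->re, z->im) < 0 ? -1 : 0;
}

int loadcomplex(struct complex_num *z, FILE *fp)
{
    return fscanf(fp, "%lf %lf", &z->re, &z->im) == 2 ? 0 : -1;
}
```

Under the other convention, every file that includes the header would have to add #include <stdio.h> ahead of it when version 2 arrives. The sketch does not resolve the second concern above: the complex routines are now tied to stdio.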

Flash Gordon · Jan 29, 2008

cr88192 wrote, On 29/01/08 01:14:

usually, this is a minor problem (rarely has this been a problem in
practice).

I've seen this happen. The larger the application and the longer it is
under maintenance the more likely it is that the person who finds the
bug will not know the code has been duplicated.

whatever dude...

it is not like we are a bunch of monkeys bashing on a keyboard.
theoretically we have enough brains to know just what the hell we wrote...

<shrug> I have had to go back to code years after the last time anyone
looked at it. There is also a good chance that I was not the original
author, or if I was the author there is a good chance someone else will
have to look at it (I wrote code in the 80s that I heard in the late 90s
was planned to be in use for more than another decade, and I've since
left the company).

it introduces increased cost in terms of dependency management, which IS
a problem in practice.

I use tools and documentation to manage dependencies and don't find it a
problem.

of course, a matter of scale is also involved (I am thinking more in the
hundreds of kloc range).

Well, I've not had to work on projects that size, so perhaps the balance
changes.

not often the case as I have seen with 3rd party libraries.

Well, since this was talking about whether to split the code out in to a
separate library or not I don't believe people were talking about 3rd
party libraries.

also, note that, when codebases get large, it is a major hassle to drag
around much more than pieces of previous projects.

grab pieces, tack them together, make something that works.

I would have said that splitting it apart in to libraries makes that easier.

we don't want to drag in 500kloc or 1Mloc of code just to get some
simple app working, it is better just to grab what is needed. this
means, keeping some level of control over dependency issues.

If the library is only 10kloc and depends on another library which is
5kloc then that is only 15kloc you have to consider. You might still
decide to extract the code from the original library if you only want a
small subset of it, and I don't see how having it as a library would
make it any harder than finding all the correct parts from amongst 10kloc.

it is also much worse, for example, when a bug or change in one part of
the codebase, due to uncontrolled dependencies, manages to disrupt the
operation of many of the other subsystems (or leaves the pain of having
to rewrite a part of the codebase to break up these dependencies).

Or use tools to identify what parts depend on it.

of course, this only assumes the people have read the documentation.

Ah, so you assume that if your 100kloc is not split in to libraries then
people will remember all of it (you say above "theoretically we have
enough brains to know just what the hell we wrote..."), but if it is
they won't remember it and won't read the documentation.

this is especially a problem with building cross-platform apps and libs
(or trying to get most linux apps to build on windows).

it is best to avoid using anything that may not exist on the other arch.

Since the earlier talk was about whether to split a library in to
separate libraries this is a bogus argument. If you have not split it
then you have to port one library with all the code to each port of
interest, if you have then you have to port the same amount of code
except it is in two libraries. The only difference is that for another
reason you might already have ported some of it.

Richard Tobin · Jan 29, 2008

fnegroni said:
Surely I can supply my own header which defines a compatible but
otherwise different implementation of FILE * that satisfies the
requirements of mylib (and mylib.h).

Not in standard C you can't.

Why should mylib.h override my definition of FILE ?

If you're not concerned about sticking to the letter of standard C,
you can almost certainly arrange for mylib.h's #include <stdio.h>
to get your version.

-- Richard

fnegroni · Jan 29, 2008

If you worked here, I would give you a very practical reason. Your
code would be rejected at review, and you would be told to fix it and
not do something like that again.

Sure.
Next?

Ian Collins · Jan 29, 2008

Malcolm said:
Version one of mylib.h does no IO. Someone decides to provide load /
saves for version 2, and the functions take FILE * parameters. So your
code could easily break.

Indeed, imagine the arse ache if mylib.h is included in scores or even
hundreds of files, all of which have to be updated...

Al Balmer · Jan 29, 2008

Sure.
Next?

There wouldn't be a next ;-)

John Bode · Jan 29, 2008

[snippage]

There's a little dance involving #ifdef's that can prevent a
file being read twice, but it's usually done wrong in practice - the
#ifdef's are in the file itself, not the file that includes it. The
result is often thousands of needless lines of code passing through
the lexical analyzer, which is (in good compilers) the most expensive
phase.

I will *happily* trade longer build times for less work. *Happily*.
First of all, in practice, whatever time this adds to a build is not
that big a deal. IME, it's the link phase that's the most time-
consuming aspect of a build, not lexical analysis. Secondly, making
the header files idempotent eliminates stupid errors like not
including files in the right order (or forgetting to include a file
altogether).
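
A minimal sketch of what idempotent means here, with the guard inside the header itself. The body of a hypothetical "point.h" is pasted twice into one file to mimic the header being reached through two different include paths; POINT_H, struct point, and point_sum are illustrative names, not from this thread.

```c
/* --- first inclusion of the simulated "point.h" --- */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif /* POINT_H */

/* --- second inclusion: POINT_H is already defined, so the body is
 * skipped and struct point is not redefined --- */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif /* POINT_H */

/* Uses the single definition that survived preprocessing. */
int point_sum(struct point p)
{
    return p.x + p.y;
}
```

Without the guard, the second copy would be a redefinition of struct point and the file would not compile. The preprocessor still has to read the second copy to find the matching #endif, which is the cost the quoted passage objects to.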

I won't deny this was an issue in 1989 (I worked on one Encore mini where
it took on the order of 6 hours to rebuild the world, although that
was Ada, not C), but I like to think we've made *some* progress over
the last couple of decades.

Richard Tobin · Jan 29, 2008

fnegroni said:
The
result is often thousands of needless lines of code passing through
the lexical analyzer, which is (in good compilers) the most expensive
phase.

Er no. This might have been true 25 years ago, but it isn't now.
To achieve good code on modern processors requires a lot of work.

Please show me this "good compiler" where lexical analysis is a
significant proportion of the compile time. (Hint: if it's true,
the compiler will be blindingly fast.)

-- Richard

fnegroni · Jan 29, 2008

Indeed, imagine the arse ache if mylib.h is included in scores or even
hundreds of files, all of which have to be updated...

And you would do that manually? Ever heard of sed?
This is a recurring issue even with header files that include other
header files, where your code compiled even if you forgot to include a
file yourself, cause it was included by some other include, and no
warning was flagged up, and then that include file changed, no longer
including the include file you thought you did.
Anyway, it is a matter of opinion, since the standard does not impose
a rule on us, and it is part of what people call "coding practice".
Which is just a custom.
Considering how many wars have been started debating the merits/
demerits of curly bracket positioning and tab alignment or spaces, I
think C programmers tend to believe there is only one way of doing
things, and never move on.
If that was the case, many programs we all know (e.g. linux) would not
exist since they explicitly use some extension of the gcc compiler to
achieve their goals.

I am not posting here to tell *you* what to do (unlike some of you
telling *me* what to do. Heck, if you still are doing code reviews, it
is time to move on in your career).

I am just saying there are people who beg to differ in their opinion,
and those opinions have their merits.
You disagree with me, fine. Just don't tell me to shut up or get a job
somewhere else, because that just sounds to me like you just need a
fix.

fnegroni · Jan 29, 2008

I won't deny this was an issue in 1989 (I worked on one Encore mini where
it took on the order of 6 hours to rebuild the world, although that
was Ada, not C), but I like to think we've made *some* progress over
the last couple of decades.

Building our product on a mere quad core xeon with 4GB ram only takes
a mere 2 hours.
But then we compile our code on 26 platforms, some are even slower...

fnegroni · Jan 29, 2008

Building our product on a mere quad core xeon with 4GB ram only takes
a mere 2 hours.
But then we compile our code on 26 platforms, some are even slower...

BTW, 2 hours if using incremental linking and ccache.

Ian Collins · Jan 29, 2008

*Please* don't quote signatures.

And you would do that manually? Ever heard of sed?

It's still a pain in the arse compared with updating one file.

I am just saying there are people who beg to differ in their opinion,
and those opinions have their merits.

People tend to disagree with you because your arguments are based on
decades old issues which have little relevance to contemporary tools and
machines. For example, with many compilers, including all of your
headers in one "umbrella" header will speed up compilation significantly
through the use of precompiled headers.

You disagree with me, fine. Just don't tell me to shut up or get a job
somewhere else, because that just sounds to me like you just need a
fix.

I didn't, I don't care one way or the other, if a client works one way
I'll do as they ask. But if you are going to argue based on anything
other than personal opinion, at least argue valid points.

David Tiktin · Jan 29, 2008

So why is it that a standard header never includes another
standard header?

I don't agree with the no-including-headers-in-headers view, but
this is a good question which I don't see has been answered. It has
always seemed odd to me that, for instance, NULL is defined in more
than one standard header file. Why wasn't it put in, say, stddef.h
with stddef.h then #included in stdio.h, stdlib.h, etc.? (It seems
to violate the DRY principle, which is something *I* never do ;-)

I'll bet this is covered in P.J. Plauger's Standard C Library, but
my copy is at home. Is it just a historical artifact? Anyone care
to explain?

Dave

Richard Tobin · Jan 30, 2008

fnegroni said:
Building our product on a mere quad core xeon with 4GB ram only takes
a mere 2 hours.

And most of this is lexical analysis, right?

-- Richard

Flash Gordon · Jan 30, 2008

David Tiktin wrote, On 29/01/08 23:58:

I don't agree with the no-including-headers-in-headers view, but
this is a good question which I don't see has been answered. It has
always seemed odd to me that, for instance, NULL is defined in more
than one standard header file. Why wasn't it put in, say, stddef.h
with stddef.h then #included in stdio.h, stdlib.h, etc.? (It seems
to violate the DRY principle, which is something *I* never do ;-)

I'll bet this is covered in P.J. Plauger's Standard C Library, but
my copy is at home. Is it just a historical artifact? Anyone care
to explain?

Where I have looked at system headers they include other (often private)
headers. The standard does not say that a system header cannot include
another header, it says that it must behave as if it does not include
any of the other headers defined by the standard.
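
A small sketch of the observable behaviour, as a compile-time check. It relies only on two guarantees from the standard: NULL is defined by <stdlib.h> (as well as by <stddef.h>, <stdio.h>, <string.h> and others), and standard headers may be included in any order and more than once. The helper is_null is a hypothetical name used only to give the file something to compile.

```c
/* Only <stdlib.h> so far: NULL must already be available without
 * <stddef.h> or <stdio.h>. */
#include <stdlib.h>

#ifndef NULL
#error "NULL should be defined by <stdlib.h> alone"
#endif

/* Repeated and reordered inclusion is harmless for standard headers
 * (the one exception is <assert.h>, whose effect depends on NDEBUG). */
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if p is a null pointer, else 0. */
int is_null(const void *p)
{
    return p == NULL;
}
```

How an implementation achieves this (private internal headers, repeated guarded definitions, or something else) is up to the implementation.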
