"We don't need no steenking headers!"

Discussion in 'C++' started by Luke Meyers, May 28, 2006.

  1. Luke Meyers

    Luke Meyers Guest

    So, just a little while ago I had this flash of insight. It occurred
    to me that, while of course in general there are very good reasons for
    the conventional two-file header/implementation separation for each C++
    class, there are cases in which this paradigm contributes nothing and
    simply introduces extra boilerplate overhead.

    The particular case I have in mind is CppUnit tests. Each test header
    is only ever included by the corresponding implementation file, never
    by anything else. The implementation file registers itself with the
    test suite, and that's all there is to it. So, I can't think of any
    reason at all for there to be two files.

    As soon as I thought of this, I asked myself two questions. First, am
    I missing anything? Is there some negative consequence that hasn't
    occurred to me? Second, assuming this isn't a phenomenon unique to
    unit test cases, that there is some general category of classes for
    which the same reasoning applies, what are the properties which
    determine membership in that category?

    I'm interested in hearing others' thoughts on this. My initial
    estimation is that such classes will basically be leaf classes with no
    additional public interface. Furthermore, there are restrictions on
    the circumstances under which one can construct such a class, since no
    code outside Foo.cpp will ever even *see* the symbol Foo. The whole
    thing could be in an anonymous namespace, even! There is a loophole,
    though -- generic code. For example, CppUnit's AutoRegisterSuite is a
    template which takes the test class as its type parameter, and
    instantiates it. CRTP-based designs would function similarly.
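
    A minimal sketch of the single-file layout described above. TestRegistry
    and AutoRegister are hypothetical stand-ins written for this example, not
    CppUnit's actual classes; they only illustrate how a template can
    instantiate a class that no other file can name.

```cpp
// FooTest.cpp: everything lives in one file, there is no FooTest.h.
// TestRegistry and AutoRegister are hypothetical, not CppUnit's API.
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// In a real project the framework would supply this registry; it is
// inlined here only to keep the sketch self-contained.
class TestRegistry {
public:
    static TestRegistry& instance() {
        static TestRegistry registry;  // constructed on first use
        return registry;
    }
    void add(std::string name, std::function<bool()> run) {
        tests_.emplace_back(std::move(name), std::move(run));
    }
    // Runs every registered test and returns how many failed.
    int runAll() const {
        int failures = 0;
        for (const auto& test : tests_) {
            if (!test.second()) ++failures;
        }
        return failures;
    }
    std::size_t size() const { return tests_.size(); }

private:
    std::vector<std::pair<std::string, std::function<bool()>>> tests_;
};

// Generic code is the "loophole": the template constructs a TestClass
// even though that type is private to the file that instantiates it.
template <typename TestClass>
struct AutoRegister {
    explicit AutoRegister(const std::string& name) {
        TestRegistry::instance().add(name, [] { return TestClass().run(); });
    }
};

namespace {  // nothing outside this file can name FooTest

class FooTest {
public:
    bool run() const { return 2 + 2 == 4; }
};

// Static object whose constructor registers the test before main() runs.
AutoRegister<FooTest> registerFooTest("FooTest");

}  // namespace
```

    A separate TestMain.cpp would then only need to call
    TestRegistry::instance().runAll(), without ever including a test header.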

    So... I don't have any dazzling conclusions. I think having half as
    many source files to juggle seems like a worthwhile thing in terms of
    comprehensibility/maintenance, even if it only takes place in
    restricted domains. I don't really see wanting to change my design to
    accommodate such a nicety, but it's something I'll be mulling over.

    Anyway, as I said... thoughts?

    Luke
    Luke Meyers, May 28, 2006
    #1

  2. Luke Meyers

    Ian Collins Guest

    Luke Meyers wrote:
    > So, just a little while ago I had this flash of insight. It occurred
    > to me that, while of course in general there are very good reasons for
    > the conventional two-file header/implementation separation for each C++
    > class, there are cases in which this paradigm contributes nothing and
    > simply introduces extra boilerplate overhead.
    >
    > The particular case I have in mind is CppUnit tests. Each test header
    > is only ever included by the corresponding implementation file, never
    > by anything else. The implementation file registers itself with the
    > test suite, and that's all there is to it. So, I can't think of any
    > reason at all for there to be two files.
    >

    That depends on where you build your test suites or runners. My CppUnit
    headers get included in at least two source files.

    > As soon as I thought of this, I asked myself two questions. First, am
    > I missing anything? Is there some negative consequence that hasn't
    > occurred to me? Second, assuming this isn't a phenomenon unique to
    > unit test cases, that there is some general category of classes for
    > which the same reasoning applies, what are the properties which
    > determine membership in that category?
    >

    Implementation classes when one is using the PIMPL idiom?

    > I'm interested in hearing others' thoughts on this. My initial
    > estimation is that such classes will basically be leaf classes with no
    > additional public interface. Furthermore, there are restrictions on
    > the circumstances under which one can construct such a class, since no
    > code outside Foo.cpp will ever even *see* the symbol Foo. The whole
    > thing could be in an anonymous namespace, even! There is a loophole,
    > though -- generic code. For example, CppUnit's AutoRegisterSuite is a
    > template which takes the test class as its type parameter, and
    > instantiates it. CRTP-based designs would function similarly.
    >

    Biggest problem I can see is that you are buggered if one of these
    classes stops being a leaf. Also if you are using CppUnit, you may want
    to include the private header in the test file.

    > So... I don't have any dazzling conclusions. I think having half as
    > many source files to juggle seems like a worthwhile thing in terms of
    > comprehensibility/maintenance, even if it only takes place in
    > restricted domains. I don't really see wanting to change my design to
    > accommodate such a nicety, but it's something I'll be mulling over.
    >

    Well I guess you could put all the code in the headers (like some
    compilers require you to do with templates) and include them all in one
    source file for compilation. Not pretty!


    --
    Ian Collins.
    Ian Collins, May 28, 2006
    #2

  3. Luke Meyers

    Luke Meyers Guest

    Ian Collins wrote:
    > Luke Meyers wrote:
    > > The particular case I have in mind is CppUnit tests. Each test header
    > > is only ever included by the corresponding implementation file, never
    > > by anything else. The implementation file registers itself with the
    > > test suite, and that's all there is to it. So, I can't think of any
    > > reason at all for there to be two files.
    > >

    > That depends on where you build your test suites or runners. My CppUnit
    > headers get included in at least two source files.


    Hmm, really? I'm curious what that compilation structure looks like,
    and whether you feel that it's advantageous. I have a single
    TestMain.cpp per suite which builds the runner and uses test registry
    magic (read: hidden globals) to discover all the tests which have
    registered themselves privately within their own .cpp files. So, no
    need for anybody to include FooTest.h.

    > Implementation classes when one is using the PIMPL idiom?


    Hmm, certainly a well-known example of a class with no header. I think
    this is a different sort of animal, as are for example little functor
    structs and other helpers. They're clearly subordinate, little
    different from inner classes. The dominant class in such cases still
    generally has its own header.
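
    For reference, a minimal sketch of the PIMPL arrangement being discussed,
    collapsed into one listing. Widget and its members are hypothetical names
    chosen for illustration; the comments mark where the header/source split
    would normally fall.

```cpp
#include <memory>
#include <string>
#include <utility>

// ---- Widget.h: the only part clients would see ----
class Widget {
public:
    explicit Widget(std::string name);
    ~Widget();  // defined below, where Impl is a complete type
    std::string describe() const;

private:
    class Impl;  // declared only; no header ever defines it
    std::unique_ptr<Impl> impl_;
};

// ---- Widget.cpp: Impl is defined here and nowhere else ----
class Widget::Impl {
public:
    explicit Impl(std::string name) : name_(std::move(name)) {}
    std::string describe() const { return "Widget(" + name_ + ")"; }

private:
    std::string name_;
};

Widget::Widget(std::string name) : impl_(new Impl(std::move(name))) {}
Widget::~Widget() = default;
std::string Widget::describe() const { return impl_->describe(); }
```

    The subordinate Impl class has no header of its own, while the dominant
    Widget class still does, which matches the distinction drawn above.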

    > Biggest problem I can see is that you are buggered if one of these
    > classes stops being a leaf.


    I think that's a pretty mild buggering -- all you'd wind up doing is
    splitting one file into two, no real headache involved. Leaf classes
    which become base classes pretty much always require modification,
    anyway -- introduction of protected members, virtual member functions,
    and such. Otherwise why inherit?

    > Also if you are using CppUnit, you may want
    > to include the private header in the test file.


    Could you explain what you mean?

    > Well I guess you could put all the code in the headers (like some
    > compilers require you to do with templates) and include them all in one
    > source file for compilation. Not pretty!


    Yes, but this carries some pretty well-known drawbacks -- namely,
    having to recompile downstream classes every time your implementation
    changes, as opposed to only when the interface changes. Also, people
    and tools generally expect not to compile header files directly, so
    that could lead to headaches. Anyway, this doesn't really have much in
    common with the .cpp-only approach I described.

    Oh, and there are some other solutions to the template instantiation
    problem, by the way.

    I recall a while ago someone posted on here a link to an essay about
    "Java-style classes in C++" which proposed yet another unconventional
    compilation structure. I don't recall being particularly convinced,
    but it was food for thought. Anyone else got any interesting (ab?)uses
    of the C++ compilation model?

    Luke
    Luke Meyers, May 28, 2006
    #3
  4. Luke Meyers

    Ian Collins Guest

    Luke Meyers wrote:
    > Ian Collins wrote:
    >
    >>Luke Meyers wrote:
    >>
    >>>The particular case I have in mind is CppUnit tests. Each test header
    >>>is only ever included by the corresponding implementation file, never
    >>>by anything else. The implementation file registers itself with the
    >>>test suite, and that's all there is to it. So, I can't think of any
    >>>reason at all for there to be two files.
    >>>

    >>
    >>That depends on where you build your test suites or runners. My CppUnit
    >>headers get included in at least two source files.

    >
    >
    > Hmm, really? I'm curious what that compilation structure looks like,
    > and whether you feel that it's advantageous. I have a single
    > TestMain.cpp per suite which builds the runner and uses test registry
    > magic (read: hidden globals) to discover all the tests which have
    > registered themselves privately within their own .cpp files. So, no
    > need for anybody to include FooTest.h.
    >

    I haven't tried that, maybe I should...
    >
    >>Implementation classes when one is using the PIMPL idiom?

    >
    >
    >>Also if you are using CppUnit, you may want
    >>to include the private header in the test file.

    >
    >
    > Could you explain what you mean?
    >

    How to you unit test as leaf class?
    >
    >
    > Oh, and there are some other solutions to the template instantiation
    > problem, by the way.
    >

    Use a compiler that doesn't suffer from it :)

    > I recall a while ago someone posted on here a link to an essay about
    > "Java-style classes in C++" which proposed yet another unconventional
    > compilation structure. I don't recall being particularly convinced,
    > but it was food for thought. Anyone else got any interesting (ab?)uses
    > of the C++ compilation model?
    >

    Well it's not an abuse, but if you are looking to simplify things -
    seeing as some compilers can find template definitions in source files,
    why can't C++ compilers use search rules to find header files when they
    encounter a class? If you're not sure what I'm on about, google for
    "php autoload".

    --
    Ian Collins.
    Ian Collins, May 28, 2006
    #4
  5. Luke Meyers

    Phlip Guest

    Luke Meyers wrote:

    > The particular case I have in mind is CppUnit tests. Each test header
    > is only ever included by the corresponding implementation file, never
    > by anything else.


    That might be an architectural flaw of CppUnit. Rigs like CppUnitLite, using
    a TEST_() macro, don't even need a header.

    > The implementation file registers itself with the
    > test suite, and that's all there is to it. So, I can't think of any
    > reason at all for there to be two files.


    If it's not an architectural flaw, then why is the header there? It may just
    be a flaw of the sample code. Take it out if you don't need it.

    Sometimes test suites inherit test suites, particularly to follow the
    Abstract Test Pattern. Those cases need headers - if the derived suites are
    indeed in separate files. The only reason I can think of not to put them in
    the same file is file length.

    > As soon as I thought of this, I asked myself two questions. First, am
    > I missing anything? Is there some negative consequence that hasn't
    > occurred to me? Second, assuming this isn't a phenomenon unique to
    > unit test cases, that there is some general category of classes for
    > which the same reasoning applies, what are the properties which
    > determine membership in that category?


    Huh? If you don't share a class between modules, don't put it in a header.
    (Someone aware of CppUnit ought to recognize the Refactoring
    implications...)

    > I'm interested in hearing others' thoughts on this.


    Try this. Two teams start with two hot-head C++ gurus for team leads. One
    decrees A, the other B:

    A. put all method bodies inside their classes
    unless profiling reveals you should take
    them out

    B. put all method bodies outside their classes
    unless profiling reveals you should put
    them in

    One is default inline and the other default out-of-line.

    Neither boss says to put all classes in .h files (or to put only one class
    in each .h file, or anything retarded like that).

    Team A's code resembles Java, yet as profiling of runtime and of compile
    time reveals bottlenecks, some methods migrate out of their classes, and
    into .cpp files.

    The reason most teams pick option B is because all their legacy code goes
    like that. I suspect that either system is sustainable. System B has much
    less paperwork. And programs that abuse templates tend to attract system A.
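
    To make the two defaults concrete, here is a small sketch. CounterA and
    CounterB are hypothetical classes invented for this example; they behave
    identically and differ only in where the method bodies live.

```cpp
// System A: bodies inside the class (implicitly inline, Java-like).
class CounterA {
public:
    void increment() { ++count_; }
    int count() const { return count_; }

private:
    int count_ = 0;
};

// System B: the class only declares its methods; the bodies live out of
// line, normally in a separate .cpp file.
class CounterB {
public:
    void increment();
    int count() const;

private:
    int count_ = 0;
};

void CounterB::increment() { ++count_; }
int CounterB::count() const { return count_; }
```

    Under system A a change to a body touches the class definition itself;
    under system B it touches only the out-of-line definitions.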

    --
    Phlip
    http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
    Phlip, May 28, 2006
    #5
  6. Luke Meyers

    Phlip Guest

    Ian Collins wrote:

    > That depends on where you build your test suites or runners. My CppUnit
    > headers get included in at least two source files.


    Why? If it's for registering tests, you can write a TEST_() macro in raw
    CppUnit, and never register again...

    Luke Meyers wrote:

    > ...I have a single
    > TestMain.cpp per suite which builds the runner and uses test registry
    > magic (read: hidden globals)


    If it's hidden (a good thing), then it ain't a global (a bad thing!).

    > Implementation classes when one is using the PIMPL idiom?


    Why would one Pimpl test suites that are already discrete and discreet?

    --
    Phlip
    http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
    Phlip, May 28, 2006
    #6
  7. Luke Meyers

    Luke Meyers Guest

    Ian Collins wrote:
    > How to you unit test as leaf class?


    Assuming this is a typo for "how do you unit test a leaf class," I
    think you've hit upon the major flaw/limitation in this approach.
    Every class which one is interested in unit-testing (which should be
    basically every class) has at least one class which is interested in
    including the header for that class so as to use it. Unless one does
    something funky like put the test code and production code in the same
    compilation unit. Either by putting them both in the same literal
    file, or by #including one implementation file from the other (possibly
    with an #if TESTING guard around the #include, going in one direction).

    Any other ideas to get around this limitation?
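
    A rough sketch of the "same compilation unit" workaround, assuming a
    TESTING macro supplied by the test build (for example via -DTESTING).
    Foo, doubledViaFoo and runFooTests are hypothetical names used only for
    illustration.

```cpp
// Foo.cpp: production code and its tests share one translation unit.

namespace {  // Foo is invisible outside this file

class Foo {
public:
    int twice(int x) const { return 2 * x; }
};

}  // namespace

// Only this function is visible to the rest of the program.
int doubledViaFoo(int x) { return Foo().twice(x); }

#ifdef TESTING
// Compiled only in test builds. The same effect could be had by keeping
// the tests in their own file and writing #include "FooTest.cpp" here.
bool runFooTests() {
    Foo foo;
    return foo.twice(21) == 42 && foo.twice(0) == 0;
}
#endif
```

    The test code can name Foo directly because it is compiled as part of
    the same translation unit, so no Foo.h is needed.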

    > > Oh, and there are some other solutions to the template instantiation
    > > problem, by the way.
    > >

    > Use a compiler that doesn't suffer from it :)


    That's one. Individual programmers on a team/within a company are
    frequently not entirely at liberty to simply choose whatever compiler
    they like for production code, though. But the FAQ mentions some other
    options, which I'm sure you've read, and which I've used (with minor
    variants of my own devising) to good effect. The practice of
    #including the implementation files for templates has some interesting
    consequences, like enabling easy control of the size and number of
    one's compilation units.

    > Well it's not an abuse, but if you are looking to simplify things -
    > seeing as some compilers can find template definitions in source files,


    You are referring to the "export" keyword, right?

    > why can't C++ compilers use search rules to find header files when they
    > encounter a class? If you're not sure what I'm on about, google for
    > "php autoload".


    Well, I don't think this has very much to do with template exporting,
    but it could be readily accomplished (modulo a reasonable approach to
    disambiguation) with an added preprocessor step. Use the scripting
    language of your choice. I had a peek at php autoload -- doesn't look
    like a model that would be doable with the C++ preprocessor as-is.

    Luke
    Luke Meyers, May 28, 2006
    #7
  8. Luke Meyers

    Phlip Guest

    Luke Meyers wrote:

    > Every class which one is interested in unit-testing (which should be
    > basically every class) has at least one class which is interested in
    > including the header for that class so as to use it. Unless one does
    > something funky like put the test code and production code in the same
    > compilation unit. Either by putting them both in the same literal
    > file, or by #including one implementation file from the other (possibly
    > with an #if TESTING guard around the #include, going in one direction).
    >
    > Any other ideas to get around this limitation?


    Under pure Test Driven Development, you write the client interface you need,
    and write unstructured behavior behind that interface, to pass the test.
    Then you refactor, frequently testing, until the behavior is structured. So
    the behavior could migrate into private classes at file scope, or less, and
    it's all still perfectly tested.

    Refactoring shouldn't change behavior. You should only do tiny refactors
    that you know won't change behavior (including incidental behavior, such as
    which order to call a sequence of functions that don't influence each
    other). And your test cases will preserve the behavior that you drew up to
    the interface.

    So the result should be well-tested and well-encapsulated behavior.

    --
    Phlip
    http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
    Phlip, May 28, 2006
    #8
  9. Luke Meyers

    Luke Meyers Guest

    Phlip wrote:
    > Luke Meyers wrote:
    > > ...I have a single
    > > TestMain.cpp per suite which builds the runner and uses test registry
    > > magic (read: hidden globals)

    >
    > If it's hidden (a good thing), then it ain't a global (a bad thing!).


    I appreciate when people interpret my words a little less rigidly than
    this. The global-ness is hidden behind macros and statics and such,
    depending on which interface one uses. That's just the way CppUnit
    test registries work. All I do is include in each test source file a
    line like:

    CppUnit::AutoRegisterSuite<TestFoo> suite("moduleName");

    Usually I put it in an anonymous namespace, too, because I am that kind
    of bear.

    Luke
    Luke Meyers, May 28, 2006
    #9
  10. Luke Meyers

    Luke Meyers Guest

    Phlip wrote:
    > Luke Meyers wrote:
    > > The particular case I have in mind is CppUnit tests. Each test header
    > > is only ever included by the corresponding implementation file, never
    > > by anything else.

    >
    > That might be an architectural flaw of CppUnit. Rigs like CppUnitLite, using
    > a TEST_() macro, don't even need a header.


    CppUnit doesn't need a header either -- that's my point. My
    realization was that I was using the header separation model simply out
    of habit, rather than expedience.

    > Huh? If you don't share a class between modules, don't put it in a header.


    What about classes within the same module which use each other? How do
    you provide them access to each other's definitions without headers,
    unless putting them all in the same source file? Is this a confusion
    over the word "module?"

    > (Someone aware of CppUnit ought to recognize the Refactoring
    > implications...)


    I'm aware that, depending on how the notion of "module" works in my
    particular build structure, I may have the option to narrow my public
    interface by only making a carefully-selected subset of my header files
    visible to other modules.

    > Try this. Two teams start with two hot-head C++ gurus for team leads. One
    > decrees A, the other B:
    >
    > A. put all method bodies inside their classes
    > unless profiling reveals you should take
    > them out
    >
    > B. put all method bodies outside their classes
    > unless profiling reveals you should put
    > them in
    >
    > One is default inline and the other default out-of-line.
    >
    > Neither boss says to put all classes in .h files (or to put only one class
    > in each .h file, or anything retarded like that).
    >
    > Team A's code resembles Java, yet as profiling of runtime and of compile
    > time reveals bottlenecks, some methods migrate out of their classes, and
    > into .cpp files.
    >
    > The reason most teams pick option B is because all their legacy code goes
    > like that. I suspect that either system is sustainable. System B has much
    > less paperwork. And programs that abuse templates tend to attract system A.


    It's preposterous to levy runtime performance as the only
    consideration. I see no reason to suspect a strong correlation one
    way or the other between either strategy and runtime performance. The
    chief reason to separate class definitions from implementations, as I
    see it, is to create a recompilation firewall. Also, while translation
    unit size isn't as big of a concern as it once was, putting
    implementation in headers does mean bloating each unit with all those
    function definitions. No reason to make the compiler chew on all that
    again and again.

    Luke
    Luke Meyers, May 28, 2006
    #10
  11. Luke Meyers

    Phlip Guest

    Luke Meyers wrote:

    >> Huh? If you don't share a class between modules, don't put it in a
    >> header.

    >
    > What about classes within the same module which use each other? How do
    > you provide them access to each other's definitions without headers,
    > unless putting them all in the same source file? Is this a confusion
    > over the word "module?"


    Yes. Don't interpret it so rigidly. The C++ Standard has no definition of
    module. (And in the other post I thought _you_ were disparaging globals. No
    biggie.)

    In this case, it means a translation unit. And sometimes modules are
    clusters of translation units.

    > It's preposterous to levy runtime performance as the only
    > consideration.


    I also mentioned recompile times.

    However, I was quoting team leads like James Kanze, who tell their minions
    to make _everything_ out-of-line unless profiling reveals it should go
    inline. Naturally that errs on the side of compile time, but that's not why
    he does it.

    As a thought experiment, one could grow a project using the opposite rule -
    put everything inside a class unless profiling (rebuild times or run times)
    reveals it should go out-of-line.

    http://c2.com/cgi/wiki?CppHeresy hence
    http://c2.com/cgi/wiki?InlineAllMethodsWhereverPossible

    (I had no idea the second page was there. Feel free to ignore the first one;
    it's obviously just PeterMerel...)

    So James Kanze orders his minions to out-of-line everything as a baseline
    for profiling.

    > I see no reason to suspect a strong correlation one
    > way or the other between either strategy and runtime performance. The
    > chief reason to separate class definitions from implementations, as I
    > see it, is to create a recompilation firewall.


    C++ allows logical encapsulation to parallel physical encapsulation. So if
    the most-frequently-depended-on things are mostly abstract base classes,
    then no matter how you abuse the concrete classes, recompiles don't cascade
    when you change behavior.
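
    A minimal sketch of that arrangement, assuming a hypothetical Shape
    interface and makeSquare factory. Clients depend only on the abstract
    base class, so edits to the concrete class stay inside its own .cpp file.

```cpp
#include <memory>

// ---- Shape.h: what everyone depends on; it rarely changes ----
class Shape {
public:
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

// The factory hides the concrete type from callers.
std::unique_ptr<Shape> makeSquare(double side);

// ---- Square.cpp: free to change without recompiling clients ----
namespace {

class Square : public Shape {
public:
    explicit Square(double side) : side_(side) {}
    double area() const override { return side_ * side_; }

private:
    double side_;
};

}  // namespace

std::unique_ptr<Shape> makeSquare(double side) {
    return std::unique_ptr<Shape>(new Square(side));
}
```

    Note that Square itself needs no header at all here; only the interface
    and the factory declaration are shared.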

    Tragic recompile situations frequently occur in systems that ramble on and
    on. Suppose a team of 20 started coding in pure C++, maybe with no tests,
    and added lines for a few years. Now the system is huge, a lead programmer
    inserted Pimpls as an emergency defense, and the link time is outrageous.
    Just as bad as an InlineAllMethodsWhereverPossible project.

    > Also, while translation
    > unit size isn't as big of a concern as it once was, putting
    > implementation in headers does mean bloating each unit with all those
    > function definitions. No reason to make the compiler chew on all that
    > again and again.


    In this case, the system should have followed "encapsulation is
    hierarchical". That means the longer the logical distance between two
    elements, the narrower the physical channel between them. The CppHeresy page
    recommended breaking things up into modules by using Python as the glue
    between them. So now link times are healthy because your soft layers
    (Python, an ORB, whatever) defer the high-level linking to run-time.

    No steenking headers.

    --
    Phlip
    http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!
    Phlip, May 28, 2006
    #11
