Nick Keighley said:
then there almost certainly is some fat (unless you're a really clever
coder!)
yeah.
well, I think there is as well, so no issue there...
I don't really understand the question. Is your code base causing you
a problem? You are presumably adding code for a reason (unless you
just like coding) so, er, what was the question again?
well, usually my reason for adding code is not to deal with problems...
more often, it is adding code to add "features"...
sometimes though, the features turn out to be almost entirely useless.
I once added a partial "precise" mode to my (otherwise conservative) GC, but
ended up not using it (because precise GC makes using it a lot more
difficult for not much gain). it was later dropped from the public
API. internally, some of the code lingers on, and a cleanup of this
component would likely eliminate it.
the GC also does ref-counting, which is another one of those "rarely used"
features, mostly because ref-counting is one of those "all or nothing"
features, meaning that any code which uses the feature has to be entirely
ref-counting safe.
then, elsewhere, there is another precise GC, which was written because I
realized that it was kind of pointless to have a precise GC on the same heap
as a conservative one if they end up essentially "splitting the world in
half" anyways.
in this case, I had intended this other GC mostly for a particular use, but
ended up using a different MM strategy for that code instead: allocating
lots of memory in a temporary heap, and then destroying this heap when done.
this was done as an attempt to improve both stability and performance.
though not very generic, it works fairly well for code which produces 10s of
MB of stuff in a fraction of a second but never needs to refer to it again
after the task completes. it works much better than using a generic GC for
this task, since this usage pattern is essentially "abusive" to the GC
(tending to cause misbehavior and poor performance).
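a minimal sketch of the temporary-heap idea, for illustration only. the names (TmpHeap_New, TmpHeap_Alloc, TmpHeap_Destroy) and the 16-byte rounding are assumptions made here, not the actual code: everything is bump-allocated out of one block, and the whole block is freed in one go when the task completes.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical temporary heap; names are illustrative only. */
typedef struct TmpHeap {
    unsigned char *base;   /* start of the backing block */
    size_t size;           /* total bytes available */
    size_t used;           /* bytes handed out so far */
} TmpHeap;

/* Create a heap backed by one malloc'd block; returns NULL on failure. */
TmpHeap *TmpHeap_New(size_t size)
{
    TmpHeap *h = malloc(sizeof *h);
    if (!h) return NULL;
    h->base = malloc(size);
    if (!h->base) { free(h); return NULL; }
    h->size = size;
    h->used = 0;
    return h;
}

/* Bump-allocate n bytes, rounded up to a multiple of 16; NULL when full. */
void *TmpHeap_Alloc(TmpHeap *h, size_t n)
{
    size_t need = (n + 15) & ~(size_t)15;
    void *p;
    if (need < n || need > h->size - h->used) return NULL;
    p = h->base + h->used;
    h->used += need;
    return p;
}

/* Release everything at once; there is no per-object free. */
void TmpHeap_Destroy(TmpHeap *h)
{
    if (!h) return;
    free(h->base);
    free(h);
}
```

since nothing is ever freed individually, allocation is just a pointer bump, and teardown cost does not depend on how many objects were created.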
and, it is all this fuss over a single set of features.
removing real dead code seems a good idea.
If-it-might-come-in-useful-one-day (vanishingly unlikely really) then
you can get it out of your repository.
well, this has happened sometimes, but it often takes a while for features
to "become" useful.
but, yeah, a good start is removing dead components and subsystems, a few of
which I know to exist.
(particularly related to my codegen).
although, I had made a new experimental codegen with a nifty register
allocator which was never really migrated back into the main codegen (mostly
as the new codegen worked a bit differently, and my old codegen is a big
tangled mess that has been hacked on a lot).
it is just that the old codegen has proven a bit difficult to replace.
well that's your call. Little used might be very useful when you need
it! The US's SDI system was only supposed to be used once (or less)...
Error handling often doesn't get used very often.
ok.
I don't see the code increasing in size as inherently wrong. I'm more
interested in *why* it is growing.
partly as a result of adding features;
partly as a result of "generalizing" things;
....
all sounds good (though the "cellular" term seems a little odd)
well, because originally, one has a lot of code which is in a single
component.
so, all the code shares the same directory, makefile, and naming prefixes,
....
but, then it gets too large, and is split.
then, often, the library name prefix gets changed, as well as the sub-parts
being moved into new directories, ...
say:
FOO_SubSysA_...
FOO_SubSysB_...
FOO_SubSysC_...
FOO_SubSysD_...
FOO_SubSysE_...
FOO_SubSysF_...
splits into:
FOO_SubSysA_...
FOO_SubSysC_...
FOO_SubSysE_...
FOO_SubSysF_...
BAR_SubSysB_...
BAR_SubSysD_...
then, maybe some internal patch-up is done to compensate for the change in
naming, ...
(this is often done via 'sed' or search/replace).
so, in a way, it is sort of like mitosis or similar...
really? I thought it would help
component splitting very often causes there to be sub-components which are
larger than the original singular component.
this is usually the result of "abstracting" one component from another,
which often adds new code in the form of abstract API wrappers, ...
oh? I'd have thought good abstractions would keep the code size down.
on the larger scale, probably.
but at the small scale, it adds a bunch of function calls which often do
little more than redirect to other functions.
in a few components, this can end up being a significant part of the overall
code size (in particular, in one of the larger components, which consists
almost entirely of exported APIs and relatively little internal logic code).
most other components contain more of a balance, with most of the size being
due to logic-code, and a smaller amount of wrapper code usually serving as
an interface to the outside world.
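to illustrate the kind of wrapper meant here, a small sketch. both function names are hypothetical; the point is only that the exported function adds no logic of its own, yet still costs a definition (and usually a header declaration) per API entry.

```c
/* Hypothetical internal function, private to the component. */
static int bar_internal_lookup(const int *table, int count, int key)
{
    int i;
    for (i = 0; i < count; i++)
        if (table[i] == key) return i;
    return -1;
}

/* Exported wrapper: only provides a stable public name and redirects. */
int BAR_Lookup(const int *table, int count, int key)
{
    return bar_internal_lookup(table, count, key);
}
```

multiply this by a few hundred exported functions and the wrappers alone become a noticeable share of a component's size.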
I misunderstood you here. You were talking header kloc and c file
kloc. I'd expect there to be a big difference but I've never done any
measurements (never really cared that much to be honest!)
yep.
well, I measured a lot of stuff, some of which I didn't bother to mention.
or, at least by my standards where something like:
"if(*(float *)((char *)(&array_of_structs[index])+offset) == foo) ..."
is, IMO, nasty...
fairly nasty. But does it really matter? If you need to do stuff like
that isolate it in a function/module/macro and document it. The rest
of the code doesn't care how nasty it is.
yep, I usually avoid doing stuff like this personally, or if it is done, it
is wrapped up somewhat...
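one possible way of wrapping it up, as a sketch. Foo_Entity and Foo_GetFloatField are made-up names standing in for whatever the real struct and helper would be; memcpy is used in place of the pointer cast so that the access does not depend on alignment or aliasing rules.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct standing in for the elements of array_of_structs. */
typedef struct Foo_Entity {
    int id;
    float health;
    float speed;
} Foo_Entity;

/* Read a float field located 'offset' bytes into element 'index'.
   memcpy copies the bytes out instead of dereferencing a cast pointer. */
static float Foo_GetFloatField(const Foo_Entity *arr, size_t index, size_t offset)
{
    float v;
    memcpy(&v, (const char *)&arr[index] + offset, sizeof v);
    return v;
}
```

the call site then becomes something like "if(Foo_GetFloatField(ents, i, offsetof(Foo_Entity, speed)) == foo) ...", and the nasty part lives in exactly one place.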
Q2 does things like this often as a matter of common practice.
as well as the good old trick of reading raw data from a file, casting it to
a struct pointer, and just using this structure directly (although often
with either endianness-swap functions, or a pre-pass to go over the read-in
file and pre-swap all the values if needed).
I often use more explicit read/write value operations, such as reading a
datum at a time.
Foo_Vertex *Foo_ReadVertex(VFILE *fd)
{
    Foo_Vertex *tmp;
    tmp=Foo_AllocVertex();
    tmp->x=Foo_ReadFloat(fd);
    tmp->y=Foo_ReadFloat(fd);
    tmp->z=Foo_ReadFloat(fd);
    return(tmp);
}

Foo_Triangle *Foo_ReadTriangle(VFILE *fd)
{
    Foo_Triangle *tmp;
    tmp=Foo_AllocTriangle();
    tmp->v0=Foo_ReadVertex(fd);
    tmp->v1=Foo_ReadVertex(fd);
    tmp->v2=Foo_ReadVertex(fd);
    return(tmp);
}
....
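for completeness, a sketch of what the lowest-level reader could look like. this is an assumption-laden stand-in, not the real Foo_ReadFloat: it uses stdio's FILE rather than VFILE, assumes the file stores 32-bit IEEE 754 floats in little-endian order, and the names Foo_DecodeFloatLE / Foo_ReadFloatLE are invented here. because the value is assembled a byte at a time, no endianness swap is needed on any host.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Decode 4 little-endian bytes into a float (assumes 32-bit IEEE 754 floats
   and that the host stores floats and integers in the same byte order). */
float Foo_DecodeFloatLE(const unsigned char b[4])
{
    uint32_t bits;
    float v;
    bits = (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
           ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* Read one float from a stdio stream; returns 1 on success, 0 on a short read. */
int Foo_ReadFloatLE(FILE *fd, float *out)
{
    unsigned char b[4];
    if (fread(b, 1, sizeof b, fd) != sizeof b)
        return 0;
    *out = Foo_DecodeFloatLE(b);
    return 1;
}
```

compared with casting the raw file data to a struct pointer, this costs a function call per value, but the on-disk layout no longer has to match the in-memory struct layout.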