object file size is reduced after build


jayapal

Hi all,


We have a large codebase in which we are fixing bugs. For each bug we
change the code, either adding or deleting parts of it. Mostly we add
code; we rarely delete any.

The question is: after the build, the object file for the changed code
is smaller than the object file for the unchanged code. Since we
increased the lines of code, the object file size should increase, but
it is decreasing. Why?

Could this be an optimization issue? We are building with optimization
level 2.


Thanks in advance,
jayapal
 

Gordon Burditt

We have a large codebase in which we are fixing bugs. For each bug we
change the code, either adding or deleting parts of it. Mostly we add
code; we rarely delete any.

The question is: after the build, the object file for the changed code
is smaller than the object file for the unchanged code. Since we
increased the lines of code, the object file size should increase, but
it is decreasing. Why?

The "line of code" is an incredibly silly measure of code complexity.
Except for preprocessor directives, many compilers will accept
putting all of the code on one line. (Although a compiler is not
required to accept very long lines, many do anyway.)

Do not calculate the number of lines of code, especially not where
management can see it.
 

Kaz Kylheku

Hi all,

We have a large codebase in which we are fixing bugs. For each bug we
change the code, either adding or deleting parts of it. Mostly we add
code; we rarely delete any.

The question is: after the build, the object file for the changed code
is smaller than the object file for the unchanged code. Since we
increased the lines of code, the object file size should increase, but
it is decreasing. Why?


I can add two lines of code that can drastically cut down the size of
an object file, provided that the compiler eliminates unreachable
basic blocks properly:

if (0) {

and, lower down:

}

:)


Maybe the code you are writing contains a lot of redundancy that the
compiler is able to recognize and eliminate. E.g. suppose you have two
functions that translate to exactly the same machine code. The
compiler can detect that and merge them into one function. (At the
object file level, one function can still be given two names).

This kind of squeezing could be done at the basic block level as well,
not just on whole functions. Suppose you have a construct like:

if (condition()) {
    S1;
} else {
    S2;
}

S1 and S2 are different statements. Suppose you add something to S1
which makes it equivalent to S2. It means that the same logic is then
executed regardless of which way the condition goes:

if (condition()) {
    S2;
} else {
    S2;
}

The compiler could recognize the situation and simply reorganize the
code to:

condition(); /* necessary to call for any side effects */
S2;

And so adding lines to S1 which made it the same as S2 actually caused
the program to shrink.
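
[Editorial aside: the merging scenario Kaz describes can be sketched
concretely. The function names below are made up for illustration, and
whether a given compiler actually folds the two bodies into one depends
on the toolchain and options; the observable C-level behaviour is the
same either way.]

```c
/* Two functions whose bodies are token-for-token identical.  An
 * optimizer that performs identical-code folding may emit a single
 * machine-code body and bind both symbols to it in the object file,
 * shrinking the output; the C semantics are unchanged either way. */
int scale_price(int x)  { return x * 2 + 1; }
int scale_weight(int x) { return x * 2 + 1; }
```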
 

James Kuyper

Gordon Burditt wrote:
....
The "line of code" is an incredibly silly measure of code complexity.

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?
 

Richard Heathfield

James Kuyper said:
Gordon Burditt wrote:
...

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.

That is unreasonable. If we want even a slight improvement on LOC, we
should be prepared to pay a little something for that improvement. Not
much, but a little. See below.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

LOC is basically a count of newlines. Counting semicolons would be a
marginal improvement (for C code, anyway), and is just as quick to
calculate (but, unlike LOC, it does mean you'll have to cut some code).

I would argue that a semicolon count would approximate the complexity far
more accurately than a newline count (although of course it is still
pitifully inadequate).
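
[Editorial aside: a semicolon count is trivial to implement. A minimal,
deliberately naive sketch in C, which makes no attempt to skip
semicolons inside comments, string literals, or character constants as
a real tool would:]

```c
#include <stddef.h>

/* Count semicolons in a source string.  Naive on purpose: semicolons
 * inside comments and string literals are counted too. */
size_t count_semicolons(const char *src)
{
    size_t n = 0;
    for (; *src != '\0'; src++)
        if (*src == ';')
            n++;
    return n;
}
```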
 

CBFalconer

James said:
Gordon Burditt wrote:
...

OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);

and

i = foo(baz);
if (i)
{
    goo(flimdiddle);
}

and the second is worth 5 times as much.
 

Gordon Burditt

The "line of code" is an incredibly silly measure of code complexity.
OK. So what would you recommend to replace it, subject to the following
constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

How about 42? It's much easier to calculate, and is free of
subjective bias.
 

James Kuyper

Gordon said:
How about 42? It's much easier to calculate, and is free of
subjective bias.

No, that does not meet constraint 3. Claiming that it does would imply
that LOC is completely and absolutely unrelated to code complexity.
Exaggerations to the contrary notwithstanding, that's simply not true.
 

James Kuyper

CBFalconer said:
Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);

and

i = foo(baz);
if (i)
{
    goo(flimdiddle);
}

and the second is worth 5 times as much.

I said nothing to suggest that. LOC is a very poor measure of code
complexity, and I never suggested otherwise. I was simply saying that
it's at least as good a measure of complexity as any other objective
quantity that is equally easy to calculate. That's because every measure
that is that easy to calculate is a very poor measure.

I'll concede Richard Heathfield's point that the complexity of C code
can probably be measured somewhat more accurately by a count of
semicolons than by a count of newline characters, which is obviously
equally easy to calculate (but applicable only to C code). However, it's
only a small increment in accuracy. The only way to significantly
improve on LOC as a measure requires a much more complicated algorithm.
 

Richard Heathfield

James Kuyper said:

I'll concede Richard Heathfield's point that the complexity of C code
can probably be measured somewhat more accurately by a count of
semicolons than by a count of newline characters, which is obviously
equally easy to calculate (but applicable only to C code). However, it's
only a small increment in accuracy. The only way to significantly
improve on LOC as a measure requires a much more complicated algorithm.


Here's a further suggestion that, IMHO, significantly improves on LOC
without requiring a great deal of complexity.

(a) pre-process the source - this saves headaches with comments and
conditional compilation and stuff;
(b) now get counting:

* for each semicolon, count 1.
* for each left parenthesis, count 1.
* for each operator, count 1.
* for each instance of 'if', count 1.
* for each instance of 'for' or 'while', count 2.
* for each instance of 'case', count 1.
* for each instance of 'continue', count 1.
* for each instance of 'break', count 1.
* for each instance of 'goto', count 5.
* for each instance of 'setjmp' or 'longjmp', count 10.

(Adjust figures to taste.)

While this is a fair bit more work to implement than "count semicolons",
it's still not too bad, and could be done in a few minutes by any
reasonably competent C programmer.

BTW if you want to know complexity density rather than absolute complexity,
divide by LOC (or perhaps by file size) at the end.
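
[Editorial aside: a cut-down version of this weighted count is only a
few lines of C. The sketch below is hypothetical and handles just a
subset of the rules above (semicolons and left parentheses score 1,
'if' scores 1, 'for' and 'while' score 2, 'goto' scores 5); operators
and the remaining keywords are left as an exercise, and the input is
assumed to be already preprocessed.]

```c
#include <ctype.h>
#include <string.h>

static int is_ident(char c) { return isalnum((unsigned char)c) || c == '_'; }

/* Score a preprocessed source string using a subset of the weighted
 * scheme: ';' and '(' count 1, plus keyword weights from the table. */
int complexity_score(const char *src)
{
    static const struct { const char *kw; int w; } table[] = {
        { "if", 1 }, { "for", 2 }, { "while", 2 }, { "goto", 5 },
    };
    int score = 0;
    for (const char *p = src; *p != '\0'; ) {
        if (*p == ';' || *p == '(') { score++; p++; continue; }
        /* Only test for keywords at the start of a word. */
        if (is_ident(*p) && (p == src || !is_ident(p[-1]))) {
            for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
                size_t len = strlen(table[i].kw);
                if (strncmp(p, table[i].kw, len) == 0 && !is_ident(p[len])) {
                    score += table[i].w;
                    break;
                }
            }
            while (is_ident(*p)) p++;   /* skip the rest of the word */
            continue;
        }
        p++;
    }
    return score;
}
```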
 

Ben Bacarisse

James Kuyper said:
Gordon Burditt wrote:
...

OK. So what would you recommend to replace it, subject to the
following constraints:
1) It must be at least as easy to calculate as LOC.
2) It must be at least as free of subjective bias as LOC.
3) It must be at least as accurate a measure of code complexity as LOC.

From what you've said, constraint 3 should be easy to meet; how about
the other two?

Relaxing (1), I'd suggest simply counting constructs that make a
choice. Add 1 for every "if", "while" (including "do while"), "for"
and "switch". One would want to count something (1?) for every
function body so as to encourage lots of decomposition.

To get slightly more complex, I'd be inclined to add one for every
"else" that was not an "else if" and thus also for every switch case
except the first (since the switch already counts one). In such a
scheme I'd count a "while" (and probably a "for") as 2. You'd need to
take a view on "&&" and "||" which are "if"s in all but name.

I once wrote a program that did something like this for student
programs (I wanted to encourage simple solutions), but it had a
"compounding" metric: each construct had a score >1 and nesting
multiplied the scores. Functions (as a syntactic form) were "free", of
course, but they carried the score of their "insides".
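
[Editorial aside: a rough sketch of such a compounding metric, under
the simplifying assumption that raw brace depth stands in for nesting
level (a bare block also raises it, so this is only an approximation).
Each decision keyword has a base score of 2, doubled for every level of
braces it sits under, and the input is assumed to be preprocessed.]

```c
#include <ctype.h>
#include <string.h>

/* Score decision keywords, doubling the cost per brace-nesting level:
 * a construct at depth N contributes 2 * 2^N to the total. */
int compound_score(const char *src)
{
    static const char *kw[] = { "if", "while", "for", "switch" };
    int score = 0, depth = 0;
    for (const char *p = src; *p != '\0'; ) {
        if (*p == '{') { depth++; p++; continue; }
        if (*p == '}') { depth--; p++; continue; }
        if ((isalpha((unsigned char)*p) || *p == '_') &&
            (p == src || !(isalnum((unsigned char)p[-1]) || p[-1] == '_'))) {
            for (size_t i = 0; i < sizeof kw / sizeof kw[0]; i++) {
                size_t len = strlen(kw[i]);
                if (strncmp(p, kw[i], len) == 0 &&
                    !(isalnum((unsigned char)p[len]) || p[len] == '_')) {
                    score += 2 << depth;   /* base 2, doubled per level */
                    break;
                }
            }
            while (isalnum((unsigned char)*p) || *p == '_') p++;
            continue;
        }
        p++;
    }
    return score;
}
```

A nested `if` inside another `if`'s braces thus costs twice as much as
the outer one, which is the "compounding" effect described above.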
 

James Kuyper

Ben said:
Relaxing (1), I'd suggest simply counting constructs that make a
choice. Add 1 for every "if", "while" (including "do while"), "for"
and "switch". One would want to count something (1?) for every
function body so as to encourage lots of decomposition.

To get slightly more complex, I'd be inclined to add one for every
"else" that was not an "else if" and thus also for every switch case
except the first (since the switch already counts one). In such a
scheme I'd count a "while" (and probably a "for") as 2. You'd need to
take a view on "&&" and "||" which are "if"s in all but name.

I once wrote a program that did something like this for student
programs (I wanted to encourage simple solutions), but it had a
"compounding" metric: each construct had a score >1 and nesting
multiplied the scores. Functions (as a syntactic form) were "free", of
course, but they carried the score of their "insides".

There's actually a formal metric which is based upon an algorithm
similar to the one you describe. I can't remember the name, though the
word "Cocomo" comes up when I'm thinking about it.

I was required by my company to take a course in software estimation
which explained such things. I found it less than useful. The techniques
they taught required data collection that took a lot more time than I
could easily afford. They required collecting of enough data to
calibrate the coefficients in the (assumed-to-be linear) relationship
between code complexity and development time. They had the built-in
assumption that a single programmer with a known constant productivity
would be assigned to a given task without interruption until that task
was complete.

They required me to know so much about a program to estimate its
complexity that the program had to be almost completely written before
I could do the calculation needed to estimate how long it would take to
write; at that point, multiplying the currently expended time by 1.5
would have been a far more accurate estimate, and much easier to calculate.

Those assumptions disagreed with so many different aspects of the
reality of my group that I've never found those techniques useful. I
suspect that those techniques were intended to be used at an
organizational level significantly higher than mine (I currently manage
1.5 other people; before layoffs mandated by NASA budget cuts I managed
a maximum of 2.5 people).
 

santosh

Ben said:
Relaxing (1), I'd suggest simply counting constructs that make a
choice. Add 1 for every "if", "while" (including "do while"), "for"
and "switch". One would want to count something (1?) for every
function body so as to encourage lots of decomposition.

To get slightly more complex, I'd be inclined to add one for every
"else" that was not an "else if" and thus also for every switch case
except the first (since the switch already counts one). In such a
scheme I'd count a "while" (and probably a "for") as 2. You'd need to
take a view on "&&" and "||" which are "if"s in all but name.

I once wrote a program that did something like this for student
programs (I wanted to encourage simple solutions), but it had a
"compounding" metric: each construct had a score >1 and nesting
multiplied the scores. Functions (as a syntactic form) were "free", of
course, but they carried the score of their "insides".

Interesting! Do goto, break, etc. count as a negative score?
 

pete

Richard said:
James Kuyper said:



Here's a further suggestion that, IMHO, significantly improves on LOC
without requiring a great deal of complexity.

(a) pre-process the source - this saves headaches with comments and
conditional compilation and stuff;
(b) now get counting:

* for each semicolon, count 1.
* for each left parenthesis, count 1.
* for each operator, count 1.
* for each instance of 'if', count 1.
* for each instance of 'for' or 'while', count 2.
* for each instance of 'case', count 1.
* for each instance of 'continue', count 1.
* for each instance of 'break', count 1.
* for each instance of 'goto', count 5.
* for each instance of 'setjmp' or 'longjmp', count 10.

(Adjust figures to taste.)

While this is a fair bit more work
to implement than "count semicolons",
it's still not too bad, and could be done in a few minutes by any
reasonably competent C programmer.

BTW if you want to know complexity density
rather than absolute complexity,
divide by LOC (or perhaps by file size) at the end.

I like the semicolon count.

This construct:

do {
} while(--count != 0);

translates into only two opcodes in Microchip PIC assembly,
and that has a very reduced instruction set.

(1)(decrement, skip next instruction if result is zero)
(2)(jump to loop start)
 

Richard Heathfield

pete said:

This construct:

do {
} while(--count != 0);

translates into only two opcodes in Microchip PIC assembly,
and that has a very reduced instruction set.

(1)(decrement, skip next instruction if result is zero)
(2)(jump to loop start)

....which is a bit silly, since it should reduce to one: MOV count, 0

In any case, target language complexity is generally not the issue. What
we're interested in is how complex the source code is, and:

do {
} while(--count != 0);

is considerably more complex than:

count = 0;

Using the rough n' ready guide I posted earlier, your code would score one
for the semicolon, one for the left paren, one for --, one for !=, and two
for 'while', making a score of 6 for that fragment, compared to 2 for
count = 0; - and that seems to me to be a reasonable reflection of the
added source level complexity of the (pointless) do-loop.
 

pete

Richard said:
pete said:



...which is a bit silly, since it should reduce to one: MOV count, 0

For possibly large initial values of count:

do {
    /*
    ** This comment is meant to
    ** represent an arbitrary amount of useful code
    */
} while(--count != 0);

the non-commented portion of what could actually be useful code
translates into only two opcodes in Microchip PIC assembly.
 

Mark McIntyre

CBFalconer said:
Ah, obviously the two following snippets have widely differing
complexity:

if (i = foo(baz)) goo(flimdiddle);

and

i = foo(baz);
if (i)
{
goo(flimdiddle);
}

and the second is worth 5 times as much.

Not a meaningful example. LOC is a statistical measure: like all such
measures, you can't apply it to small population sizes. I _think_ we
can agree a 5 MLOC programme is probably less complex than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your fingers
will start to complain doing that on even a thousand-line programme, let
alone a million-liner...
 

jameskuyper

Mark McIntyre wrote:
.....
Not a meaningful example. LOC is a statistical measure: like all such
measures, you can't apply it to small population sizes. I _think_ we
can agree a 5 MLOC programme is probably less complex than a 50 MLOC one?

And sure, you can artificially inflate linecounts - but your fingers
will start to complain doing that on even a thousand-line programme, let
alone a million-liner...

It would be relatively straightforward to modify one of those pretty-C
programs to perform the inflation for you. If you're being paid by the
line, it could be an investment well worth the effort. :)
 

Mark McIntyre

jameskuyper wrote:
....

It would be relatively straightforward to modify one of those pretty-C
programs to perform the inflation for you. If you're being paid by the
line, it could be an investment well worth the effort. :)

Wouldn't work - everyone in the firm would use it, and the PHBs would
just rebase the payments scale... :-(
 

CBFalconer

Richard said:
.... snip ...

Here's a further suggestion that, IMHO, significantly improves on
LOC without requiring a great deal of complexity.

(a) pre-process the source - this saves headaches with comments
and conditional compilation and stuff;
(b) now get counting:

* for each semicolon, count 1.
* for each left parenthesis, count 1.
* for each operator, count 1.
* for each instance of 'if', count 1.
* for each instance of 'for' or 'while', count 2.
* for each instance of 'case', count 1.
* for each instance of 'continue', count 1.
* for each instance of 'break', count 1.
* for each instance of 'goto', count 5.
* for each instance of 'setjmp' or 'longjmp', count 10.

(Adjust figures to taste.)

A simpler method is to use a suitable reference compiler (say gcc
3.2.1) and compile for pure code (i.e. no debuggery, no symbol
names, etc.) on a given platform (say the X86). Also eliminate any
optimization. Now count bytes of generated code, ignoring
relocation tables.
 
