Does the Length of Variable Names Affect the Compiled Executable?

John

Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would probably
take quite a lot of long variable names to make any impact if so, but I was
just curious if anyone knew...
 
Thad Smith

John said:
Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

The Standard doesn't say.

Normally the compiled code will be the same. It is conceivable for a C
interpreter to take longer.
 
Walter Roberson

The Standard doesn't say.
Normally the compiled code will be the same. It is conceivable for a C
interpreter to take longer.

-potentially- there might be debug information in the executable
that records the original variable names; a larger variable name might
result in a larger debug section; depending on the OS details,
that -might- result in a slightly larger initiation time (e.g. because
a larger file is being parsed apart to link and load.) But likely
any such effect would be trivial compared to the other factors affecting
executable file size, such as the amount of code, or the optimization
level.
 
iliwoy5

John said:
Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would probably
take quite a lot of long variable names to make any impact if so, but I was
just curious if anyone knew...

If we compare variable names in C with those in assembly language, we see
the following:
In C:
int number = 3;
int n;
n=number;

In asm (ebp points to the top of the stack before entering the
procedure):
mov dword ptr [ebp-4],3
mov eax,dword ptr [ebp-4]
mov dword ptr [ebp-8],eax

The C compiler keeps a table to store the variable names, but once the
code has been compiled to assembly language, the table is destroyed.

The same question can be asked about assembly language, where we can write:
In asm:
.data
number dw 3
n dw ?
...
.code
...
mov ax,number
mov n,ax
...
Is there any difference between these two variable names?
After it is assembled to opcodes, it may be executed like this:
...
A10000 ( mov ax,[0000])
A30200 ( mov [0002],ax)
...

So the table held by the assembler is likewise destroyed once the code
has been assembled to opcodes.
 
jmcgill

John said:
I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer?

No. On some compilers, if you compile with a "debugging symbols" switch
enabled, a symbol table of some sort will be generated which will make
the code larger, but it would be a rather broken implementation that led
to a performance cost as a result.
 
Gordon Burditt

Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

By posting your article, you've probably used up much more CPU time than
your program will ever use during its lifetime. So stop worrying about
its performance.

The C standard does not require a program to have any kind of
performance: good, bad, sucky, awful, better than, or worse than.

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would probably
take quite a lot of long variable names to make any impact if so, but I was
just curious if anyone knew...

A program compiled with debugging symbols will probably take up
more *DISK* storage depending on symbol length. These symbols may
or may not ever get into memory when the program is run.

It is possible, especially if you have a habit of using terabyte-long
symbol names, that name length could affect the runtime of the
dynamic linker that might be used on program startup.
 
Nils O. Selåsdal

Walter said:
-potentially- there might be debug information in the executable
that records the original variable names; a larger variable name might
result in a larger debug section; depending on the OS details,
that -might- result in a slightly larger initiation time (e.g. because
a larger file is being parsed apart to link and load.) But likely
any such effect would be trivial compared to the other factors affecting
executable file size, such as the amount of code, or the optimization
level.

Typically, though, that debug information is not loaded by anything other
than special tools, such as a debugger.

Then again, this is implementation specific.
 
Ark

John said:
Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would probably
take quite a lot of long variable names to make any impact if so, but I was
just curious if anyone knew...

An often overlooked case is when by using the stringize operator you
introduce names (or similar) to the code. E.g.
#define FYI(x) printf("FYI: %s = %d\n", #x, x)
................
FYI(n);
FYI(number_of_characters_grows);

A typical case of something similar is the assert macro (which people
sometimes leave active in the release code by #undef'ing NDEBUG by hand).

In these cases the length of a name matters - but all things considered,
not terribly much.
 
Richard Heathfield

John said:
Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would
probably take quite a lot of long variable names to make any impact if so,
but I was just curious if anyone knew...

I have read the parallel replies with great interest and amusement. It's
amazing how much juice you can squeeze from C.

Nevertheless, *in general* the answer to your question is "no". Yes, various
people have drawn attention to various exceptions to that answer, which you
may wish to take into account. But in the general case, your implementation
will (or can be told to) discard all such information.

Thus, using b instead of CurrentBalance is an unnecessary "optimisation"
which will, in the general case, have no impact on performance. Write
clear, readable code using good algorithms and descriptive identifiers.
Worry about performance only in the extremely rare cases where good
algorithms don't deliver sufficient performance to meet your users' needs.
 
Walter Roberson

Nils O. Selåsdal wrote:
Typically though, that debug information is not loaded by other
than special tools - such as a debugger.
Then again, this is implementation specific.

Sure, typically not loaded, but unless the implementation's standards
for the format of executables are such that the debug information must
be the -last- thing in the executable, then the code that loads
the executable into working memory must read the debug information
to get past it to the other important information.

The loader might fseek() around the debug information and thus not
actually read it in, but fseek() can interfere with automatic OS
pre-read of files so fseek() is not -always- faster than just reading
the characters and throwing them away. Interestingly, using lots
of large variable names can, by increasing the debug section size,
drive the debug section size closer to the point where fseek() is
worthwhile ;-)
 
John Bode

John said:
Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

The two rules of optimization:

Rule 1: Don't do it.
Rule 2 (for experts only): Don't do it *yet*.

You could write two versions of the same code and compare their
performance. It would tell you whether it makes a difference for that
particular program on that particular platform. I strongly suspect you
won't find a measurable difference.

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would probably
take quite a lot of long variable names to make any impact if so, but I was
just curious if anyone knew...

Any conceivable impact would be down in the noise; I doubt you would
gain enough in performance to justify the loss in readability.

Runtime performance is usually the fourth-most important criterion in
software, preceded by

1. Correctness -- the code must satisfy all requirements;
2. Robustness -- the code must not yak the instant it sees wonky data;
3. Maintainability -- requirements change, new features get added,
bugs must be fixed.

Using meaningful variable names, whether they're 1 or 6 or 32 or 64
characters long, goes a long way towards satisfying #3. There's a
warning about external symbol names being limited to 6 characters, but
I don't remember the exact wording.

There may be some domains where it can be argued that performance comes
before maintainability, but they're rare.

Micro-optimization is the root of all evil; it often results in
brittle, unmaintainable code. Focus on using efficient *algorithms*.
That'll get you 99.9% of the way there when it comes to performance.
 
Joe Wright

John said:
Does the length of my C variable names have any effect, performance-wise, on
my final executable program? I mean, once compiled, etc., is there any
difference between these two:
number = 3;
n = 3;

I know it's setting aside storage for the variable itself; does it also use
up more storage if the variable name is longer? I realize it would probably
take quite a lot of long variable names to make any impact if so, but I was
just curious if anyone knew...

No difference. The compiled code has no idea as to the original names of
variables.
 
raxitsheth2000

Hmm,

Would it make any difference whether I write extern int i; or extern int
this_is_also_another_i;?

Is there any relation between linking and the variable names in a compiled
module? (Does the linker search for the actual variable name to build the
binary, and in the case of dynamic linking, does it depend on the
variable/function name?) (Assume that the compiler's DEBUG flag is OFF and
the code is compiled and linked for optimized output.)

(I have to agree that whenever the debug option is enabled during
compilation, the file sizes are different.)


--raxit sheth
 
Walter Roberson

Please do not top-post. Now I have to go and edit your reply
to put it into a form suitable for conducting a discussion.

Would it make any difference whether I write extern int i; or extern int
this_is_also_another_i;?
Is there any relation between linking and the variable names in a compiled
module? (Does the linker search for the actual variable name to build the
binary, and in the case of dynamic linking, does it depend on the
variable/function name?) (Assume that the compiler's DEBUG flag is OFF and
the code is compiled and linked for optimized output.)

Anything along those lines is implementation specific.

On systems that use only 'static linking', the external
names are fully resolved by the linker and any persistence into the
executable would be only for debugging purposes. Thus on such
systems, any performance effect on the "final executable program"
would be limited to the ones previously discussed on this thread,
about possibly larger file storage for an unstripped executable.

On systems that allow run-time linking to pre-specified
shared libraries, the names to be linked against must be present
in the file that will undergo the final link. However, even
in the case of shared libraries, after the final link done by
the OS at the time the image is loaded for execution, the symbols
can be discarded. The performance impact of final linking against
shared libraries is highly OS dependent. We would probably be on
safe ground in assuming that the fewer symbols linked against,
the faster the final link, but even that is on shaky grounds as
the cost of doing a link depends on the number of times the
symbol address must be resolved, which could vary with optimization,
whether decreased (dead code elimination) or increased (loop
unrolling.) And if the symbol tables are kept in sorted order,
only an incremental search for the next symbol is needed. Then too,
counting just the number of symbols doesn't tell us anything about
how -many- shared libraries the system is going to look through to
resolve them all, and it doesn't tell us anything about the number
of extra symbols the shared library is going to pull in for use
in its code. You'd need a *lot* of information in order to
meaningfully analyze the cost of linking against shared
libraries.

Does a longer symbol name increase the cost of linking against a
run-time library? Not necessarily. The C89 and C99 standards
impose (different) lower bounds on the number of characters of
an identifier that must be considered significant for linking
purposes, but that implies that an implementation could use
fixed length fields to store symbol names for linking purposes.
You don't know unless you dig down into the implementation details.

Run time linking to code in other files ("dynamic linking")
is not a feature required (or mentioned) in the C standards, so
the impact on execution speed of using different lengths of
symbols is again difficult to predict. It does differ from the
pure shared library case in that the executable must either store
or compute the symbol name in order to present to the dynamic linker,
whereas with the pure shared library case after the final link the
symbols could be discarded, so in the dynamic linking case,
either the executable must be larger to store a longer string,
or else the run-time memory use would be larger (to dynamically
allocate a longer string)... but then again, the memory for the
symbol name could be allocated as a fixed length buffer as
an automatic variable so if the name were being computed it might
be a fixed amount of storage rather than a dynamic amount...
depends how you implement the name computation. The time to effect
a dynamic link is almost certainly much higher than any additional
time that might be involved if the symbol is longer... which
might be a really trivial time difference, if the implementation
does something like hash the symbol name and do the primary
symbol table lookup based on the hash value... And then too,
it could happen that for the longer symbol name, only a small
amount of code is brought in, but that for the shorter symbol name
that additional dynamic searching was necessary in order to bring
in code called by the referenced code...


Too much variability, too many possible implementations, too much
dependence on exactly what is linked.


The whole thing is about like asking whether the post office takes
longer to deliver a letter if the address is longer.
 
Mark McIntyre

On 28 Sep 2006 03:18:48 -0700, in comp.lang.c ,
Hmm,

Would it make any difference whether I write extern int i; or extern int
this_is_also_another_i;?

Is there any relation between linking and the variable names in a compiled module?

It makes no difference at all what you call your variables, provided
they don't break the length limits or uniqueness rules. C89 requires
identifiers to be unique within 6 characters, C99 within (?) 32. Most compilers
define a maximum variable name length.

(Does the linker search for the actual variable name to build the binary,
and in the case of dynamic linking, does it depend on the variable/function
name?) (Assume that the compiler's DEBUG flag is OFF and the code is
compiled and linked for optimized output.)

No, the compiler and linker work off addresses. The names have to be
the same in different modules, but only so that the compiler knows to
create pointers to the same address.

This is true even with dynamic linking. The only difference here is
that the name is matched at runtime instead of at build time. Again
the only requirement is that the two are the same, what they are is
irrelevant.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
 
jmcgill

Mark said:
C89 requires identifiers to be unique within 6 characters, C99 within (?) 32.

You are close, but only for external linkage.

C99 internal identifiers and macro names are significant to 63 chars,
external identifiers to 31 chars, 4095 chars in a logical source line.

C89 values are: internal identifiers and macro names: 31 chars, external
identifiers: 6 chars, and 509 chars in a logical source line.

Have any of these limits ever legitimately caused a problem for anyone
who was not intentionally abusing the grammar?



What about these? (Not sure what standard they refer to -- picked them
from a longer list at http://www-ccs.ucsd.edu/c/portable.html)


No more than 511 distinct names with external linkage within a
translation unit.

No more than 31 arguments in a function call.

No more than 31 nested pairs of parentheses in a declarator.

No more than 32 nested pairs of parentheses within an expression.

No more than 127 members in any one structure or union.

Nest structure or union definitions no more than fifteen deep in any one
list of member declarations.

I have probably relied on a violation of this one more than once:

String literal contains more than 509 characters or wide characters.
 
Mark McIntyre

You are close, but only for external linkage.

Bear in mind that "external" means outside the current module, which
is what the OP was asking about.

Have any of these limits ever legitimately caused a problem for anyone
who was not intentionally abusing the grammar?

Yes, I've had internal problems with 3rd party code being ported to a
different platform, where the original designers had created functions
called gruesome things like

ThisIsFunctionThirtyTwoVersionOneA
ThisIsFunctionThirtyTwoVersionOneB

and I've had external linkage problems with shorter names and older
compilers, especially when trying to link FORTRAN libraries.
 
