How do linkers work?


jacob navia

OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.

Obviously, the C standard doesn't mention this [1]. And we almost
never discuss it here.

Like many other things, this is an error because the linker
is an *essential* piece of the language. Without it, nothing
would ever work.

Separate compilation
--------------------
C supports the separate compilation of modules. Each module
is compiled into an independent object code file (.o in Unix,
or .obj under Microsoft) and those separate object files are
assembled into the executable by the "link editor" or linker
for short.

There are several standards for object file formats:
o The "ELF" format used in most Unix systems
o The COFF format used under 32-bit Windows
o The OMF format used by 16-bit DOS/Windows systems

and many others I do not know... :)

What matters for this discussion is what an object file
contains, seen from the language's point of view.

An object file contains:
o A symbol table of exported symbols
o Several "sections" of data.
o Relocation information
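
As a rough mental model, those three parts could be pictured in C as below. This is only a sketch: the struct and field names are invented for illustration and match neither ELF nor COFF, and the convention of marking a symbol the module needs but does not define with a negative section index is an assumption of this example.

```c
#include <stddef.h>

/* Hypothetical, simplified model of an object file.
   Real formats (ELF, COFF, OMF) differ in layout and detail. */

enum section_kind { SEC_CODE, SEC_DATA, SEC_BSS };

struct section {
    enum section_kind kind;
    const unsigned char *bytes; /* NULL for SEC_BSS: only a size is stored */
    size_t size;
};

struct symbol {
    const char *name;
    int section_index;  /* index into sections[], or -1 if not defined here */
    size_t offset;      /* position inside that section */
};

struct relocation {
    int section_index;  /* section whose bytes must be patched */
    size_t offset;      /* where to patch */
    int symbol_index;   /* which symbol's address to write there */
};

struct object_file {
    struct section *sections;
    size_t nsections;
    struct symbol *symbols;
    size_t nsymbols;
    struct relocation *relocations;
    size_t nrelocations;
};

/* Count the symbols this module refers to but does not define. */
size_t count_undefined(const struct object_file *obj)
{
    size_t n = 0;
    for (size_t i = 0; i < obj->nsymbols; i++)
        if (obj->symbols[i].section_index < 0)
            n++;
    return n;
}
```

A module that defines "function" and refers to "strcmp" would have two entries in `symbols`, and `count_undefined` would report one of them as still missing.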


Sections
--------
The "sections" are logical parts of the program that should be
assembled into the final executable. Basically we have three kinds
of sections:
1) The code section, i.e. the binary opcodes for the processor
2) The data section, i.e. the initialized tables, strings, or
numbers that are contained in the module
3) The non-initialized data section, which is basically just
size information: XXX bytes should be reserved for
non-initialized variables

For example:

#include <string.h>

int function(char *a)
{
    static int bss;    /* no initializer: goes in the non-initialized section */

    if (strcmp(a, "foobar"))
        return 42;
    else
        return 366554 + bss++;
}

In the code section we would have:
o The prologue code
o The call, the if, and the return with its
  epilogue code

In the data section we would have the "foobar" array
of characters followed by a zero, the number 42 and
the number 366554 in case the processor doesn't support
inlined integer constants. If the processor DOES support
inlined constant values (the x86 for example), the two
integer values would go in the code section.

The non-initialized section would contain sizeof(int)
bytes to hold the integer called "bss".
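
To make the three kinds of sections concrete, here is a small variation on the example with each object annotated with the section it would typically land in. The names `greeting`, `primes`, `call_count` and `lookup` are made up for this sketch, and the placement comments describe what common toolchains tend to do, not something the C standard requires.

```c
#include <string.h>

/* Initialized data: typically placed in a data section
   (often a read-only one, because of the const). */
static const char greeting[] = "foobar";

/* Initialized, writable table: data section. */
static int primes[3] = { 2, 3, 5 };

/* No initializer: typically only its size is recorded, in the
   non-initialized section, and it starts out as zero. */
static int call_count;

/* The machine code of this function goes in the code section. */
int lookup(const char *a)
{
    call_count++;
    if (strcmp(a, greeting) == 0)
        return primes[0];
    return primes[2] + call_count;
}
```

On many systems a tool such as `nm` or `objdump` will show where each of these names ended up, which is a quick way to check the comments against your own compiler.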

Relocations
-----------
The symbol "strcmp" is not defined in the module, and its
address is not known at compile time. The object module
contains just a record to indicate to the linker:

From: compiler
To: linker

Dear Linker:
Please fill at the offset 4877 in the code section, sizeof(void *)
bytes with the address of the symbol "strcmp".

Thanks in advance

Your compiler

The relocations can be much more complicated than that, but basically,
all of them are just that.
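
In code, honoring such a request amounts to copying an address into the section bytes at the given offset. The following is only a sketch: `struct reloc`, `apply_reloc`, and the use of a plain `uintptr_t` as the "address" are assumptions made for illustration, and real relocation types (PC-relative ones, for example) need more arithmetic than a straight copy.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical relocation record: "write the address of a symbol
   at this offset in the section". */
struct reloc {
    size_t offset;        /* where in the section to patch */
    const char *symbol;   /* name of the symbol whose address is needed */
};

/* Patch sizeof(uintptr_t) bytes of `section` with `address`.
   Returns 0 on success, -1 if the patch would run past the section. */
int apply_reloc(unsigned char *section, size_t section_size,
                const struct reloc *r, uintptr_t address)
{
    if (r->offset > section_size ||
        section_size - r->offset < sizeof address)
        return -1;
    memcpy(section + r->offset, &address, sizeof address);
    return 0;
}
```

The bounds check is there because a relocation record comes from a file, and a toy linker should not trust the offset blindly.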

The symbol table
----------------
The object module defines some symbols, and imports some symbols
from other modules. All those symbols are specified in the object
module symbol table. In some object code formats we find also
debug information records in the symbol table. In others,
the debug information is written into a separate section.

Libraries
---------
Static libraries are just a bunch of object code modules that
are stored into a single file for convenience reasons. They
are seen by the linker in the same way as many object files.

----------------------------------------------------------------------

With all this information, the linker goes through all object files
noting which symbols are defined in which module, which symbols are
required from one module and defined in another, until there are no
more object files or libraries. It then checks that all symbols are
defined (if not, it complains) and builds the executable.
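
The pass described above can be sketched as a toy resolver. Everything here is hypothetical (`struct module`, `is_defined`, `count_unresolved`): each module is reduced to two lists of names, and the "linker" only counts the required names that nobody defines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical view of one module: the names it defines
   and the names it requires from elsewhere. */
struct module {
    const char *const *defined;
    size_t ndefined;
    const char *const *required;
    size_t nrequired;
};

/* Return 1 if some module defines `name`, else 0. */
static int is_defined(const struct module *mods, size_t nmods,
                      const char *name)
{
    for (size_t m = 0; m < nmods; m++)
        for (size_t i = 0; i < mods[m].ndefined; i++)
            if (strcmp(mods[m].defined[i], name) == 0)
                return 1;
    return 0;
}

/* Count required symbols that no module defines. A real linker
   would report each one and refuse to build the executable. */
size_t count_unresolved(const struct module *mods, size_t nmods)
{
    size_t missing = 0;
    for (size_t m = 0; m < nmods; m++)
        for (size_t i = 0; i < mods[m].nrequired; i++)
            if (!is_defined(mods, nmods, mods[m].required[i]))
                missing++;
    return missing;
}
```

A real linker would use a hash table rather than nested loops, but the logic is the same: gather definitions, match them against requirements, and complain about what is left over.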

Linkers can be very complex beasts, like, for instance, the gnu "ld"
linker. This is a linker that features:
o A "link editor language", that allows you to change the
workings of the linker and describe your own executable
format...
o An apparent "machine independence" (what this means in
a linker is not obvious to me) that allows it to link
object modules from different formats...
o "BFD" (the Binary File Descriptor library), a kind of GNU
machine-independent layer over object file formats.

Other linkers, like lcc-win's for instance, are completely stupid beasts
that can only link the format generated by lcc-win and nothing else.
Obviously, the only thing *you* care about in a linker is how fast it is,
so in this sense, lcc-win is a better choice: it is quite fast. But
you pay the price: it can only link lcc-win's code...

In the next installment we will go in detail into the dark corners of
the linkers, specifically, the problems with symbol collision.
-------------
[1] The only mention of the linker in the standard comes when it
speaks about extended characters in identifiers:

<quote>

On systems in which linkers cannot accept extended characters, an
encoding of the universal character name may be used in forming valid
external identifiers. For example, some otherwise unused
character or sequence of characters may be used to encode the \u in a
universal character name. Extended characters may produce a long
external identifier.

<end quote>

Nowhere is the "linker" defined.
 

Richard Heathfield

jacob navia said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.

Obviously, the C standard doesn't mention this [1].

See 2.1.1.2(8) of C89 or 5.1.1.2(8) of C99.

<snip>
 

jacob navia

Richard said:
jacob navia said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.

Obviously, the C standard doesn't mention this [1].

See 2.1.1.2(8) of C89 or 5.1.1.2(8) of C99.

<snip>

5.1.1.2(8) says:
<quote>
All external object and function references are resolved. Library
components are linked to satisfy external references to functions and
objects not defined in the current translation. All such translator
output is collected into a program image which contains information
needed for execution in its execution environment.
<end quote>

Not really a specification of a linker!

It does NOT say:

1) What to do when several modules define the same symbol
2) What to do when a symbol has contradictory definitions
in different modules.

We will see what consequences those omissions have in the next
installment.
 

Richard Heathfield

jacob navia said:
Richard said:
jacob navia said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.

Obviously, the C standard doesn't mention this [1].

See 2.1.1.2(8) of C89 or 5.1.1.2(8) of C99.
Not really a specification of a linker!

It isn't intended to be.
It does NOT say:

1) What to do when several modules define the same symbol

The behaviour is undefined (see 6.2.2).
2) What to do when a symbol has contradictory definitions
in different modules.

The behaviour is undefined (see 6.7(4)).
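
As a purely illustrative aside: since the behaviour is undefined, an implementation may do anything here. Some linkers report a duplicate definition as an error, others quietly merge the definitions. A toy check of the first kind might look like the sketch below, where `struct definition` and `count_duplicates` are invented names and nothing is required by the standard.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical list of external definitions gathered from all modules. */
struct definition {
    const char *name;    /* external identifier */
    const char *module;  /* module that defines it */
};

/* Return the number of entries whose name was already defined
   by an earlier entry in the list. */
size_t count_duplicates(const struct definition *defs, size_t n)
{
    size_t dups = 0;
    for (size_t i = 1; i < n; i++)
        for (size_t j = 0; j < i; j++)
            if (strcmp(defs[i].name, defs[j].name) == 0) {
                dups++;
                break;  /* count each repeated entry once */
            }
    return dups;
}
```

Note that a check like this only sees names. It cannot tell whether two definitions of the same name also disagree about the type, which is the second case in the list.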
 

Keith Thompson

jacob navia said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.

Obviously, the C standard doesn't mention this [1]. And we almost
never discuss it here.

Like many other things, this is an error because the linker
is an *essential* piece of the language. Without it, nothing
would ever work.
[...]

Consider that all compiled languages depend on linkers just as much as
C does. If a post would be just as relevant to comp.lang.whatever as
it is to comp.lang.c, why post it to comp.lang.c rather than
comp.lang.c++ or comp.lang.fortran?
 

Tony Giles

Keith said:
[...]

Consider that all compiled languages depend on linkers just as much as
C does. If a post would be just as relevant to comp.lang.whatever as
it is to comp.lang.c, why post it to comp.lang.c rather than
comp.lang.c++ or comp.lang.fortran?

Sorry Keith, I don't really understand what you are saying here. Are you
saying that subjects like this should be cross posted to
comp.lang.whatever (and risk the wrath of the usual suspects) or not be
posted at all?

I am one year into programming now and hope to make a living out of it
someday. Personally speaking I have zero experience with other languages
outside of C and hence have found Jacob's recent posts about stacks,
debugging and linking very informative - if he had posted them to
comp.lang.fortran or wherever I would have missed them.

I have learned one hell of a lot here recently just by browsing (when I
get completely stuck I'm sure I'll be asking questions) but have been
getting increasingly pissed off with the "off topic" brigade and the
(seeming) mob of regulars who just bitch about one another.

To all: give a thought for us learners. I for one am here for an
education - in the art of C and programming in general. For me, topics
as mentioned before are very much on topic. Maybe I should trawl through
comp.lang.endless but I'd rather the one stop on what I am learning!
 

Flash Gordon

Tony Giles wrote, On 24/03/08 07:23:
Keith said:
[...]

Consider that all compiled languages depend on linkers just as much as
C does. If a post would be just as relevant to comp.lang.whatever as
it is to comp.lang.c, why post it to comp.lang.c rather than
comp.lang.c++ or comp.lang.fortran?

Sorry Keith, I don't really understand what you are saying here. Are you
saying that subjects like this should be cross posted to
comp.lang.whatever (and risk the wrath of the usual suspects) or not be
posted at all?

Keith is pointing out it is not really topical here. There is
comp.lang.programming for general programming, comp.lang.misc (which I've
not looked into), comp.compilers, OS-specific groups etc.
I am one year into programming now and hope to make a living out of it
someday. Personally speaking I have zero experience with other languages
outside of C and hence have found Jacob's recent posts about stacks,
debugging and linking very informative - if he had posted them to
comp.lang.fortran or wherever I would have missed them.

That does not make them topical here. See above for suggestions of other
groups where they could be topical.
I have learned one hell of a lot here recently just by browsing (when I
get completely stuck I'm sure I'll be asking questions) but have been
getting increasingly pissed off with the "off topic" brigade and the
(seeming) mob of regulars who just bitch about one another.

So you are pissed off with the people who point out when something is
off topic but *not* with the people who come back and say (often with
insults) that it is on topic?
To all: give a thought for us learners. I for one am here for an
education - in the art of C and programming in general. For me, topics
as mentioned before are very much on topic. Maybe I should trawl through
comp.lang.endless but I'd rather the one stop on what I am learning!

You will be looking for a job at some point, should job applications be
topical here? If you want to learn about other topics you have to look
in other places, just as you need multiple text books, where is the
problem with that as the other places *do* exist?
 

Ben Bacarisse

Tony Giles said:
Keith said:
[...]

Consider that all compiled languages depend on linkers just as much as
C does. If a post would be just as relevant to comp.lang.whatever as
it is to comp.lang.c, why post it to comp.lang.c rather than
comp.lang.c++ or comp.lang.fortran?

Sorry Keith, I don't really understand what you are saying here. Are
you saying that subjects like this should be cross posted to
comp.lang.whatever (and risk the wrath of the usual suspects) or not
be posted at all?

I think he is proposing a test for topicality: if it is topical only
here post it here, otherwise post it to a single more suitable group
(where one exists). Cross-posting should be a last resort and should be
done to the very smallest possible set of groups.
I am one year into programming now and hope to make a living out of it
someday. Personally speaking I have zero experience with other
languages outside of C and hence have found Jacob's recent posts about
stacks, debugging and linking very informative - if he had posted them
to comp.lang.fortran or wherever I would have missed them.

They belong in comp.programming -- a group that would have benefited
from a lively discussion of various approaches to debugging.
For me, topics
as mentioned before are very much on topic. Maybe I should trawl
through comp.lang.endless but I'd rather the one stop on what I am
learning!

You should probably add comp.programming. comp.lang.c can't include
everything that you should be learning about.
 

Richard Heathfield

Tony Giles said:

I am one year into programming now and hope to make a living out of it
someday. Personally speaking I have zero experience with other languages
outside of C and hence have found Jacob's recent posts about stacks,
debugging and linking very informative - if he had posted them to
comp.lang.fortran or wherever I would have missed them.

How can you tell whether the information he has presented is authoritative?
If such articles are posted in the kind of group where they are topical,
they stand a much higher chance of getting proper peer review.

To all: give a thought for us learners.

We do. We assume you come here to learn more about C, and as a group we
provide an astoundingly authoritative resource - on C. Not on debuggers,
linkers, stacks, and the like. If you want peer-reviewed, authoritative
articles on those subjects, you'd be better off finding a group where
debuggers, linkers, and stacks are topical, because that's where you're
most likely to find the experts.
 

Eric Sosman

jacob said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.

Obviously, the C standard doesn't mention this [1]. And we almost
never discuss it here.
[...]

Your copy of the Standard must be incomplete. It appears
you haven't seen sections 5.1.1.1, 5.1.1.2, 6.2.2, and 6.9.
 

santosh

Richard said:
Tony Giles said:



How can you tell whether the information he has presented is
authoritative? If such articles are posted in the kind of group where
they are topical, they stand a much higher chance of getting proper
peer review.



We do. We assume you come here to learn more about C, and as a group
we provide an astoundingly authoritative resource - on C. Not on
debuggers, linkers, stacks, and the like. If you want peer-reviewed,
authoritative articles on those subjects, you'd be better off finding
a group where debuggers, linkers, and stacks are topical, because
that's where you're most likely to find the experts.

The problem is that there seems to be no general group on Usenet for
linkers, debugging etc. However, I agree that they are more topical in
groups like comp.programming or comp.misc than in groups like c.l.c or
c.l.fortran.
 

Richard

santosh said:
The problem is that there seems to be no general group on Usenet for
linkers, debugging etc. However, I agree that they are more topical in
groups like comp.programming or comp.misc than in groups like c.l.c or
c.l.fortran.

Nonsense.

Basic common sense would tell any experienced IT person that comp.lang.c
would be full of C programmers who hopefully have a view on the tools of
the trade - namely the things you mention above.

If you (not you personally) do not wish to help - then don't. Leave it
to those of us who will.
 

ymuntyan

jacob said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.
Obviously, the C standard doesn't mention this [1]. And we almost
never discuss it here.
[...]

Your copy of the Standard must be incomplete. It appears
you haven't seen sections 5.1.1.1, 5.1.1.2, 6.2.2, and 6.9.

He *is* right. Try to quote more than one place which mentions
a *linker*.

Of course there is a good reason for not mentioning linkers:
they are not mandatory, and translation phases are just a model.
The phases need not be separate, and there need not be a linker,
preprocessor or whatever. But, you know, there is
no stack either ;)

Yevgen
 

John Bode

Keith said:
Consider that all compiled languages depend on linkers just as much as
C does. If a post would be just as relevant to comp.lang.whatever as
it is to comp.lang.c, why post it to comp.lang.c rather than
comp.lang.c++ or comp.lang.fortran?

Sorry Keith, I don't really understand what you are saying here. Are you
saying that subjects like this should be cross posted to
comp.lang.whatever (and risk the wrath of the usual suspects) or not be
posted at all?

Linkers are defined by the platform, not the language; there is no
C-specific linker (AFAIK), just like there is no C++-specific linker or
Fortran-specific linker or Pascal-specific linker (am I sufficiently
showing my age?). The same linker can be used for all of the above;
in fact, most linkers can link modules compiled from different
languages. So any discussion of a particular linker is orthogonal to
discussion of a particular language. Not to mention that there are C
interpreters available, which eliminate linkers from the process
completely.

Keith's point is that linker discussions are just as topical in c.l.f
or c.l.c++ as they are here, which is not very. Again, the linker is
defined by the platform, not the language; as Jacob points out, object
file formats vary (ELF vs. COFF vs. OMF), linker parameters and
operations vary (knowing how to drive ld doesn't tell you much about
driving other linkers). And *none* of it is defined by, or really
relevant to, the C *language*, which is allegedly the focus of this
group, much as Jacob and a few others would like to believe otherwise.
I am one year into programming now and hope to make a living out of it
someday. Personally speaking I have zero experience with other languages
outside of C and hence have found Jacob's recent posts about stacks,
debugging and linking very informative - if he had posted them to
comp.lang.fortran or wherever I would have missed them.

The danger is believing that Jacob's descriptions are universally
applicable; they are not. As others have pointed out, there are
implementations of C that do not rely on a hardware stack. There's a
disease common to C programmers; in the '80s it was called "All The
World's A VAX" syndrome (today it would be called "All The World's A
Windows/x86 PC" syndrome). It's the inability of many programmers to
distinguish between behavior mandated by the language definition vs.
behavior that's a result of the particular implementation; i.e., they
don't know where the language ends and the platform begins. It's the
source of a lot of bad habits that plague the industry. And Jacob's
articles, while informative, run the danger of perpetuating the
problem.

Speaking as someone who's had to support multiple platforms
concurrently (try getting something to work on five flavors of Unix
*and* Windows *and* OS/2 *and* AS/400 sometime), knowing where the
language ends and the platform begins is a vital skill to have.
I have learned one hell of a lot here recently just by browsing (when I
get completely stuck I'm sure I'll be asking questions) but have been
getting increasingly pissed off with the "off topic" brigade and the
(seeming) mob of regulars who just bitch about one another.

First of all, it's unreasonable to expect this group (or any other) to
cover any conceivable topic that's tangentially related to C
programming. The reason most of us try to redirect people is that a)
there are other newsgroups focused on their particular issue, so that
they'll get more and better quality help in those newsgroups, and b)
by trying to keep the discussion here focused on a single topic, this
group can cultivate a similar depth of expertise. I'd rather this
group be an inch wide and a mile deep than vice-versa. Sort of like
the difference between K&R2 and anything by Schildt.

Yes, some of us are a bit surly. That's unfortunate, but it's one of
the occupational hazards of writing C code for a living (that and the
permanent brain damage). If you don't like what those people have to
say, well, that's what killfiles are for. It's a lesson that has
taken me longer than it should have to learn, but life's too short to
get pissed off over what someone said on the internet.
To all: give a thought for us learners. I for one am here for an
education - in the art of C and programming in general. For me, topics
as mentioned before are very much on topic. Maybe I should trawl through
comp.lang.endless but I'd rather the one stop on what I am learning!

Again, think of it in terms of reference manuals; I have a book on
multithreaded programming on Windows in C++ somewhere. Does this
manual cover everything about C++ *and* Windows *and* threads? No.
It covers enough of C++ and Windows to provide context, but it won't
answer any questions about those topics in any depth. If you get
stuck on a C++ concept, you go to a different manual.

That's kind of how you need to look at technical newsgroups.
 

Richard Tobin

John Bode said:
Linkers are defined by the platform, not the language;

No, by both.
there is no C-specific linker (AFAIK), just like there is no
C++-specific linker or
Fortran-specific linker or Pascal-specific linker (am I sufficiently
showing my age?). The same linker can be used for all of the above;

Obviously a sufficiently complicated linker can be used for
everything. But until quite recently most linkers that were adequate
for C were *not* adequate for C++. Gcc's C++ for example had an
associated program "collect2" that provided the linkage features
absent in typical unix linkers (in particular, if I understand
correctly, support for initialisation by C++ constructors).
So any discussion of a particular linker is orthogonal to
discussion of a particular language.

No. C imposes certain requirements on a linker which may be more than
those required for some other languages and less than those required
for others.

-- Richard
 

Eric Sosman

jacob said:
OK, after the stack and the debuggers, let's look a little bit
more in depth into this almost ignored piece of the language,
the linker.
Obviously, the C standard doesn't mention this [1]. And we almost
never discuss it here.
[...]
Your copy of the Standard must be incomplete. It appears
you haven't seen sections 5.1.1.1, 5.1.1.2, 6.2.2, and 6.9.

He *is* right. Try to quote more than one place which mentions
a *linker*.

<Shrug.> The Standard doesn't mention a multiplier,
either, but that doesn't interfere with its defining of
multiplication operators. The Standard tries very hard
to describe *what* must occur without describing *how*
the occurrence is brought about; in modern jargon, it
describes the "interface" and not the "implementation."

The Standard requires that identical identifiers with
external linkage must refer to the same object or function
throughout the execution of the program. The Standard does
not specify any particular technique to make this happen;
it just describes the required outcome. Likewise, it says
that translation units can be translated separately and
later combined into a single program; the implementation
can use any technique it finds suitable, so long as the result
meets the Standard's requirements.

The sections of the Standard that deal with these
matters, taken together, amount to a high-level requirements
specification for a "linker," and that's all a user of the
language needs. An implementor needs considerably more, of
course -- but then, the implementor also has to worry about
how to recognize and remove comments, how to perform macro
substitution, how to get stdin to be opened before main()
starts, and a mighty host of other important details. There
is no reason to single out the link-editing process any more
than to single out the mechanisms of malloc().
 

lawrence.jones

Tony Giles said:
Sorry Keith, I don't really understand what you are saying here. Are you
saying that subjects like this should be cross posted to
comp.lang.whatever (and risk the wrath of the usual suspects) or not be
posted at all?

I think he's saying they should be posted in a generic programming group
(like comp.programming).

-Larry Jones

I must have been delirious from having so much fun. -- Calvin
 

robertwessel2

What is important in the context of this discussion, is
what is inside from the language viewpoint.

An object file contains:
o A symbol table of exported symbols
o Several "sections" of data.
o Relocation information


You keep presenting this stuff as absolute, when it's not.

I compiled a C program with MSVC and the resulting object file has
*none* of those things in it, except in an indirect sort of way. I
happened to use /GL, for link time code generation, which results in
an intermediate representation (basically whatever parse or DAG tree
MS uses internally) being stored in the "object" file, and the back
end of the compiler is run by the linker on the whole collection of
partially compiled programs (which allows the back end to do
considerable inter-translation unit optimization). In fact the
"object" file contains the fully qualified path to c2.dll.

Nor is link-time code generation uncommon.
 

jacob navia

You keep presenting this stuff as absolute, when it's not.

I compiled a C program with MSVC and the resulting object file has
*none* of those things in it, except in an indirect sort of way.

You are dreaming. Suppose the following program:

#include <stdio.h>
#ifdef __LCC__
#include <stdint.h>
#else
typedef unsigned __int64 uint64_t;
#endif
#include <stdlib.h>

int main(void)
{
    int intLoop; //Variable Just For Looping
    uint64_t UINT64_Test; //uint64_t variable
    uint64_t *UNIT64_Array_Test; //uint64_t 1 dimensional array

    //allocate 1 element for our array
    UNIT64_Array_Test = (uint64_t *) calloc(1,sizeof(uint64_t));

    //initialise variable and our array element
    UINT64_Test = 4294967295;
    UNIT64_Array_Test[0] = 4294967295;

    //Output to confirm variable + our array element populated OK
    printf("UINT64_Test = %llu\n",UINT64_Test);
    printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

    //Increment both the array and the variable (the loop bound here is 1)
    for(intLoop=0;intLoop < 1;intLoop++){
        UINT64_Test++;
        UNIT64_Array_Test[0]++;
    }

    //Output the results
    printf("UINT64_Test = %llu\n",UINT64_Test);
    printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

    //Secondary test... confirm array is actually capable of holding data...
    UNIT64_Array_Test[0] = 1234567890123456789;
    printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

    //and add one to the new large value...
    UNIT64_Array_Test[0]++;
    printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

    return 0;
}
---------------------------------------------------------------------------
My utility "pedump" shows the contents of object files.
D:\lcc\mc71\test>pedump /summary tll.obj
tll.obj 1513 bytes, linked Mon Mar 24 20:25:53 2008

Section Name       Size
    01   .drectve   47
    02   .debug$S   104
    03   .data      173
    04   .text      270
    05   .pdata     12
    06   .xdata     8

Sections "data" and "text" (code) are clearly present.
I can see the symbol table:


Symbol Table - 25 entries of 18 bytes each (* = auxiliary symbol)
Indx Name                 Value    Section    cAux  Type    Storage
---- -------------------- -------- ---------- ----- ------- --------
0000 @comp.id             07194151 sect:ABS   aux:0 type:00 st:STATIC
0001 .drectve             00000000 sect:1     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00047  Relocs: 00000  LineNums: 00000
0003 .debug$S             00000000 sect:2     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00104  Relocs: 00000  LineNums: 00000
0005 .data                00000000 sect:3     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00173  Relocs: 00000  LineNums: 00000
0007 $SG3422              00000000 sect:3     aux:0 type:00 st:STATIC
0008 $SG3423              00000024 sect:3     aux:0 type:00 st:STATIC
0009 $SG3427              00000056 sect:3     aux:0 type:00 st:STATIC
0010 $SG3428              00000080 sect:3     aux:0 type:00 st:STATIC
0011 $SG3429              00000112 sect:3     aux:0 type:00 st:STATIC
0012 $SG3430              00000144 sect:3     aux:0 type:00 st:STATIC
0013 .text                00000000 sect:4     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00270  Relocs: 00013  LineNums: 00000
0015 main                 00000000 sect:4     aux:0 type:32 st:EXTERNAL
0016 .pdata               00000000 sect:5     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00012  Relocs: 00003  LineNums: 00000
0018 $pdata$main          00000000 sect:5     aux:0 type:00 st:STATIC
0019 .xdata               00000000 sect:6     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00008  Relocs: 00000  LineNums: 00000
0021 $unwind$main         00000000 sect:6     aux:0 type:00 st:STATIC
0022 printf               00000000 sect:UNDEF aux:0 type:32 st:EXTERNAL
0023 calloc               00000000 sect:UNDEF aux:0 type:32 st:EXTERNAL
0024 $LN6                 00000000 sect:4     aux:0 type:00 st:LABEL

String Table
------------
Length: 29(d)
[ 1 4 16] "$pdata$main"
[ 2 16 29] "$unwind$main"


And I can see the relocations

Section 04 (.text) relocations

Address Type   Symbol Index Symbol Name
------- ------ ------------ -----------
     15 ???_4            23 calloc
     55 ???_4             7 $SG3422
     60 ???_4            22 printf
     75 ???_4             8 $SG3423
     80 ???_4            22 printf
    156 ???_4             9 $SG3427
    161 ???_4            22 printf
    176 ???_4            10 $SG3428
    181 ???_4            22 printf
    214 ???_4            11 $SG3429
    219 ???_4            22 printf
    254 ???_4            12 $SG3430
    259 ???_4            22 printf

You are just spreading MISINFORMATION.


I happened to use /GL, for link time code generation, which results in
an intermediate representation (basically whatever parse or DAG tree
MS uses internally) being stored in the "object" file, and the back
end of the compiler is run by the linker on the whole collection of
partially compiled programs (which allows the back end to do
considerable inter-translation unit optimization). In fact the
"object" file contains the fully qualified path to c2.dll.

This is a CONFIRMATION of what I am saying!

Object files can be generated at link time, but conceptually
they contain the same stuff.

There are versions of lcc-win that do not generate any object file
or even executable file, they generate code "on the fly" and execute
it immediately.

So What?

Are you going to argue then that "executable files" do not exist?

Nor is link-time code generation uncommon.

It doesn't matter from the standpoint of the conceptual
framework we are discussing.
 

robertwessel2

You keep presenting this stuff as absolute, when it's not.
I compiled a C program with MSVC and the resulting object file has
*none* of those things in it, except in an indirect sort of way.  

You are dreaming. Suppose the following program:

#include <stdio.h>
#ifdef __LCC__
#include <stdint.h>
#else
typedef unsigned __int64 uint64_t;
#endif
#include <stdlib.h>

int main()
{

         int intLoop; //Variable Just For Looping
         uint64_t UINT64_Test; //uint64_t variable
         uint64_t *UNIT64_Array_Test; //uint64_t 1 dimensional array

         //allocate 1 element for out array
         UNIT64_Array_Test = (uint64_t *) calloc(1,sizeof(uint64_t));

         //initialise variable and our array element
         UINT64_Test = 4294967295;
         UNIT64_Array_Test[0] = 4294967295;

         //Output to confirm variable + our array element populated OK
         printf("UINT64_Test = %llu\n",UINT64_Test);
         printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

         //Increment both the array and the variable... 100000 times
         for(intLoop=0;intLoop < 1;intLoop++){
                 UINT64_Test++;
                 UNIT64_Array_Test[0]++;
         }

         //Output the results
         printf("UINT64_Test = %llu\n",UINT64_Test);
         printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

         //Secondary test... confirm array is actually capable of holding data...
         UNIT64_Array_Test[0] = 1234567890123456789;
         printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

         //and add one to the new large value...
         UNIT64_Array_Test[0]++;
         printf("UNIT64_Array_Test[0] = %llu\n",UNIT64_Array_Test[0]);

         return 0;
}

---------------------------------------------------------------------------
My utility "pedump" shows the contents of object files.
D:\lcc\mc71\test>pedump /summary tll.obj
tll.obj 1513 bytes, linked Mon Mar 24 20:25:53 2008

Section Name       Size
    01   .drectve   47
    02   .debug$S   104
    03   .data      173
    04   .text      270
    05   .pdata     12
    06   .xdata     8

Sections "data" and "text" (code) are clearly present.
I can see the symbol table:

Symbol Table - 25 entries of 18 bytes each (* = auxiliary symbol)
Indx Name                 Value    Section    cAux  Type    Storage
---- -------------------- -------- ---------- ----- ------- --------
0000 @comp.id             07194151 sect:ABS   aux:0 type:00 st:STATIC
0001 .drectve             00000000 sect:1     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00047  Relocs: 00000  LineNums: 00000
0003 .debug$S             00000000 sect:2     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00104  Relocs: 00000  LineNums: 00000
0005 .data                00000000 sect:3     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00173  Relocs: 00000  LineNums: 00000
0007 $SG3422              00000000 sect:3     aux:0 type:00 st:STATIC
0008 $SG3423              00000024 sect:3     aux:0 type:00 st:STATIC
0009 $SG3427              00000056 sect:3     aux:0 type:00 st:STATIC
0010 $SG3428              00000080 sect:3     aux:0 type:00 st:STATIC
0011 $SG3429              00000112 sect:3     aux:0 type:00 st:STATIC
0012 $SG3430              00000144 sect:3     aux:0 type:00 st:STATIC
0013 .text                00000000 sect:4     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00270  Relocs: 00013  LineNums: 00000
0015 main                 00000000 sect:4     aux:0 type:32 st:EXTERNAL
0016 .pdata               00000000 sect:5     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00012  Relocs: 00003  LineNums: 00000
0018 $pdata$main          00000000 sect:5     aux:0 type:00 st:STATIC
0019 .xdata               00000000 sect:6     aux:1 type:00 st:STATIC
      * Section: 0000  Len: 00008  Relocs: 00000  LineNums: 00000
0021 $unwind$main         00000000 sect:6     aux:0 type:00 st:STATIC
0022 printf               00000000 sect:UNDEF aux:0 type:32 st:EXTERNAL
0023 calloc               00000000 sect:UNDEF aux:0 type:32 st:EXTERNAL
0024 $LN6                 00000000 sect:4     aux:0 type:00 st:LABEL

String Table
------------
Length: 29(d)
[  1     4    16] "$pdata$main"
[  2    16    29] "$unwind$main"

And I can see the relocations:

Section 04 (.text) relocations

Address  Type    Symbol Index Symbol Name
-------  ----    ------------ ----- ----
15       ???_4          23   calloc
55       ???_4           7   $SG3422
60       ???_4          22   printf
75       ???_4           8   $SG3423
80       ???_4          22   printf
156      ???_4           9   $SG3427
161      ???_4          22   printf
176      ???_4          10   $SG3428
181      ???_4          22   printf
214      ???_4          11   $SG3429
219      ???_4          22   printf
254      ???_4          12   $SG3430
259      ???_4          22   printf

You are just spreading MISINFORMATION.

I happened to use /GL, for link time code generation, which results in
an intermediate representation (basically whatever parse or DAG tree
MS uses internally) being stored in the "object" file, and the back
end of the compiler is run by the linker on the whole collection of
partially compiled programs (which allows the back end to do
considerable inter-translation unit optimization).  In fact the
"object" file contains the fully qualified path to c2.dll.

This is a CONFIRMATION of what I am saying!

Object files can be generated at link time, but conceptually
they contain the same stuff.

There are versions of lcc-win that do not generate any object file
or even executable file, they generate code "on the fly" and execute
it immediately.

So What?

Are you going to argue then that "executable files" do not exist?
Nor is link-time code generation uncommon.

It doesn't matter from the standpoint of the conceptual
framework we are discussing.


Oh, it's a *conceptual framework.*

So those ".obj" files the beginning user, who you claim you're
attempting to educate, sees are just artifacts of the implementation,
and are not the "real" object files, which may, in fact, never
physically exist at all? You mean like the use of a hardware stack is
an artifact of a particular implementation, commonly used to implement
the LIFO nesting of activation records that is required by the
language?

And even if we accept this notion of a conceptual framework, the object
file you're describing doesn't typically exist on a translation-unit
basis, rather, the "internal" object file that's created is typically
a composite of *all* the translation units.

If you were to throw a few qualifiers into the mix, there would likely
be many fewer objections to what you're posting.

As to executable files, I'm not sure they need to exist. A compiled
program burned into a ROM probably doesn't meet many people's
definition of an executable file, nor does the ROM image file,
although both of those can be debated. I've also used systems, not
with C, but other compiled languages, where the linker had a "link-and-
run" option, where it would execute the program immediately upon
linking, and would never write anything like a traditional executable
to disk.

And why, oh, why, did you feel compelled to post an example of the
data in an object file when compiled *without* link-time code
generation, when I specifically said that it was *with* link-time code
generation that the traditional object file information was not present?
 
