What does linking library components mean in C++

Steven T. Hatton

I just read this in the description of how C++ is supposed to be
implemented:

"All external object and function references are resolved. Library
components are linked to satisfy external references to functions and
objects not defined in the current translation. All such translator output
is collected into a program image which contains information needed for
execution in its execution environment."

What I'm wondering is what exactly I'm supposed to understand that to mean.
Typically when I think of linking, it means arranging different parts of
the object code so that symbols can be resolved to relative memory
locations within the resulting assembly. The standard doesn't give a clear
definition of "library", nor does it give a very clear notion of "linking".
How do others read that paragraph?
 
Julián Albo

Steven said:
The standard doesn't give a clear definition of "library", nor does
it give a very clear notion of "linking". How do others read that
paragraph?

The standard does not care about that. Each operating system and family of
tools can have its own notions about how linking is done and how libraries
are stored and handled.

You can view a library, in the linking sense, as just a collection of object
files stored together in some way.

Code can be linked by resolving symbols to relative or absolute addresses,
or to references to symbols in dynamically loaded libraries. Or symbols can
be left unresolved to any kind of address, with resolution delayed until
load time or run time.
 
Pete Becker

Steven said:
I just read this in the description of how C++ is supposed to be
implemented:

That's not how it's supposed to be implemented. That's the compilation
model that the standard is based on. Different compilation models are
perfectly acceptable, as long as they produce the same results. For
example, many compilers do preprocessing as an integral part of the rest
of the compilation, not as a separate phase.
"All external object and function references are resolved. Library
components are linked to satisfy external references to functions and
objects not defined in the current translation. All such translator output
is collected into a program image which contains information needed for
execution in its execution environment."

What I'm wondering is what exactly I'm supposed to understand that to mean.
Typically when I think of linking, it means arranging different parts of
the object code so that symbols can be resolved to relative memory
locations within the resulting assembly. The standard doesn't give a clear
definition of "library", nor does it give a very clear notion of "linking".
How do others read that paragraph?

It's deliberately vague, to avoid unnecessary constraints. Your reading
is a reasonable one, and it's how most implementations handle it.
 
Steven T. Hatton

Pete said:
That's not how it's supposed to be implemented. That's the compilation
model that the standard is based on. Different compilation models are
perfectly acceptable, as long as they produce the same results. For
example, many compilers do preprocessing as an integral part of the rest
of the compilation, not as a separate phase.

That was a bad choice of wording on my part. I stand corrected. It is a
description of an abstract machine which defines the behaviors required of
an implementation, except where the Standard specifically states that the
implementation may deviate. If I read the wording correctly, it is possible
for an implementation to execute the same program with the same starting
conditions, and produce different results each run. It is for this reason
that the abstract machine is said to be non-deterministic.

I guess that has to do with things such as the order of evaluation of
expressions in a function call argument list. Stroustrup commented on this
with words similar to 'I've been told this is done for purposes of
performance.' Which makes me wonder if he really buys the argument. I
wonder what the cost/benefit ratio is for that.

http://www.gotw.ca/gotw/056.htm
It's deliberately vague, to avoid unnecessary constraints. Your reading
is a reasonable one, and it's how most implementations handle it.

What I'm trying to get at is what the abstract notions of linking and
library are. I guess what the paragraph essentially means is that the
processor should have some means of retrieving or modifying the value of an
externally referenced object, or calling an externally referenced function.
This linking could be done explicitly and immutably in the final phase of
the described translation, or it could be delayed until some later time,
provided sufficient information is stored to resolve the linkage in the
future. That could entail anything from dynamically loading and linking, to
communicating with another process on a remote system which executes the
function, or provides the variable.
 
Pete Becker

Steven said:
What I'm trying to get at is what the abstract notions of linking and
library are.

There is no abstract notion. These terms refer to whatever the
implementation does. Again: it's deliberately vague. Don't try to force
a detailed meaning on it.
 
Steven T. Hatton

Pete said:
There is no abstract notion. These terms refer to whatever the
implementation does. Again: it's deliberately vague. Don't try to force
a detailed meaning on it.

I disagree: there is an abstract notion communicated in the paragraph. To
force a detailed meaning on it would be to misunderstand the abstraction.
See the signature. Abstraction only happens when there are multiple
examples of the underlying pattern. Therefore, it is useful to formulate
thought experiments to examine which circumstances actually satisfy the
essential conditions to fit the abstraction class. It's an iterative
process of refinement. What is abstracted from the concrete example can be
designated as a form of symmetry. See, for example, Hermann Weyl's
_Symmetry_.

The authors of the paragraph I quoted probably did not have concepts of
object oriented programming in mind, and may well not have had dynamic
linking in mind at all. I suspect that when it was written, and in the
context it was written, it was not considered intentionally vague, it just
stated facts as they were.

Consider the possibility of modules which are similar to classes in their
protection mechanism, and that they can be instantiated once per program,
i.e., singletons. They would however share some characteristics of
namespaces, such as support for 'using' directives and declarations. They
would be primarily intended to group related program units into dynamically
loadable libraries. Their names could be used to avoid ODR problems, as
could their data hiding features. Having the instantiability of a class
would enable the use of RAII to perform load-time initialization such as
starting a thread, loading data, obtaining resources, etc, and unload-time
shutdown actions, such as stopping the thread, saving data, freeing
resources. Of course some of the resources might be other modules. Since
these would present a clearly defined interface to their external
environment, they would probably serve well as distributed modules in the
sense of EJB, or CORBA.

But I repeat myself.

You might find this interesting. It's a proposal to actually change the
wording of the paragraph in the C++ Standard to better address issues of
creating dynamic linking. Obviously _someone_ thought the wording of that
paragraph is important, and has bearing on the topic.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1496.html#Program Model
 
Zorro

You might find this interesting. It's a proposal to actually change the
wording of the paragraph in the C++ Standard to better address issues of
creating dynamic linking. Obviously _someone_ thought the wording of that
paragraph is important, and has bearing on the topic.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1496.html#Pr...

Yes, it was quite interesting. Thanks.

In response, I wrote the following blog post on the history of linkage.

http://distributed-software.blogspot.com/2005/07/linkage-mechanisms-and-z.html

I did so to avoid confusing "Z++ as a product" with "Z++ as an
abstraction for research". Researchers can see how C++ can be extended
towards a more abstract system programming language, with almost no
learning curve for a C++ engineer. The key is the mentality that "we do
not think about some one or more particular things".

Some of my views on abstract software development languages are:

http://distributed-software.blogspot.com/2005/07/pure-research-needs-abstract.html
http://distributed-software.blogspot.com/2005/07/abstract-language-is-indispensable-for.html
http://distributed-software.blogspot.com/2005/06/teaching-calculus-in-pseudo-code.html

Since these will result in discussions towards correcting and extending
C++, I also created a user group for such discussions. That way, we can
discuss future issues without bothering those who need coding help
based on current standard. Admittedly, mentioning any flaw in the
standard seems like a religious matter, when posted in this group.

http://groups-beta.google.com/group/computerlangZ?lnk=li&hl=en

I certainly appreciate the issues that you bring up. That was the
encouragement to create the new group. I hope my posting is in no way
offensive. It is just a reaction to events.

BTW, I cannot release the source code due to the underlying theory in
the design of the virtual processor. But the binaries will always be
freely available to the public.

Regards,
Z.
 
Pete Becker

Steven said:
You might find this interesting. It's a proposal to actually change the
wording of the paragraph in the C++ Standard to better address issues of
creating dynamic linking. Obviously _someone_ thought the wording of that
paragraph is important, and has bearing on the topic.

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1496.html#Program Model

You might find it interesting to check the name of the author of that
paper. And note that I didn't say that the wording of those paragraphs
isn't important. I said that it's vague, and deliberately so.
 
Steven T. Hatton

Pete said:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2003/n1496.html#Program Model

You might find it interesting to check the name of the author of that
paper. And note that I didn't say that the wording of those paragraphs
isn't important. I said that it's vague, and deliberately so.

I'm not sure what the history of the paragraph is. It seems to have
originated in ANSI C, or perhaps even earlier. The documentation that
descended from USL's documentation and implementation goes into extensive
detail in discussing the meaning of that paragraph. The wording may have
become vague over the years due to the changing context. It is probably
the case that it has been intentionally left unspecific in subsequent
revisions of the document.

One more thought on the original paragraph:

"I have reluctantly come to accept that some system-related issues would
have been better handled within C++. System-related issues, such as
dynamic linking of classes and interface evolution do not logically belong
in a language and language-based solutions are not preferable on technical
grounds. However, the language provides the only common forum in which a
truly standard solution can become accepted."

This may be one consequence of leaving things vague:

//=============================================================================
/**
* @file streams.h
*
* streams.h,v 1.34 2005/05/17 13:06:22 ******** Exp
*
* @author ******* ********
*
* This file contains the portability ugliness for the Standard C++
* Library. As implementations of the "standard" emerge, this file
* will need to be updated.
*
* This files deals with the streams includes.
*
*
*/
//=============================================================================
http://www.cs.wustl.edu/~schmidt/ACE_wrappers/ace/Global_Macros.h

To be fair, the ACE developers acknowledge the amount of macro hackery is
less than ideal. They claim it is/was necessary for portability between
"implementations".
 
Pete Becker

Steven said:
I'm not sure what the history of the paragraph is. It seems to have
originated in ANSI C, or perhaps even earlier. The documentation that
descended from USL's documentation and implementation goes into extensive
detail in discussing the meaning of that paragraph. The wording may have
become vague over the years due to the changing context. It is probably
the case that it has been intentionally left unspecific in subsequent
revisions of the document.

The words you're talking about are in the C++ Standard. The C++
Standard's Committee wrote them, drawing on experience with the C
language standard. The words are deliberately vague, to avoid imposing
unnecessary restrictions.

Since you insist that there's an abstraction, here it is: compilation
units get translated somehow into something, and eventually those
somethings get linked with other stuff, called libraries, to create an
executable file.
 
Steven T. Hatton

Pete said:
The words you're talking about are in the C++ Standard. The C++
Standard's Committee wrote them, drawing on experience with the C
language standard. The words are deliberately vague, to avoid imposing
unnecessary restrictions.

The words are identical to those in the C Standard.
Since you insist that there's an abstraction, here it is: compilation
units get translated somehow into something, and eventually those
somethings get linked with other stuff, called libraries, to create an
executable file.

IYO.
 
Steven T. Hatton

Julián Albo said:
The standard does not care about that. Each operating system and family of
tools can have its own notions about how linking is done and how libraries
are stored and handled.

You can view a library, in the linking sense, as just a collection of object
files stored together in some way.

Code can be linked by resolving symbols to relative or absolute addresses,
or to references to symbols in dynamically loaded libraries. Or symbols can
be left unresolved to any kind of address, with resolution delayed until
load time or run time.

Hypothetically, we could construct a computer out of Buddhist monks and
birchbark scrolls by assigning each instruction in an instruction set to a
different monk. Some of the scrolls could serve as hard-drive storage,
others as main memory, still others as processor registers. Executing a
program would entail sequentially executing instructions written on one or
more scrolls which are passed from monk to monk as needed.

Even in that fanciful computer, there would be an activity called 'linking'.
That linking would have some identifiable similarity to what is called
linking in other computing environments.

The following paragraph is from the 1997 draft C Standard:

"All external object and function references are resolved. Library
components are linked to satisfy external references to functions and
objects not defined in the current translation. All such translator output
is collected into a program image which contains information needed for
execution in its execution environment."

I do not know if the wording is the same in the C90. It does seem to
precede Standard C++, although it may have been influenced by the C++
standardization effort. My supposition is that the authors simply used the
terms they were accustomed to in order to specify what this final phase of
translation is. What I am interested in is what the essential meaning of
these notions is when transformed to a new environment.
 
Julián Albo

Steven said:
Hypothetically, we could construct a computer out of Buddhist monks and
birchbark scrolls by assigning each instruction in an instruction set to a
different monk. Some of the scrolls could serve as hard-drive storage,
others as main memory, still others as processor registers. Executing a
program would entail sequentially executing instructions written on one or
more scrolls which are passed from monk to monk as needed.

Yes, it's easy to imagine silly ways to compute things. But it is also
possible to invent ways that are practical and better than the current ones.
Even in that fanciful computer, there would be an activity called
'linking'. That linking would have some identifiable similarity to what is
called linking in other computing environments.

I think that depends on the meaning assigned to the word 'linking'. Perhaps
in our imagined new computer or development tool there is nothing called a
'linker', but we can establish that some part of the process of creating
an executable will be considered the linking phase in order to match the
conceptual process expected by the standard.
standardization effort. My supposition is that the authors simply used
the terms they were accustomed to in order to specify what this final
phase of translation is. What I am interested in is what the essential
meaning of these notions is when transformed to a new environment.

I think that the intent is not to have an essential meaning, leaving the
door open to any new technique that may be introduced. If nobody has
developed a definition abstract enough to allow any conceivable and
potentially useful technique, the best solution is to leave it without a
rigorous definition.
 
P.J. Plauger

The following paragraph is from the 1997 draft C Standard:

"All external object and function references are resolved. Library
components are linked to satisfy external references to functions and
objects not defined in the current translation. All such translator output
is collected into a program image which contains information needed for
execution in its execution environment."

I do not know if the wording is the same in the C90. It does seem to
precede Standard C++, although it may have been influenced by the C++
standardization effort.
Hardly.

My supposition is that the authors simply used the
terms they were accustomed to in order to specify what this final phase of
translation is.

Chances are, I wrote those words about 20 years ago. They seem to have
done the job because:

1) The C committee has received not a single Defect Report caused
by confusion over them.

2) I've heard of not a single dispute between compiler vendors and
authors of C validation suites over the meaning of this paragraph.
What I am interested in is what the essential meaning of
these notions is when transformed to a new environment.

In that case, you'd be wise to listen to what people are trying to
tell you.

P.J. Plauger
Dinkumware, Ltd.
http://www.dinkumware.com
 
Steven T. Hatton

Julián Albo said:
Yes, it's easy to imagine silly ways to compute things. But it is also
possible to invent ways that are practical and better than the current ones.

Perhaps silly, but I find that such ideas expose aspects of what we really
mean by code execution, etc. Often there are patterns working at very
different scales of time and space which we give different names, but when
we look closely, they are actually manifestations of the same underlying
example.
I think that depends on the meaning assigned to the word 'linking'.
Perhaps in our imagined new computer or development tool there is nothing
called a 'linker', but we can establish that some part of the process of
creating an executable will be considered the linking phase in order to
match the conceptual process expected by the standard.

Actually, the Standard doesn't require that step to actually take place, so
long as the observable results are as if the steps had taken place.
I think that the intent is not to have essential meaning, leaving the door
open to any new technique that can be introduced.

I believe that is the current intent. I don't know if it was as important
in the minds of the original authors of the paragraph.
If nobody has developed
a definition abstract enough to allow any conceivable and potentially
useful technique, the best solution is to leave it without a rigorous
definition.

It's one thing to change the wording of the Standard, which I have not
proposed. It's quite another thing to attempt understanding what it really
implies.

"All external object and function references are resolved."

This is what the standard says about objects:

"An object is a region of storage. [Note: A function is not an object,
regardless of whether or not it occupies storage in the way that objects
do. ] An object is created by a definition (basic.def), by a new-expression
(expr.new) or by the implementation (class.temporary) when needed. The
properties of an object are determined when the object is created. An
object can have a name (clause basic). An object has a storage duration
(basic.stc) which influences its lifetime (basic.life). An object has a
type (basic.types)."

So what does "external object reference" actually mean? It doesn't seem
meaningful to talk about the object existing before the program is
executed. I'll have to think about what might constitute an external
object in this context. The part about it being resolved seems easier to
understand. It means determining where the object is defined in some other
library, object file, or similar.

"Library components are linked to satisfy external references to functions
and objects not defined in the current translation."

Would there be a case where an object is not named, but it still needs its
reference to be resolved? IOW, are all of these declarations for which
definitions are not provided in the current translation unit?

"All such translator output is collected into a program image which contains
information needed for execution in its execution environment."

That part seems fairly straightforward to me right now.

To me it seems ill-advised to completely ignore the relationship between
language facilities and object file structure. I understand the
motivation for wanting to avoid over-specifying details. OTOH, if there
really is an underlying structure that will be common to all systems, and
language constructs can facilitate better use of that structure, it would
be a mistake not to incorporate them into the language.
 
Julián Albo

Steven said:
So what does "external object reference" actually mean? It doesn't seem
meaningful to talk about the object existing before the program is
executed. I'll have to think about what might constitute an external
object in this context. The part about it being resolved seems easier to
understand. It means determining where the object is defined in some
other library, object file, or similar.

I don't understand how you can hold both that it can be determined where
the object is and that the object does not exist. You can define the
"external object" as the thing that is located where the object is
defined, for example.
To me it seems ill-advised to completely ignore the relationship between
language facilities and object file structure. I understand the
motivation for wanting to avoid over-specifying details. OTOH, if there
really is an underlying structure that will be common to all systems, and
language constructs can facilitate better use of that structure, it would
be a mistake not to incorporate them into the language.

I don't understand what you are proposing (or whether you are proposing
something). Are you simply saying that C++ is not perfect? I agree. I think
the whole standard committee will agree.
 
Steven T. Hatton

P.J. Plauger said:
Chances are, I wrote those words about 20 years ago. They seem to have
done the job because:

1) The C committee has received not a single Defect Report caused
by confusion over them.

I will point out that at least one person has proposed changing them. That
person is not me, BTW.
2) I've heard of not a single dispute between compiler vendors and
authors of C validation suites over the meaning of this paragraph.

I don't really see a reason why there should be any confusion about what
they mean in the context of writing a compiler. I do know there are
sometimes problems that arise when something that was originally linked
against a local library ends up being linked to something that requires
network I/O for every function call.
In that case, you'd be wise to listen to what people are trying to
tell you.

Different people are telling me different things. The advice of "don't try
to understand it" just doesn't make a lot of sense to me.

"I have reluctantly come to accept that some system-related issues would
have been better handled within C++. System-related issues, such as
dynamic linking of classes and interface evolution do not logically belong
in a language and language-based solutions are not preferable on technical
grounds. However, the language provides the only common forum in which a
truly standard solution can become accepted." - D&E §9.4.3

"assembly - Refers to one or more files that are output by the compiler as a
result of program compilation. An assembly is a configured set of loadable
code modules and other resources that together implement a unit of
functionality. An assembly can contain types, the executable code used to
implement these types, and references to other assemblies. The physical
representation of an assembly is defined by the CLI Standard (§3).
Essentially, an assembly is the output of the compiler. An assembly that
has an entry point is called an application. (See also 'metadata'.)"

Interesting.
 
