Tentative definition versus external linkage

RG (Rafael Giusti)

I'm having some trouble understanding the rationale for C99 when it
comes to external linkage... The rationale says (Section 6.2.2, near
line 25):

"The Standard model is a combination of features of the strict ref/def
model and the initialization model. As in the strict ref/def model,
only a single translation unit contains the definition of a given
object because many environments cannot effectively or efficiently
support the "distributed definition" inherent in the common or relaxed
ref/def approaches. However, either an initialization, or an
appropriate declaration without storage class specifier (see §6.9),
serves as the external definition. This composite approach was chosen
to accommodate as wide a range of environments and existing
implementations as possible."

What confuses me is the following assertion: "only a single
translation unit contains the definition of a given object". To help
me illustrate my point, here's a program, which consists of two
translation units:

/* slave.c */
#include <stdio.h>
int external_object;
void print(void)
{
    printf("%d\n", external_object);
}

/* main.c */
int external_object;
void print(void);
int main(void)
{
    external_object = 8;
    print();
    return 0;
}

Well, if I compile and run this program, I'll get 8 as output. No
problem so far.

However, if I am to analyze the output of nm for both slave.o and
main.o object files, here's what I get:

main.o:
00000004 C external_object
00000000 T main
         U print

slave.o:
00000004 C external_object
00000000 T print
         U printf

Now, the ANSI C99 standard says the following about declarations and
definitions (Section 6.7, §5):

A declaration specifies the interpretation and attributes of a set of
identifiers. A definition of an identifier is a declaration for that
identifier that:
-- for an object, causes storage to be reserved for that object;

Well, according to nm, both main.o and slave.o reserve storage for the
object external_object, and should be considered definitions of
external_object. Hence, GCC breaks the standard by allowing more than
one definition of the external_object and then letting the linker
decide that all but one definition of external_object are actually
declarations, just like the relaxed ref/def demands!

Is that possible? What am I missing here?

Thanks!
rg
 
santosh

The 'external_object' symbol is placed in the so-called "common" section
of each object file (that is what the 'C' in the nm output means). At
link time, all common symbols with the same name are merged into only
one actual object.
 
santosh

You can see this if you tell gcc to dump the assembler code generated
for both the source files by using the '-S' command option.
 
santosh

Forgot to add that an 'nm' on the executable file will demonstrate that
there is only one instance of 'external_object'.
 
RG (Rafael Giusti)


Well, that explains how GCC works, but doesn't seem to explain this:

"As in the strict ref/def model, only a single translation unit
contains the definition of a given object because many environments
cannot effectively or efficiently support the "distributed definition"
inherent in the common or relaxed ref/def approaches"

If GCC puts uninitialized objects with external linkage that are
declared without the extern keyword in the common section, then all
declarations/definitions are equal and all of them reserve storage. So
all of them are definitions. That means GCC follows the relaxed ref/def
model, not the strict ref/def model. Either that or I'm still missing
something very important.

Also, it seems to me that the only way to comply with the strict
ref/def model would be to require that the declaration of
external_object in one of the translation units include the word extern.
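
For what it's worth, the arrangement described in that last paragraph can be sketched roughly as follows. This is only an illustration under some assumptions: the three pieces are folded into one listing so that it compiles as a single file, the header name shared.h is hypothetical, and run_program is a made-up stand-in for the body of main() from the original program.

```c
#include <assert.h>
#include <stdio.h>

/* --- what a shared header (hypothetical name: shared.h) would hold --- */
extern int external_object;   /* declaration only, reserves no storage */
void print(void);

/* --- slave.c: uses the object but does not define it --- */
void print(void)
{
    printf("%d\n", external_object);
}

/* --- main.c: the one and only external definition in the program --- */
int external_object;

/* Hypothetical stand-in for the body of main() in the thread. */
int run_program(void)
{
    external_object = 8;
    print();                        /* prints the value stored above */
    assert(external_object == 8);   /* print() only reads the object */
    return external_object;
}
```

Split back into separate files, slave.c would see only the extern declaration, so exactly one translation unit would define external_object.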
 
Ralf Damaschke

RG (Rafael Giusti) wrote:

[ using file scope "int external_object;" in two translation
units of a program ]
Now, the ANSI C99 standard says the following about
declarations and definitions (Section 6.7, §5):

A declaration specifies the interpretation and attributes of a
set of identifiers. A definition of an identifier is a
declaration for that identifier that:
-- for an object, causes storage to be reserved for that
object;

Well, according to nm, both main.o and slave.o reserve storage
for the object external_object, and should be considered
definitions of external_object. Hence, GCC breaks the standard
by allowing more than one definition of the external_object
and then letting the linker decide that all but one definition
of external_object are actually declarations, just like the
relaxed ref/def demands!

Is that possible? What am I missing here?

Yes, it is possible; in both translation units the tentative
definition of the object external_object eventually becomes an
external definition with initialization to 0 (6.9.2). Somewhat
above, 6.9 says under "Semantics":

| If an identifier declared with external linkage is used in an
| expression (other than as part of the operand of a sizeof
| operator whose result is an integer constant), somewhere in the
| entire program there shall be exactly one external definition
| for the identifier; otherwise, there shall be no more than one.

This is a "shall" requirement outside of a constraint, so its
violation leads to undefined behavior and does not need a
diagnostic. Thus the behavior you observed is allowed by the
C standard (I am not sure about POSIX requirements).

Ralf
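
As a rough sketch of the 6.9.2 rule mentioned above: within a single translation unit, repeated tentative definitions are fine and end up as one definition initialized to 0. The function name tentative_demo is hypothetical and only exists to make the listing checkable.

```c
#include <assert.h>
#include <stdio.h>

int external_object;          /* tentative definition */
int external_object;          /* second tentative definition, still valid */
extern int external_object;   /* ordinary declaration of the same object */

/*
 * No declaration above has an initializer, so at the end of this
 * translation unit the object is defined as if written
 * "int external_object = 0;".
 */
int tentative_demo(void)
{
    assert(external_object == 0);   /* implicit zero initialization */
    external_object = 8;
    printf("%d\n", external_object);
    return external_object;
}
```

The trouble in the original program only starts when a second translation unit ends up with its own external definition of the same identifier.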
 
RG (Rafael Giusti)

This is a "shall" requirement outside of a constraint and its
violation leads to undefined behavior and does not need a
diagnostic. Thus the behavior you observed is allowed by the
C standard (I am not sure about POSIX requirements).

Hmmmm... so is it correct to say that GCC follows the relaxed ref/def
linkage model instead of the Standard linkage model?
 
Ralf Damaschke

RG said:
Hmmmm... so is it correct to say that GCC follows the relaxed
ref/def linkage model instead of the Standard linkage model?

Not really. As shown above, here it's the responsibility of the
programmer to follow the strict model, not that of the
implementation. At least you may say that the particular gcc
implementation allows you to use a relaxed (or common? I don't
know whether these are distinct) model in this particular usage.
Whether all gcc versions on all possible platforms do allow that,
well, that would be better asked in gnu.gcc.help.

Ralf
 
Flash Gordon

RG (Rafael Giusti) wrote, On 13/03/08 15:19:
Well, that explains how GCC works, but doesn't seem to explain this:

"As in the strict ref/def model, only a single translation unit
contains the definition of a given object because many environments
cannot effectively or efficiently support the "distributed definition"
inherent in the common or relaxed ref/def approaches"

If GCC puts uninitialized extern objects that do not include the extern
keyword in the common section, then all declarations/definitions are
equal and all of them reserve storage. So all of them are definitions.
That means GCC follows the relaxed ref/def, not the strict ref/def.
Either that or I'm still missing something very important.

The important thing you are missing is implementations are not required
to diagnose (complain about) all programming errors. Specifically your
code invokes "undefined behaviour" and does not require a diagnostic
(error, warning, informational message, kick in the teeth etc). So gcc
(and the linker) not reporting the problem is not an error in gcc (or
the linker), and because it is undefined behaviour *anything* is allowed
to happen, including the program behaving as you describe.

Also, it seems to me that the only way to comply with the strict ref/
def would be requiring that the declaration of external_object in one of
the translation units included the word extern.

On some systems you can get gcc to compile the code such that the linker
will report an error with your code. Just check the gcc documentation (or
search this group) for the switch, as I'm too lazy at the moment to find
it myself.
 
lawrence.jones

RG (Rafael Giusti) said:
Hmmmm... so is it correct to say that GCC follows the relaxed ref/def
linkage model instead of the Standard linkage model?

Nearly -- it would be correct to say that GCC, as you invoked it,
follows the relaxed ref/def model in your particular environment. The
linkage model is usually determined by other components of the system
(like the linker) rather than by the compiler itself. In an environment
that supports multiple linkage models, GCC might well provide options to
select which one to use. The relaxed ref/def model is traditional on
Unix-like systems and might even be required by POSIX. Note that the C
Standard does not require an implementation to use any particular
linkage model; its requirements only dictate how maximally portable
programs must be written.

-Larry Jones

I think my cerebellum just fused. -- Calvin
 
Jack Klein

You are missing the simple fact that the C standard statement is, in
6.9 p5:

"An external definition is an external declaration that is also a
definition of a function (other than an inline definition) or an
object. If an identifier declared with external linkage is used in an
expression (other than as part of the operand of a sizeof operator
whose result is an integer constant), somewhere in the entire program
there shall be exactly one external definition for the identifier;
otherwise, there shall be no more than one."

If you define an external object in more than one translation unit, you
are violating the first "shall" clause in this paragraph. This paragraph
is in a "Semantics" section, not a "Constraints" section.

Violating a "shall" clause outside of a constraints section means that
the program produces undefined behavior; no diagnostic is required.

It just so happens that one possible consequence of undefined behavior
is for the program to do what you expect and/or want it to do.

How the tool set you use deals with a particular case of undefined
behavior is up to the tool set, since the C standard imposes no
requirements.

It just so happens that some tool sets, on some platforms, use a
linkage model that allows multiple external definitions, as long as
at most one of them specifies an initializer. The linker will merge
duplicate external definitions into a single object.

It also just so happens that there are a large number of other
implementations that will reject this at the link stage, complaining
of multiple definitions.

Either behavior is allowed by the C standard, as is just about any
other, because the behavior is undefined by the standard.
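
To illustrate the initializer point, here is a minimal sketch (the accessor get_external_object is hypothetical). An initializer makes the declaration a full external definition straight away. With GCC on typical Unix-like targets such an object usually shows up in nm as a data symbol ('D') rather than a common symbol ('C'), so a second translation unit that also initialized it would normally be rejected at link time.

```c
#include <assert.h>

int external_object = 8;   /* external definition: has an initializer */
int external_object;       /* tentative definition, absorbed by the one above */

/* Hypothetical accessor, only here so the value can be checked. */
int get_external_object(void)
{
    assert(external_object == 8);   /* the initializer wins, not 0 */
    return external_object;
}
```

GCC also has an -fno-common option which, as far as I know, gives uninitialized file-scope objects the same treatment, so that a program with the same tentative definition in two translation units fails to link; newer GCC releases reportedly enable it by default.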

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://c-faq.com/
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
 
