Why Does C++ Name-Mangle Identifiers?

Karl Heinz Buchegger · Nov 2, 2004

Randy said:
Why is this necessary? The identifiers provided by the
programmer are unique, and that is what is important - why
map what the user has specified to something else?

Because most C++ compilers are built around a system-dependent
linker. Those linkers have a simple rule: if 2 functions have
the same name, then they are the same function. This was
true in most (if not all) programming languages which
required linking.

But in C++ it is perfectly valid to have 2 different functions
with the same name, if only their argument types are different.

That's why name mangling was introduced: To add information to
the name of a function to create different names for functions
with the same name but different argument types.

The alternative would have been to write a new object file
format and a new linker that can deal with that. As long as you
only have C++ this would be no big problem. But as said: on most
'big machines' (such as mainframes), there is only one object
file format defined and only one system wide linker to link
an application. That's a good thing, because it means you can
freely link modules written in different programming languages.
But it also means: You have to introduce some mechanism for
the compiler to make function names unique even if they have identical
names in the source code.
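
To make the point above concrete, here is a minimal sketch of the renaming work that mangling takes off the programmer's hands. The names `area`, `area_int`, `area_double_double` and `check_both_styles` are made up for illustration and do not come from any post in this thread. The first pair relies on the compiler to produce two distinct linker symbols; the second pair shows the hand-picked unique names a language without overloading would require.

```cpp
#include <cassert>

// C++ style: two different functions share the name "area".
// The compiler gives each one its own mangled linker symbol.
int area(int side) { return side * side; }
double area(double width, double height) { return width * height; }

// Without overloading, the programmer has to invent unique names by
// hand. extern "C" switches mangling off, so the linker sees these
// names essentially as written (some platforms add an underscore).
extern "C" int area_int(int side) { return side * side; }
extern "C" double area_double_double(double width, double height)
{
    return width * height;
}

// Hypothetical helper: both styles compute the same results, they
// only differ in who chooses the unique symbol names.
void check_both_styles()
{
    assert(area(3) == area_int(3));
    assert(area(2.0, 4.5) == area_double_double(2.0, 4.5));
}
```

In other words, mangling is roughly the compiler doing automatically what the second pair of functions does manually.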

Karl Heinz Buchegger · Nov 2, 2004

osmium said:
What a great post! Even though I am a mainframer by nature I had never
considered the fundamental difference between a mainframe and a PC or
workstation.

I first noticed this 1.5 decades ago, when people in the company I was
working for linked Fortran with Lisp and C code on a VAX/VMS and were able
to debug the program with the system wide debugger. It worked like a
charm and it made me think.

Randy Yates · Nov 4, 2004

Why is this necessary? The identifiers provided by the
programmer are unique, and that is what is important - why
map what the user has specified to something else?

Gianni Mariani · Nov 4, 2004

Randy said:
Why is this necessary? The identifiers provided by the
programmer are unique, and that is what is important - why
map what the user has specified to something else?

void X();
void X( int );

Which linker symbol should be X?

John Harrison · Nov 4, 2004

Randy Yates said:
Why is this necessary? The identifiers provided by the
programmer are unique, and that is what is important - why
map what the user has specified to something else?

But they aren't.

int my_overloaded_function(int x)
{
    return x;
}

int my_overloaded_function(int x, int y)
{
    return x + y;
}

In C++ it's legal to define two different functions with the same name.
Typically C++ implementations mangle the names so non-C++ aware linkers can
distinguish the functions.

john

Rolf Magnus · Nov 4, 2004

Randy said:
Why is this necessary? The identifiers provided by the
programmer are unique,

No, they aren't.

and that is what is important - why map what the user has specified to
something else?

Because they aren't unique at all.

#include <iostream>

void draw()
{
    std::cout << "drawing\n";
}

void draw(int id)
{
    std::cout << "drawing object with id " << id << '\n';
}

class Triangle
{
public:
    void draw()
    {
        std::cout << "drawing a triangle\n";
    }

    void draw() const
    {
        std::cout << "drawing a constant triangle\n";
    }
};

namespace SomeAPI
{
    void draw()
    {
        std::cout << "yet another function with name 'draw'\n";
    }
}

Now, each of those functions is called draw. How would you resolve that
ambiguity?
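
One way to see how a compiler resolves this is to look at the symbols it would emit for the five functions named draw. The sketch below assumes a GCC or Clang toolchain that follows the Itanium C++ ABI (other ABIs, such as MSVC's, encode names quite differently), and it uses `abi::__cxa_demangle` from `<cxxabi.h>`, which is specific to those toolchains. The helper name `demangle` and the array `draw_symbols` are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang only)
#include <string>

// Symbols an Itanium-ABI compiler would be expected to emit for the
// five "draw" functions: free, free with int, member, const member,
// and the one inside namespace SomeAPI.
const char* const draw_symbols[] = {
    "_Z4drawv",
    "_Z4drawi",
    "_ZN8Triangle4drawEv",
    "_ZNK8Triangle4drawEv",
    "_ZN7SomeAPI4drawEv",
};

// Turn a mangled symbol back into a readable declaration.
// Returns an empty string if demangling fails.
std::string demangle(const char* symbol)
{
    assert(symbol != nullptr);
    int status = 0;
    // The result is allocated with malloc, so it must be freed here.
    char* text = abi::__cxa_demangle(symbol, nullptr, nullptr, &status);
    if (status != 0 || text == nullptr) {
        return std::string();
    }
    std::string result(text);
    std::free(text);
    return result;
}
```

Calling `demangle` on each entry of `draw_symbols` gives back five different declarations, which is the point: scope, constness and parameter types are all folded into the symbol, so the linker never sees two identical names.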

osmium · Nov 4, 2004

Karl Heinz Buchegger said:
Because most C++ compilers are built around a system-dependent
linker. Those linkers have a simple rule: if 2 functions have
the same name, then they are the same function. This was
true in most (if not all) programming languages which
required linking.

But in C++ it is perfectly valid to have 2 different functions
with the same name, if only their argument types are different.

That's why name mangling was introduced: To add information to
the name of a function to create different names for functions
with the same name but different argument types.

The alternative would have been to write a new object file
format and a new linker that can deal with that. As long as you
only have C++ this would be no big problem. But as said: on most
'big machines' (such as mainframes), there is only one object
file format defined and only one system wide linker to link
an application. That's a good thing, because it means you can
freely link modules written in different programming languages.
But it also means: You have to introduce some mechanism for
the compiler to make function names unique even if they have identical
names in the source code.

What a great post! Even though I am a mainframer by nature I had never
considered the fundamental difference between a mainframe and a PC or
workstation.

Mike Wahler · Nov 4, 2004

Randy Yates said:
Why is this necessary? The identifiers provided by the
programmer are unique,

Not always.

and that is what is important - why
map what the user has specified to something else?

void foo(int arg);
void foo(double arg);
void foo(int arg1, int arg2);

Look up 'function overloading'.

-Mike

Default User · Nov 4, 2004

Randy said:
Why is this necessary? The identifiers provided by the
programmer are unique, and that is what is important - why
map what the user has specified to something else?

As others have pointed out, your contention is untrue. That's the
reason C++ provides extern "C".

This tells the compiler not to mangle the name, so that it's available
to link with other languages that aren't expecting name-mangling.
Brian
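
As a rough illustration of the extern "C" approach, the sketch below shows the usual header pattern that lets one declaration serve both C and C++ compilers. The function `clamp_to_byte` and the helper `usage_example` are hypothetical names chosen for this example, and in a real project the guarded declaration would live in a shared header rather than next to the definition.

```cpp
#include <cassert>

// Header part: __cplusplus is defined only by C++ compilers, so a C
// compiler skips the extern "C" lines and sees a plain declaration.
#ifdef __cplusplus
extern "C" {
#endif

int clamp_to_byte(int value);   /* hypothetical library function */

#ifdef __cplusplus
}
#endif

// Definition, compiled as C++. Because of the declaration above it
// has C linkage, so its symbol is not mangled and code following the
// platform's C conventions can link against it.
int clamp_to_byte(int value)
{
    if (value < 0) return 0;
    if (value > 255) return 255;
    return value;
}

// Functions with C linkage cannot be overloaded, since no mangling
// is left to tell them apart. Adding, for example,
//   extern "C" int clamp_to_byte(double value);
// would be rejected by the compiler.

void usage_example()
{
    assert(clamp_to_byte(300) == 255);
}
```

The trade-off is visible in the last comment: giving up mangling also means giving up overloading for those functions.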

Randy Yates · Nov 4, 2004

Default User said:
As others have pointed out, your contention is untrue. That's the
reason C++ provides extern "C".

This tells the compiler not to mangle the name, so that it's available
to link with other languages that aren't expecting name-mangling.
Brian

Yes, it's obvious now that you have pointed it out. Thanks everyone!

Thanks for this tip, D.U. - that was precisely what caused me to make this
post - linking C and C++ objects together and getting a problem. I
solved it a different way - I compiled the C program with the C++
compiler - it generates the same mangled name.

But actually now this gives me another question: Is the manner
in which names are mangled standardized? Otherwise how would
code from two different compiler vendors be linkable?

Greg Schmidt · Nov 4, 2004

Randy Yates said:
But actually now this gives me another question: Is the manner
in which names are mangled standardized?

No.

Otherwise how would
code from two different compiler vendors be linkable?

They are not, in general, for more reasons than just name mangling
differences.

Ivan Vecerina · Nov 4, 2004

Randy Yates said:
But actually now this gives me another question: Is the manner
in which names are mangled standardized? Otherwise how would
code from two different compiler vendors be linkable?

No name mangling technique is specified in the ISO C++ standard.
This is because platform-specific techniques are often used,
and the concept of a linker or debugger is out of the scope
of the standard.

However, many platforms/processors/OSes define a standard name
mangling technique that all compilers are required (or invited)
to implement, as well as other aspects required for compatibility
(such as parameter passing convention, exception handling, etc).
This defines an ABI (Application Binary Interface, IIRC).

Try searching for: C++ ABI

hth,
Ivan

Randy Yates · Nov 4, 2004

Greg Schmidt said:
They are not, in general, for more reasons than just name mangling
differences.

So if I e.g. develop an ODBC library, I must deliver the source and
let each client compile for his target? That's hard to believe.

Serge Paccalin · Nov 4, 2004

On Thursday, November 4, 2004 at 20:33:00, Randy Yates wrote in
comp.lang.c++:

Calling conventions differ from one compiler to another. To prevent
unintentional mixes, symbol mangling is intentionally different too.

So if I e.g. develop an ODBC library, I must deliver the source and
let each client compile for his target? That's hard to believe.

That's one solution. The other common solution is to deliver binaries
targeted for specific compilers (the compiler each binary has been built
with).

--
Serge PACCALIN
"Men must therefore begin by not being fanatics in order to deserve
tolerance." -- Voltaire, 1763

Jesper Madsen · Nov 4, 2004

You are always free to supply a library as a shared library/DLL adapted to
the different OSes you support, but supply a C interface for the
library, and if you are really generous, a set of C++ wrapper classes as
source code. C naming in C++ is accomplished by using:

extern "C" {
void available_Function();
}

If you want to build a shared library, do yourself a favor and build a
proper C interface, where all function parameters are PODs and objects are
opaque types (look at how operating system APIs are designed).
Be sure to specify how the library has been compiled (e.g. thread-safe
libraries, with libraries as DLLs) and any settings for the size of int,
float, double or bool. If you pass a struct, assert that the size of the
struct is the same on the user's side as inside your library (check
libJPEG for something like that).

Most .lib files are completely compiler specific. Some vendors
choose a few compilers/dev environments to support, offer .lib files for
those, and supply ActiveX components for all other environments.
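
The advice above (C interface, POD parameters, opaque object types, a struct-size check) might look something like the following sketch. Everything here is hypothetical and invented for illustration: the `Counter` class, the `counter_*` functions and the `counter_options` struct are not from any real library, and a real project would split the declarations into a C-compatible header.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Implementation side: ordinary C++, never visible to clients.
namespace {
class Counter {
public:
    void add(int amount) { total_ += amount; }
    int total() const { return total_; }
private:
    int total_ = 0;
};
}

// Public interface: C linkage, POD parameters, opaque handle.
extern "C" {

struct counter_handle;          // opaque: clients only hold a pointer

struct counter_options {        // plain-old-data parameter block
    std::size_t struct_size;    // caller sets sizeof(counter_options)
    int initial_value;
};

counter_handle* counter_create(const counter_options* options)
{
    // Refuse callers whose compiler laid the struct out differently.
    if (options == nullptr ||
        options->struct_size != sizeof(counter_options)) {
        return nullptr;
    }
    // nothrow: no C++ exception may escape across the C boundary.
    Counter* impl = new (std::nothrow) Counter;
    if (impl == nullptr) return nullptr;
    impl->add(options->initial_value);
    return reinterpret_cast<counter_handle*>(impl);
}

void counter_add(counter_handle* handle, int amount)
{
    assert(handle != nullptr);
    reinterpret_cast<Counter*>(handle)->add(amount);
}

int counter_total(const counter_handle* handle)
{
    assert(handle != nullptr);
    return reinterpret_cast<const Counter*>(handle)->total();
}

void counter_destroy(counter_handle* handle)
{
    delete reinterpret_cast<Counter*>(handle);
}

}  // extern "C"
```

A client would fill in `counter_options` with `sizeof(counter_options)`, call `counter_create`, and only ever pass the returned pointer back into the other functions. Since no C++ type crosses the boundary, neither the mangling scheme nor the class layout of the library's compiler matters to the caller.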

Mike Smith · Nov 4, 2004

John said:
But they aren't.

int my_overloaded_function(int x)
{
    return x;
}

int my_overloaded_function(int x, int y)
{
    return x + y;
}

In C++ it's legal to define two different functions with the same name.
Typically C++ implementations mangle the names so non-C++ aware linkers can
distinguish the functions.

I guess another question would be: why allow implementations to come up
with their own weird incompatible mangling schemes? Why doesn't the
Standard specify a uniform scheme - preferably something simple wherein,
for instance, the two functions above might become:

int!my_overloaded_function(int)

int!my_overloaded_function(int,int)

or something similarly uniform and simple to understand.

E. Robert Tisdale · Nov 4, 2004

Mike said:
Not always.

and that is what is important - why

void foo(int arg);
void foo(double arg);
void foo(int arg1, int arg2);

Look up 'function overloading'.

cat foo.cc

void foo(int arg) { }
void foo(double arg) { }
void foo(int arg1, int arg2) { }

g++ -Wall -ansi -pedantic -c foo.cc
nm foo.o

00000006 T _Z3food
00000000 T _Z3fooi
0000000c T _Z3fooii

c++filt _Z3food
foo(double)
c++filt _Z3fooi
foo(int)
c++filt _Z3fooii
foo(int, int)

Evidently, the function names are actually:

1. foo(double),
2. foo(int) and
3. foo(int, int)

respectively.
[Notice that the return type is *not* part of the function name.]
These identifier strings
are *not* acceptable symbols for some link editors
so the C++ compiler *mangles* them --
usually by *decorating* the base function name
(foo in this case) with prefixes and/or suffixes --
to construct a unique symbol acceptable to the link editor.

Perhaps, eventually, link editors will be able to accept
the actual function name as a symbol
and mangling will no longer be required.
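
The symbols in the nm output above follow a simple pattern for plain functions: a `_Z` prefix, the length of the name, the name itself, then one letter per parameter type. The sketch below is a toy reimplementation of just that pattern, not how g++ actually does it. It covers only a handful of builtin types, and the function names `type_code` and `toy_mangle` are made up for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// One-letter codes the Itanium C++ ABI uses for a few builtin types.
// Anything else is outside the scope of this toy.
std::string type_code(const std::string& type)
{
    if (type == "void")   return "v";
    if (type == "int")    return "i";
    if (type == "double") return "d";
    if (type == "char")   return "c";
    assert(false && "type not covered by this toy mangler");
    return "";
}

// Build "_Z" + <length of name> + <name> + <parameter codes>.
// An empty parameter list is encoded as a single "v".
std::string toy_mangle(const std::string& name,
                       const std::vector<std::string>& params)
{
    assert(!name.empty());
    std::string symbol = "_Z" + std::to_string(name.size()) + name;
    if (params.empty()) {
        return symbol + "v";
    }
    for (const std::string& param : params) {
        symbol += type_code(param);
    }
    return symbol;
}
```

For example, `toy_mangle("foo", {"int", "int"})` yields `_Z3fooii`, matching the third symbol in the listing. Real mangling also has to encode namespaces, classes, templates, qualifiers and more, which is where the schemes of different compilers diverge.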

E. Robert Tisdale · Nov 4, 2004

Mike said:
I guess another question would be,
"Why allow implementations to come up with
their own weird incompatible mangling schemes?
Why doesn't the Standard specify a uniform scheme?"
Preferably something simple wherein,
for instance, the two functions above might become:

int!my_overloaded_function(int)

my_overloaded_function(int)

int!my_overloaded_function(int,int)

my_overloaded_function(int, int)

Function resolution is independent of return type.

or something similarly uniform and simple to understand.

The problem is that not all link editors recognize such names
as acceptable symbols. This problem appears to be resolving itself
and soon mangling may no longer be necessary.

David Lindauer · Nov 4, 2004

overloaded functions and operators can have the same names but be unique
identifiers.

David

Jack Klein · Nov 5, 2004

Mike Smith said:
I guess another question would be: why allow implementations to come up
with their own weird incompatible mangling schemes? Why doesn't the
Standard specify a uniform scheme - preferably something simple wherein,
for instance, the two functions above might become:

int!my_overloaded_function(int)

int!my_overloaded_function(int,int)

or something similarly uniform and simple to understand.

First of all, if you try to mandate this then why not mandate object
file format (OMF86, DWARF, ELF, COFF, etc.)? Why not mandate the
instruction set, for example 32-bit X86, and to heck with the
riff-raff using PowerPC or ARM or SPARC.

If you read Karl's reply to the original post (or reread it), you will
note that on some systems security concerns mandate the use of a
system-supplied linker. What if that linker does not support the '!'
character in symbol names?

There isn't even a general language standard for the names of C
external symbols (some preface an '_' character, some add one as a
suffix, some do something else or don't modify the symbol at all).
