Frederick Gotham
When I was a beginner in C++, I struggled with the idea of references.
Having learned how to use pointers first, I was hesitant to accept that
references just "do their job" and that's it.
Just recently, someone posted here looking for an explanation of references.
I'll give my own understanding, for what it's worth.
First of all, the C++ Standard is a very flexible thing. It gives a
mountain of freedom to implementations to do things however they like, just
so long as they achieve the objective. Consider virtual functions for
instance -- the Standard doesn't mention hidden pointers within objects or
V-tables, even though all implementations that I know of achieve the
behaviour of virtual functions by these means.
Let's start off with very simple code:
int main()
{
    int i = 5;
    int &r = i;
    r = 7;
}
If someone is familiar with pointers, and also familiar with how computers
actually work (i.e. CPU instructions, registers, etc.), then they might
think that the code is treated as if it were written:
int main()
{
    int i = 5;
    int *const r = &i;
    *r = 7;
}
Would this be a correct way of thinking about it? Yes, I suppose. Then
again, there are other ways of achieving the objective. For all you know,
the compiler might just look at the definition of "r" and think, "Hmm... r
is just i", and change it to:
int main()
{
    int i = 5;
    i = 7;
}
It might even achieve it internally something like:
int main()
{
    int i = 5;
#define r i
    r = 7;
#undef r
}
Who knows? All that matters is that the implementation accurately achieves
the behaviour of references.
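Whichever of those strategies the compiler picks, the behaviour you can observe is the same: "r" is another name for "i". Here is a minimal sketch of how one might check that. The function name reference_is_alias is made up for this example.

```cpp
#include <cassert>

// Hypothetical helper: checks that a reference behaves as an alias.
bool reference_is_alias()
{
    int i = 5;
    int &r = i;      // r is another name for i

    r = 7;           // writes through to i
    assert(i == 7);

    // Taking the address of a reference yields the address of the
    // object it refers to, however the compiler implements it.
    return &r == &i;
}
```

Calling reference_is_alias() from main should return true on any conforming compiler, whether or not a pointer is used behind the scenes.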
References are most useful when passing objects to, and returning objects
from, functions:
void Func(int &i)
{
    i = 5;
}
How does the compiler compile this? Well, the C++ Standard doesn't say how.
In a way, I think of references as magical little things that just get the
job done.
However, as a person who's familiar with how computers actually work, I
know that there must be some sort of indirection involved (i.e. pointers)
if the function is not inline. Therefore, I would presume that the compiler
does it something like:
void __Func(int *const p)
{
    *p = 5;
}
#define Func(i) __Func(&(i))
Then again, the compiler might have some new-fangled way of getting this
done, who knows?! All that's important is that the job gets done.
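To see that the reference version and a hand-written pointer version have the same effect on the caller's object, something like the following sketch could be used. SetByRef, SetByPtr and same_effect are names invented for this example. It only compares the observable result and says nothing about what any particular compiler emits.

```cpp
#include <cassert>

// Reference version, as in the example above.
void SetByRef(int &i)
{
    i = 5;
}

// Hand-written pointer version of the same function.
void SetByPtr(int *const p)
{
    *p = 5;
}

// Hypothetical check: both calls modify the caller's object.
bool same_effect()
{
    int a = 0;
    int b = 0;

    SetByRef(a);     // no & needed at the call site
    SetByPtr(&b);    // the caller must take the address explicitly

    assert(a == 5);
    return a == b;
}
```

The only difference the caller sees is syntactic: the reference version hides the address-of and the dereference.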
Also, with returning objects by reference:
int &Func()
{
    static int i;
    return i;
}
int main()
{
    Func() = 7;
}
How does the compiler make this work? Well, if there's no function inlining
involved, then I'd presume it does something like:
int *Func()
{
    static int i;
    return &i;
}
int main()
{
    *Func() = 7;
}
This would be one way for the compiler to achieve its objective. The fact
of the matter though is that references are little puffs of pixie dust.
They don't make sense, and they can't be explained. They just do what they
do.
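One thing that can be pinned down is that every call to a function returning a reference to a static object hands back the very same object. A small sketch of that, with Counter and returns_same_object as made-up names:

```cpp
#include <cassert>

// Returns a reference to a function-local static, like Func() above.
int &Counter()
{
    static int i = 0;
    return i;
}

// Hypothetical check: assignments through the returned reference
// land in the one static object.
bool returns_same_object()
{
    Counter() = 7;          // assign through the returned reference

    int &r = Counter();     // bind a named reference to the same object
    r += 1;

    assert(Counter() == 8);
    return &Counter() == &r;
}
```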
References are special in one way though. When you initialise a reference
with an R-value, something special happens:
int &r = 5; /* Won't compile */
but:
int const &r = 5; /* No problem */
You might think this is strange -- how could it possibly work? How could it
be equivalent to something like:
int const *const p = &(5);
The answer is simple: It isn't. A "reference to const" is special if it is
initialised with an R-value. The second line, int const &r = 5;, is treated as
if you wrote:
int const __literal = 5;
int const &r = __literal;
Now, you can see that the "temporary" object has the same lifetime as the
reference.
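The lifetime part can be observed with a class that counts how many of its objects are alive. This is only a sketch. Tracker, MakeTracker, g_alive and temporary_outlives_statement are all names invented for the example.

```cpp
#include <cassert>

int g_alive = 0;   // hypothetical counter of live Tracker objects

struct Tracker {
    Tracker()                { ++g_alive; }
    Tracker(const Tracker &) { ++g_alive; }
    ~Tracker()               { --g_alive; }
};

// Returns by value, so the call expression is an R-value.
Tracker MakeTracker()
{
    return Tracker();
}

bool temporary_outlives_statement()
{
    bool alive_while_bound = false;
    {
        // The temporary is bound to a reference-to-const...
        Tracker const &r = MakeTracker();
        (void)r;

        // ...so it is still alive on the following line.
        alive_while_bound = (g_alive == 1);
    }
    // Once the reference goes out of scope, the temporary is destroyed.
    assert(g_alive == 0);
    return alive_while_bound;
}
```

Without the reference, the temporary would be destroyed at the end of the statement that created it.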
One more thing I'll mention before I go... someone was interested in the
following code:
struct MyStruct {
    int &a;
    double &b;
};
#include <ostream>
#include <iostream>
int main()
{
    int i; double d;
    MyStruct ms = {i,d};
    ms.a = 5;
    std::cout << sizeof ms << std::endl;
}
The person was interested as to why "ms" had a particular size. Answer: The
C++ Standard doesn't care. In reality though, the type, "MyStruct", must
achieve its objective. With a knowledge of how computers work, I can see
that the handiest way to achieve this particular objective would probably
be to use pointers internally:
struct MyStruct {
    int *const a;
    double *const b;
};
#include <ostream>
#include <iostream>
int main()
{
    int i; double d;
    MyStruct ms = {&i,&d};
    *ms.a = 5;
    std::cout << sizeof ms << std::endl;
}
This would be one way for the compiler to achieve the behaviour of
"MyStruct". But again, it doesn't have to do it this way.
References are magical pixie dust at the end of the day. They have no
foundation in actual computer science -- they just do what they do.
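Going back to the sizeof question for a moment, here is a rough sketch that separates what is guaranteed from what merely tends to happen. WithRefs, WithPtrs and the two functions are names made up for the example. As far as I know, the only thing the Standard pins down here is that sizeof applied to a reference type gives the size of the referred-to type. Whether the two structs come out the same size is up to the implementation.

```cpp
#include <cassert>

struct WithRefs {
    int &a;
    double &b;
};

struct WithPtrs {
    int *const a;
    double *const b;
};

// Guaranteed: sizeof on a reference type yields the size of the
// type referred to, not the size of any hidden pointer.
bool sizeof_reference_is_sizeof_referent()
{
    return sizeof(double &) == sizeof(double);
}

// Not guaranteed: true on implementations that store reference
// members as pointers, but the Standard does not require it.
bool layouts_match_here()
{
    return sizeof(WithRefs) == sizeof(WithPtrs);
}
```

Printing layouts_match_here() on your own compiler is a quick way to see which strategy it appears to use for reference members.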