About the address offset when assigning the address of a derived class object to a base class pointer.

junyangzou

Say we have two base classes, Base1 and Base2, and a class Derived : public Base1, public Base2.
When we assign the address of a Derived object to a Base2 pointer, the address needs to be adjusted to the beginning of the Base2 subobject:
Base2 *pBase2;
Derived d;
pBase2 = &d;

On my machine, the values of pBase2 and &d are 0079FAB0 and 0079FAB8 respectively, indicating that sizeof( Base1 ) is 8 bytes.

But when and where is the offset added? This cannot be done at compile time. The example above is quite straightforward, but sometimes we may write code like:
pBase2 = Factory.get();

We know the real type of the returned object only at runtime. So can anyone explain when and where the offset is added? Thanks!
 
Tobias Müller

The "real" type of the object actually doesn't matter. What matters is the
(statically known) type of the return value of Factory.get(). This type has
to be a common base class of all possible "real" types and the pointer
adjustment from the "real" type to that common base type happens _inside_
the Factory.get() method, where the "real" type is statically known.

Tobi
 
junyangzou

This makes sense. So if the Factory method returns a pointer to Derived ( Derived* ), while the real object is an instance of a class derived from Derived, say Derived2, the pointer may (if needed) be offset twice?
 
junyangzou

Paavo said:
Yes, if there is again multiple derivation involved. E.g.

class Base1 {...};
class Base2 {...};
class Base3 {...};

class Derived1: public Base1, public Base2 {...};

class Derived2: public Base3, public Derived1 {...};

if we now have:

Derived2* x = new Derived2();
Derived1* y = x;
Base2* z = y;

then a typical compiler would translate this into something approximately like this (space optimizations might change this a bit):

// pseudocode (offsets in bytes):
x = ...
y = x + sizeof(Base3);
z = y + sizeof(Base1);



Note that after the adjustment has been done, the rest of the code working with the base class pointer does not have any a priori knowledge that the complete object is Derived2. If this becomes important, one must add relevant virtual functions to the classes so that all the subobjects have vtable pointers where the code can dig up this information at run-time, for example when processing things like

Derived2* w = dynamic_cast<Derived2*>(z);

or

delete z;

Yes, now I get it. I am reading about the implementation details of vtables, and I see that in some cases a virtual function needs to adjust the `this` pointer back to the original address. That is why I was wondering when it gets offset from the original, hah. Thank you!
 
junyangzou

If the get() function returns a pointer of type Base2*, then it will adjust said value before returning it.

If the get() function returns a pointer of type Derived*, then that assignment will adjust the value.

Yep, I was trying to understand it in a more complicated inheritance hierarchy; you can refer to the example given by Paavo. But basically your idea is the same.
 
Öö Tiib

junyangzou said:
Say we have two base classes, Base1 and Base2, and a class Derived : public Base1, public Base2.

That means every object of type 'Derived' contains one base subobject of type 'Base1' and another of type 'Base2'.

junyangzou said:
When we assign the address of a Derived object to a Base2 pointer, the address needs to be adjusted to the beginning of the Base2 subobject. On my machine, the values of pBase2 and &d are 0079FAB0 and 0079FAB8 respectively, indicating that sizeof( Base1 ) is 8 bytes.

Not exactly. Your experiment indicates that the offset of the 'Base2' base subobject within an object of type 'Derived' is 8 bytes. The difference between the offsets of subobjects does not accurately indicate the size of a subobject, since there may be padding (to satisfy alignment) between the subobjects.

junyangzou said:
But when and where is the offset added?

There is an implicit conversion to a pointer to the base class, as if we had written:

pBase2 = static_cast<Base2*>(&d);

C++ language rules do not require us to write that cast out explicitly so
we can write:

pBase2 = &d;

That cast simply adds the offset of the 'Base2' subobject within 'Derived' to the pointer returned by the unary operator&.

junyangzou said:
This cannot be done at compile time. The example above is quite straightforward, but sometimes we may write code like pBase2 = Factory.get(); and we know the real type of the returned object only at runtime. So can anyone explain when and where the offset is added? Thanks!

The pointer always actually points at an object of type 'Base2', regardless of the exact type of the derived object that contains that subobject.
 
junyangzou

I think the alignment padding is always there, and sizeof(Base) will yield the size after alignment.

+1 for the static cast, it's more intuitive.
 
Öö Tiib

Base2 might need a stricter alignment than Base1; sizeof(Base1) will
only include the padding needed for Base1's own alignment.

For example, if Base1 only had a char and Base2 had a double, Base1
might be able to be aligned on a char boundary (if it has a virtual
function, it probably goes up to the size of a vtable pointer, which
might be 4 bytes on a 32-bit system).

Also not quite. ;) Without a vtable, your example gives 1 and 8 on a
typical desktop platform. Here is the code:

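#include <iostream>

struct X { char c; };
struct Y { double d; };
struct Z : X, Y {};

int main()
{
    Z z;
    // Convert to Y* first so the base-subobject offset is applied,
    // then compare the two byte addresses.
    std::cout << "size of X: " << sizeof(X)
              << "\noffset of Y: "
              << reinterpret_cast<char*>(static_cast<Y*>(&z))
                 - reinterpret_cast<char*>(&z)
              << std::endl;
}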

And here is the output:

size of X: 1
offset of Y: 8

However, with a vtable the alignment requirement of X grows to 4 on a
32-bit platform, so sizeof(X) grows to 8 (with 3 bytes of padding at the
end). Replace the definition of X with:

struct X {char c; virtual void f() {}};

And here is the output:

size of X: 8
offset of Y: 8

An experiment like that may be what confused junyangzou.
 
junyangzou

Yes, this makes sense. POD types are more straightforward for this kind of alignment analysis. I found these general rules useful:
1. Each member should be aligned to its own size (more precisely, to its alignment requirement).
2. The overall size should be a multiple of the strictest alignment requirement among the members.
The first rule is obvious. The second one matters when we put objects into an array: it makes sure that the later elements are aligned correctly too.
Say we have a struct { double; char; } (x marks padding):
0-------8-------16------24------32
|double |charxxx|double |charxxx|
Without the second rule, the double in the second element would not be correctly aligned.
 
