C++ vtables - Part 2 - Multiple Inheritance
Tue, Mar 8, 2016
The world of single-parent inheritance hierarchies is simpler for the compiler. As we saw in Part 1, each child class extends its parent vtable by appending entries for each new virtual method.
In this post we will cover multiple inheritance, which complicates things even when only inheriting from pure interfaces.
Let’s look at the following piece of code:
class Mother {
public:
    virtual void MotherMethod() {}
    int mother_data;
};

class Father {
public:
    virtual void FatherMethod() {}
    int father_data;
};

class Child : public Mother, public Father {
public:
    virtual void ChildMethod() {}
    int child_data;
};
Child’s layout |
---|
_vptr$Mother |
mother_data (+ padding) |
_vptr$Father |
father_data |
child_data [1] |
Note that there are 2 vtable pointers. Intuitively I’d expect either 1 or 3
pointers (Mother, Father and Child). In reality it’s impossible to have a
single pointer (more on this soon), and the compiler is smart enough to combine
Child’s vtable entries as a continuation of Mother’s vtable, thus saving 1
pointer.
Why can’t Child have one vtable pointer for all 3 types? Remember that a
Child pointer can be passed to a function accepting a Mother pointer or a
Father pointer, and both will expect the this pointer to hold the correct
data in the correct offsets. These functions don’t necessarily know of Child,
and definitely shouldn’t assume that a Child is really what’s underneath the
Mother/Father pointer they have in their hands.
[1] Unrelated to this topic, but interesting nonetheless: child_data is
actually placed inside Father’s padding. This is called ‘tail padding’, and
might be the topic of a future post.
Here’s the vtable layout:
Address | Value | Meaning |
---|---|---|
0x4008b8 | 0 | top_offset (more on this later) |
0x4008c0 | 0x400930 | pointer to typeinfo for Child |
0x4008c8 | 0x400800 | Mother::MotherMethod(). _vptr$Mother points here. |
0x4008d0 | 0x400810 | Child::ChildMethod() |
0x4008d8 | -16 | top_offset (more on this later) |
0x4008e0 | 0x400930 | pointer to typeinfo for Child |
0x4008e8 | 0x400820 | Father::FatherMethod(). _vptr$Father points here. |
In this example, an instance of Child will have the same address when cast
to a Mother pointer. But when casting to a Father pointer, the compiler
adjusts the this pointer by an offset so that it points to the _vptr$Father
part of Child (the 3rd field in Child’s layout, see the table above).
In other words, for a given Child c; the following holds:
(void*)&c != (void*)static_cast<Father*>(&c). Some people don’t expect this,
and maybe some day this information will save you some debugging time. I found
it useful more than once.
But wait, there’s more.
What if Child decided to override one of Father’s methods? Consider this
code:
class Mother {
public:
    virtual void MotherFoo() {}
};

class Father {
public:
    virtual void FatherFoo() {}
};

class Child : public Mother, public Father {
public:
    void FatherFoo() override {}
};
This gets tricky. A function may take a Father* argument and call
FatherFoo() on it. But if you pass a Child instance, it is expected to
invoke Child’s overridden method with the correct this pointer. However,
the caller doesn’t know it’s really holding a Child. It has a pointer to the
offset inside Child where Father’s layout is. Someone needs to offset this,
but how is it done? What magic does the compiler perform to get this to work?
[Before we answer that, note that overriding one of Mother’s methods is not
really tricky as the this pointer is the same. Child knows to read beyond
the Mother vtable and expects the Child methods to be right after that.]
Here’s the solution: the compiler creates a ‘thunk’ method that corrects this
and then calls the ‘real’ method. The address of the thunk method will sit
under Child’s Father vtable, while the ‘real’ method will be under
Child’s vtable.
Here’s Child’s vtable:
0x4008e8 <vtable for Child>: 0x00 0x00 0x00 0x00 0x00 0x00 0x00 0x00
0x4008f0 <vtable for Child+8>: 0x60 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x4008f8 <vtable for Child+16>: 0x00 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400900 <vtable for Child+24>: 0x10 0x08 0x40 0x00 0x00 0x00 0x00 0x00
0x400908 <vtable for Child+32>: 0xf8 0xff 0xff 0xff 0xff 0xff 0xff 0xff
0x400910 <vtable for Child+40>: 0x60 0x09 0x40 0x00 0x00 0x00 0x00 0x00
0x400918 <vtable for Child+48>: 0x20 0x08 0x40 0x00 0x00 0x00 0x00 0x00
Which means:
Address | Value | Meaning |
---|---|---|
0x4008e8 | 0 | top_offset (soon!) |
0x4008f0 | 0x400960 | typeinfo for Child |
0x4008f8 | 0x400800 | Mother::MotherFoo() |
0x400900 | 0x400810 | Child::FatherFoo() |
0x400908 | -8 | top_offset |
0x400910 | 0x400960 | typeinfo for Child |
0x400918 | 0x400820 | non-virtual thunk to Child::FatherFoo() |
Explanation: as we saw earlier, Child has 2 vtables - one used for Mother
and Child, and the other for Father. In Father’s vtable, FatherFoo()
points to a thunk, while Child’s vtable points directly to
Child::FatherFoo().
And what’s in this thunk, you ask?
(gdb) disas /m 0x400820, 0x400850
Dump of assembler code from 0x400820 to 0x400850:
15 void FatherFoo() override {}
0x0000000000400820 <non-virtual thunk to Child::FatherFoo()+0>: push %rbp
0x0000000000400821 <non-virtual thunk to Child::FatherFoo()+1>: mov %rsp,%rbp
0x0000000000400824 <non-virtual thunk to Child::FatherFoo()+4>: sub $0x10,%rsp
0x0000000000400828 <non-virtual thunk to Child::FatherFoo()+8>: mov %rdi,-0x8(%rbp)
0x000000000040082c <non-virtual thunk to Child::FatherFoo()+12>: mov -0x8(%rbp),%rdi
0x0000000000400830 <non-virtual thunk to Child::FatherFoo()+16>: add $0xfffffffffffffff8,%rdi
0x0000000000400837 <non-virtual thunk to Child::FatherFoo()+23>: callq 0x400810 <Child::FatherFoo()>
0x000000000040083c <non-virtual thunk to Child::FatherFoo()+28>: add $0x10,%rsp
0x0000000000400840 <non-virtual thunk to Child::FatherFoo()+32>: pop %rbp
0x0000000000400841 <non-virtual thunk to Child::FatherFoo()+33>: retq
0x0000000000400842: nopw %cs:0x0(%rax,%rax,1)
0x000000000040084c: nopl 0x0(%rax)
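Like we discussed: the thunk offsets this and then calls Child::FatherFoo().
The add $0xfffffffffffffff8,%rdi instruction adds -8 to this, which is held in
%rdi. And by how much should we offset this to get to the Child? By
top_offset, which is -8 in the Father part of the vtable above!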
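[Please note that I personally think that the name non-virtual thunk is
extremely confusing as this is the entry in the virtual table to the virtual
function. I’m not sure what’s not virtual about it, but that’s just my
opinion. For what it’s worth, in Itanium C++ ABI terminology the ‘non-virtual’
part describes the this adjustment, which is a fixed offset to a non-virtual
base, as opposed to the ‘virtual thunks’ used with virtual inheritance.]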
Stay tuned for Part 3 - Virtual inheritance - where things get even funkier.