Multiple Inheritance & Virtual Inheritance in Visual C++

Inheritance, and by extension polymorphism, is one of the more powerful tools of object-oriented programming. Most people take the implementation of inheritance for granted, but sometimes it is important to know what happens behind the scenes so that we are conscious of the cost and implications of our design choices.

Regular Inheritance – AKA Non-Virtual Inheritance

Whenever inheritance is involved, there must always be a base class which does not inherit from any other class. This class is called a primary class. Whenever a class has virtual functions, that class must implicitly have an additional pointer called the vfptr, even if the class is a primary class. This is easy to check: write a class which has one or more virtual functions in its definition and no data members, and applying the sizeof operator to the class will reveal that it is the size of one pointer. This additional pointer is used to point to the virtual function table. The virtual function table is a table with pointers to the correct versions of the virtual functions for a given class that either has or inherits virtual functions, and it is effectively what gives us polymorphism in C++. Each time you create a class that contains virtual functions, or that inherits virtual functions, the compiler creates a unique virtual table for that class. If this doesn’t sound familiar, I recommend reading further about virtual tables. The following link might help:

https://www.learncpp.com/cpp-tutorial/125-the-virtual-table/

One thing to note with regular inheritance is that it doesn’t matter how deep the inheritance is: the overhead is always the size of one pointer. This is because if C inherits from B, and B inherits from A, all three of these classes can share the same vfptr. Each class still gets its own virtual function table, but B’s table extends A’s and C’s extends B’s, so a single vfptr at the top of the memory layout, pointing at the table for the object’s actual type, is sufficient to provide all the indirection that A, B and C need. To that end, also notice that the A, B and C views of an object all have the same memory address; the “this” pointer points to the single shared vfptr.

 

Multiple Inheritance

While multiple inheritance is typically advised against, there are still situations where it arises as a solution to a problem. Let us consider the case where we have primary classes A and B, and then C inherits from both non-virtually.

If A has any virtual functions, then we know it must have a vfptr, and the same applies to class B. However, there is no way that these vfptrs can be merged (as in the case of single non-virtual inheritance) because B does not inherit from A. The “this” pointer for the instance of C will point to either the start of the A subobject or the start of the B subobject, where the start of either subobject is its vfptr. Which vfptr C’s pointer points to depends on the compiler; I believe Visual C++ lays out base classes left to right, in the order they are declared. We can verify that there are two vfptrs by evaluating sizeof(C), and indeed the size comes to that of two pointers, assuming no data members.

It is worth noting here that, unlike in single non-virtual inheritance, typecasting a pointer to C to either A* or B* might actually change the memory address the pointer points to. For this reason, using reinterpret_cast<A*>(pInstanceOfC) or reinterpret_cast<B*>(pInstanceOfC) is a very bad idea. reinterpret_cast does not adjust its argument in any way; it simply says “interpret the bits of my argument as the type given as the template argument” and nothing more. However, one of A or B does not start at the same address as C, so this will go south very fast. A static_cast (or an implicit conversion) applies the necessary offset.

Now, an interesting situation with multiple inheritance is the classic diamond problem, where we have the following:

[Image: the diamond hierarchy, in which B and C each inherit from A and D inherits from both B and C]

If we follow the non-virtual inheritance method above, then we will have two instances of A. D will encapsulate an instance of B and an instance of C, each of which will encapsulate its own instance of A. This may be the behavior you want, but for most people it causes ambiguity. If I want to typecast D to A, which A are we talking about? If I want to access a member variable of A, which A are we talking about? There are, after all, two distinct memory addresses for A because there are two instances of A. If you really want to work with two instances of A, you can resolve these issues by first typecasting D to either B or C, and then typecasting that to A. This effectively allows you to specify which A you are talking about. However, what if we only want a single instance of A? Perhaps A is just an interface with no data members and only pure virtual functions (this is the case for COM objects, which I discuss in a different blog post). This is where virtual inheritance comes in.

 

Virtual Inheritance

Virtual inheritance enables us to solve the diamond problem in multiple inheritance by guaranteeing that there is only one instance of a base class that is being virtually inherited. In code it might look something like this:

[Image: code listing in which B and C each inherit virtually from A, and D inherits from B and C]

In this case, both B and C virtually inherit A, and now typecasting D into A is no longer ambiguous because virtual inheritance guarantees that there will only be one instance of A. Seems like virtual inheritance is a good fix for our problems! Well, as with all things, there are always a few gotchas. The above example is one of the better use cases for virtual inheritance, because A is an interface with no data members and thus it is straightforward to work with. A less straightforward situation might be the following:

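[Image: code listing in which A has an m_Val member that its default constructor sets to 999, B and C each pass a constructor argument on to A, and D takes val1 and val2 and passes them to B and C]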

If we were to construct D with the values 1 and 2 as constructor arguments, then what value does m_Val in the single instance of A end up with? A decent guess is “either 1 or 2”, but that is actually not the case. The answer is 999. This seems contrary to what we might expect.

Consider for a moment how virtual inheritance might work. Both classes B and C virtually inherit A, and thus they share a common instance of A, but who constructs A, and who decides where A is placed in memory? In non-virtual inheritance B and C would be responsible for the construction of A, because they own A, but that is no longer the case. In fact, the object that now owns A is the most derived class, D. The exact layout of memory will depend on the compiler. The patent for Visual C++’s implementation of inheritance suggests that the resulting memory layout will look like the following:

[Image: memory layout of D under virtual inheritance]

Since D is now the owner of the instance of A, it is up to D to construct A. However, in the code sample we wrote, D’s constructor never calls the constructor for A, and so the default constructor is called. This assigns the value of 999 to m_Val. Once A has been constructed, it cannot be constructed again, so B and C never actually call the constructor for A, and val1 and val2 are ignored. However, while B and C cannot reconstruct A, they can modify A, and so we can write the following code:

[Image: code listing in which the constructors of B and C assign to m_Val in their bodies]

Now, if I print m_Val after constructing D, m_Val will always be 11. A is default constructed by D, setting m_Val to 999, then the constructor for B is called, setting m_Val to 10, and then the constructor for C is called, setting m_Val to 11. This matches the order in which D inherits from B and then C; swapping the order of inheritance would cause m_Val to be 10. This means that writing a constructor that uses a member initializer list will behave differently from writing a constructor that assigns values to members in the constructor body. You can see now why virtual inheritance might get messy. This behavior is not intuitive and might be overlooked when writing code, leading to bugs later on.

 

Another interesting thing worth looking at is how B and C know where A is in memory. After all, casting D to B must give us a fully functional B, and for that to work, B must know where to find A. However, the position of A depends on the most derived type and thus varies. To resolve this, Visual C++ implements a virtual base table pointer (vbptr from here on). The vbptr points to a virtual base table that stores the offset from the address of B to the address of A. Thus, if the starting address of B is hypothetically 12 bytes away from the starting address of A, then the vbptr points to a table that stores the base offset value “12”. The base offset value will be different for B and C because their starting addresses are different. The virtual base table is a different table from the virtual function table, at least in the implementation of Visual C++ (this could vary for other compilers, which might merge the tables). This means that every class which virtually inherits from another class must store a vbptr to find its virtual base table. We can check by evaluating sizeof(D) on the above code, and we confirm that the size of D is the size of an integer plus the size of two pointers (one vbptr for B and one vbptr for C), plus any alignment padding.

Similar to how virtual functions introduce some small overhead due to the extra level of indirection required to call those functions, virtual inheritance introduces a small overhead due to the extra level of indirection required for a class to access its virtual base.

 

We looked at the size footprint of vbptrs in virtual inheritance, but what about vfptrs? In the case where D inherits from B and C, and B and C inherit non-virtually from A, we need at most two vfptrs. This is because if A declares some virtual functions, and then B and C both declare some new virtual functions, then with non-virtual inheritance we can merge the virtual functions of B with A, and the virtual functions of C with A, to create two virtual function tables. Pointing to those two virtual function tables requires two vfptrs, and we can confirm this with the sizeof operator.

However, when using virtual inheritance, two virtual tables are not sufficient. If we inherit virtually from a base class, we cannot merge the virtual function tables of the derived class and the virtual base. The reason is that the moment you virtually inherit from a base class, you are fundamentally declaring that there might exist other classes which also virtually inherit from the same base class in a given dynamic type, and so more than one class will inherit from the same instance of the virtual base. This should make sense. At the end of the day, if no other class is going to inherit virtually from your virtual base, then why have a virtual base at all? If only B will inherit from a given instance of A, then you may as well just use non-virtual inheritance; you would only use virtual inheritance if there MIGHT be another class which inherits from the same instance of A. In this case, for the dynamic type D, there is indeed an instance of class C which virtually inherits from the same instance of A as B does. If there wasn’t, B could just non-virtually inherit from A.

Notice how above I said that there “MIGHT” be another class which inherits from the same instance of A. The issue is that B doesn’t know for a fact that there is one, and even if it did, it doesn’t know which other classes are involved. All these factors depend on the dynamic type, and cannot be known when B is compiled. If we follow the optimization of non-virtual inheritance, and merge the new virtual functions that B introduces with the virtual functions of A to form a new virtual function table, then it stands to reason that we must do the same with C. Now suddenly we have three classes’ worth of virtual functions in one virtual table. If we typecast D to type C, then call one of C’s functions, how will C know where to find its own function in the virtual function table? C doesn’t know that the dynamic type D also has an instance of B that added virtual functions into the same virtual function table. As such, it will try to index into the virtual function table where it believes its own functions are, but instead it finds B’s functions... or maybe C’s functions are actually there, depending on the order in which the compiler places the functions in the virtual function table. Clearly this isn’t going to work out. It is for this reason that if a class derives virtually from a base while introducing new virtual functions of its own, it makes its own virtual function table with the new virtual functions, rather than merging with the virtual base class’s virtual function table.

Let us go back to our example of primary class A, B virtually inheriting from A, C virtually inheriting from A, and D inheriting non-virtually from B and C. If A introduces some virtual functions, it makes a virtual table and so has its own vfptr. If B doesn’t introduce new virtual functions, but only overrides virtual functions from A, then we don’t need a new vfptr; instead the vfptr of A just points to a different virtual function table now. If C introduces some new virtual functions, it cannot merge with the virtual function table of A, so a new one is made, and C has its own vfptr pointing to this new table. Finally, if D doesn’t introduce new virtual functions of its own, then nothing more is needed. However, even if D introduces new virtual functions, an extra vfptr is not needed. Instead, both of the previous virtual function tables are extended to support D’s virtual functions, creating new virtual function tables, and the two vfptrs point to those instead.

The total overhead here is two pointers’ worth for the two vfptrs. If, however, B were to be modified to add new virtual functions of its own, then a new virtual function table is made that cannot be merged with A’s. We would need a pointer to this virtual function table, and so B now has its own vfptr as well. In this case the overhead becomes three pointers’ worth.

To conclude our example, if we have the following class:

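[Image: code listing of classes A, B, C and D with the virtual functions DoNothingA, DoNothingB, DoNothingC and DoNothingD]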

A will have its own vfptr (which will point to a virtual function table that points to the DoNothingA() that B defines). B and C will both have their own vfptr: B’s points to a virtual function table containing DoNothingB and DoNothingD, and C’s points to a virtual function table containing DoNothingC and DoNothingD. B and C will also each have a vbptr to help them locate A, as the position of A depends on the most derived type (in this case D). The total memory footprint of all this indirection will be five pointers’ worth: three vfptrs and two vbptrs.

 

If the above is still somewhat confusing, I recommend a quick look at Figure 15 from Microsoft’s patent on how their compiler handles inheritance. There isn’t any multiple inheritance in it, but the virtual inheritance concepts are still highlighted.

[Image: Figure 15 of the patent, showing classes A1, B3 and C3]

Here we have A1 as the primary class, and it has two functions, fa11 and fa12. B3 inherits virtually from A1 and so has a vbptr. B3 overrides A1::fa11 and also introduces its own virtual functions fb31 and fb32, which means that it must have its own vfptr. Finally, C3 inherits non-virtually from B3. It overrides B3::fb32, thus modifying the virtual function table that B3::vfptr points to, and it also introduces its own new virtual function fc31. Since C3 non-virtually inherits from B3, C3::fc31 can simply be appended to the virtual table of B3.

 

 

Resources:

https://patents.google.com/patent/US5754862

Good and bad sides of virtual inheritance in C++

MemoryLayoutMultipleInheritance.pdf
