More D3D v-table hacking
Friday, August 6, 2010

As its title implied and the comments noted, the trick in the previous post won't give much return in performance for the time invested, although it might give you a deeper understanding of the underlying mechanisms and costs involved in calling virtual functions. So it's more of an academic exercise, but nonetheless a cool thing to play around with. It's also a prerequisite to the v-table hack I'm going to present in this blog entry, which, on the other hand, actually has some practical use.

Sometimes it's desirable to get a callback from the device when certain calls are made. Given that virtual calls are dispatched dynamically through the v-table, it's possible to reroute a call somewhere else with a little hacking. Normally the v-table is located in read-only memory and thus will not change. And once a class has been instantiated, its v-table pointer should not change for the lifetime of the object, unless you have a nasty memory overwrite bug, or are just hacking around like we're going to do. While we can't modify the v-table contents, we can easily overwrite the v-table pointer and set up our own custom v-table. The original v-table is readable, so the first thing we want to do is just copy the original v-table to our new one.

void **v_table = *(void ***) ctx;
// Copy v-table
memcpy(new_vtable, v_table, sizeof(new_vtable));
// Replace v-table pointer
*(void **) ctx = new_vtable;

The new_vtable would be declared like this:

void *new_vtable[115];

This array will of course have to be kept for as long as you keep the device context object alive, so a good choice is to put it next to the device context pointer, or for quick hacks, just make it static. Don't put it on the stack; the array would be gone as soon as the function returns, while the device context would still be pointing at it. The number 115 is the number of virtual functions in ID3D11DeviceContext. This number can be found by simply counting them in the header, including all virtual functions inherited from base classes. Or the easy way: just find the last function in the header, make a call to it and place a breakpoint on that call. When you hit the breakpoint, switch to disassembly view and check what offset into the v-table it's using. For ID3D11DeviceContext that's FinishCommandList(), which you'll find is at offset 0x1C8 in a 32-bit build, leading to an index of 0x1C8 / sizeof(void *) = 456 / 4 = 114, hence the v-table needs 115 elements.
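The offset-to-index arithmetic can be written down as a small compile-time check. This is just a sketch: VTableIndex is a hypothetical helper name, and the 64-bit offset is derived by assuming the same layout with 8-byte pointers, not something I measured in a debugger.

```cpp
#include <cstddef>

// Hypothetical helper: converts the byte offset seen in the disassembly
// into a v-table index, given the pointer size of the build.
constexpr std::size_t VTableIndex(std::size_t byteOffset, std::size_t pointerSize)
{
    return byteOffset / pointerSize;
}

// 32-bit build: FinishCommandList() at offset 0x1C8 (456 bytes) is entry 114,
// so the table needs 114 + 1 = 115 entries.
static_assert(VTableIndex(0x1C8, 4) == 114, "index of the last entry");
static_assert(VTableIndex(0x1C8, 4) + 1 == 115, "number of entries");

// Assumption: in a 64-bit build the same entry would sit at 114 * 8 = 0x390.
static_assert(VTableIndex(0x390, 8) == 114, "same index with 8-byte pointers");
```

The index is what matters for the array size; the byte offset you read in the disassembly depends on whether you're looking at a 32-bit or 64-bit build.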

If you run this code, everything should just run as usual with no noticeable change, since we replaced the v-table with an identical one. Now let's say we want to count how many DrawIndexed() calls we're making every frame. We could then just override the DrawIndexed() function with our own substitute and point the corresponding entry in new_vtable to it.

UINT g_DrawCount = 0;
void STDMETHODCALLTYPE DrawIndexedCallback(ID3D11DeviceContext *ctx, UINT IndexCount, UINT StartIndexLocation, INT BaseVertexLocation)
{
    ++g_DrawCount;
    MyDrawIndexed(ctx, IndexCount, StartIndexLocation, BaseVertexLocation);
    //ctx->DrawIndexed(IndexCount, StartIndexLocation, BaseVertexLocation);
}

...

// 12 is the v-table index of DrawIndexed()
new_vtable[12] = (void *) &DrawIndexedCallback;

Now every call made to DrawIndexed() will end up in DrawIndexedCallback() instead. After incrementing our counter it would be nice to call the original DrawIndexed() function as well, so that things are actually drawn too. Notice that I'm using the function pointer approach from the previous blog post to call the original function. Why am I not using the code that's commented out? It would intuitively seem like a good idea, but since we rerouted the call, it would now call right back into DrawIndexedCallback() instead of the original code, so we'd recurse infinitely. That leads to a stack overflow in Debug builds and a hang in Release builds, where the compiler is smart enough to realize that it doesn't need to fetch the arguments and push them back on the stack but can just jump to DrawIndexed() directly.
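To try the whole thing without a D3D device, here's a minimal sketch of the same hack on a toy class. Everything in it (Device, DrawHook, InstallHook, RunHookDemo) is a hypothetical name made up for illustration. It assumes a mainstream ABI where the v-table pointer sits at offset 0, entries are in declaration order, and a member function is called the same way as a free function taking the object pointer as its first argument. That holds on the usual 64-bit targets, but not for plain member functions in 32-bit MSVC builds, which is part of why COM interfaces use STDMETHODCALLTYPE. It's formally undefined behavior, and I'm using memcpy instead of pointer casts for the pointer swap only to keep the optimizer from getting clever.

```cpp
#include <cassert>
#include <cstring>

// Toy stand-in for a COM-style interface (hypothetical, not a D3D type).
struct Device
{
    int scale;
    Device() : scale(2) {}
    virtual int Draw(int count) { return count * scale; } // v-table slot 0
    virtual int Clear()         { return -1; }            // v-table slot 1
};

// Assumed calling convention: 'this' is passed like a first argument.
typedef int (*DrawFn)(Device *self, int count);

static const int kNumVirtuals = 2;       // Draw and Clear
static void  *g_hookTable[kNumVirtuals]; // static: must outlive the hooked object
static DrawFn g_originalDraw = nullptr;
static int    g_drawCount = 0;

static int DrawHook(Device *self, int count)
{
    ++g_drawCount;
    // self->Draw(count) would dispatch through the hooked table and recurse,
    // so call the saved original entry instead.
    return g_originalDraw(self, count);
}

void InstallHook(Device *dev)
{
    // Read the object's v-table pointer and copy the original table.
    void **original;
    std::memcpy(&original, static_cast<void *>(dev), sizeof(original));
    std::memcpy(g_hookTable, original, sizeof(g_hookTable));

    // Save the original Draw entry, then redirect slot 0 to the hook.
    g_originalDraw = reinterpret_cast<DrawFn>(original[0]);
    g_hookTable[0] = reinterpret_cast<void *>(&DrawHook);

    // Replace the object's v-table pointer with our copy.
    void **replacement = g_hookTable;
    std::memcpy(static_cast<void *>(dev), &replacement, sizeof(replacement));
}

// Calls Draw() 'calls' times on a hooked object; returns the hook's counter.
int RunHookDemo(int calls, int *sumOut)
{
    Device device;
    Device *volatile opaque = &device; // hide the dynamic type from the optimizer
    Device *dev = opaque;
    InstallHook(dev);

    g_drawCount = 0;
    int sum = 0;
    for (int i = 0; i < calls; ++i)
        sum += dev->Draw(i);           // goes through DrawHook

    assert(dev->Clear() == -1);        // untouched slot still works
    *sumOut = sum;
    return g_drawCount;
}
```

Calling RunHookDemo(5, &sum) should report five intercepted calls, and sum ends up being what the original Draw() would have produced, which shows the saved pointer really reaches the original code.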

So why use this instead of just using proper API design and hiding the gory details of D3D calls under a thin interface? Just make your own DrawIndexed function which does what this DrawIndexedCallback() does and simply call that instead, right? Absolutely. I would most certainly recommend that over this hack whenever possible. However, what if you're using a third party library which is making calls on the D3D device directly? Using this trick you can track or even alter the behavior of the third party library. In fact, during the development of Just Cause 2 there was one instance where I had to resort to this trick. Using PIX I had concluded that a third party library was probably not working optimally, but I wanted to be sure that I wasn't making any false assumptions; the library could after all be smarter than I thought. So I overrode a few D3D functions to alter the behavior and could conclude that I was in fact right. After this we of course communicated with the third party, they were very helpful, and the problem was promptly resolved. Naturally none of the v-table overriding code was left in the shipping product or even entered into source control, but it was nice to have that backdoor to override the library and prove my theories right.

I recommend this trick strictly as a development hack. I would argue against shipping any such code in a final product. The main reason is that while it'll probably work, there are no guarantees. While the size of the v-table is known for ID3D11DeviceContext, it's not known for any derived classes that the D3D runtime doesn't expose directly. They could in fact have more virtual functions than the base class, and theoretically those could be called from any of the functions they share with ID3D11DeviceContext. Such a call would crash with our replaced v-table, since it would just grab whatever random pointer happens to be in memory after the custom v-table. In practice I've had no such problems with D3D device contexts, but there's nothing saying that it wouldn't break when Windows 8 or 9 ships with a brand new implementation of the runtime components. To play it "safe" one could of course just make the custom v-table bigger and copy more stuff into it. Make it sufficiently large, like 256 entries, and it's probably safe for the entire lifetime of the DX11 API including all future OSes' runtime implementations, but personally I would avoid shipping such code to the public.
