Memory alignment bugs, old and new! (fun with Havok)

Writing low level code (assembly) on a system with an .. interesting architecture meant that I soon experienced the joy of memory alignment bugs. To the uninitiated these typically present themselves in a completely mysterious manner. A perfectly functioning piece of code runs happily, then you make a simple, seemingly unrelated change somewhere else in the code, possibly even in a section that isn’t being executed, and BOOM! Either weird behaviour, or an outright crash!

The first few times I experienced these kinds of bugs it was pretty harrowing, as the changes I was making simply didn’t add up to the output I was seeing. It was almost taking me back to the dark days of computers working on voodoo and black magic! Thankfully, I eventually became able to spot these kinds of errors for what they were, simply alignment errors!

An alignment error is where an instruction (or a piece of data) sits at a memory address that the CPU cannot properly access it from. A simple example is the 16-bit Motorola 68000, where all instructions must be even aligned. That is, every instruction must start on an even memory address, and the same goes for word and long word data accesses. Addresses of 4, 6, 8, 100 and 4096 are all even, and hence fine for an instruction; 5, 7, 9, 101 and 4097 are not. On this CPU you get a nice helpful address error exception, which helps you realise what is wrong and start working to fix it.
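The even/odd check above is easy to express in code. Here is a minimal C++ sketch, assuming the alignment is a power of two; the helper names `is_aligned` and `is_pointer_aligned` are my own and not from any library:

```cpp
#include <cstddef>
#include <cstdint>

// Returns true when `address` is a multiple of `alignment`.
// Assumption: `alignment` is a power of two (2, 4, 8, 16, ...).
constexpr bool is_aligned(std::uintptr_t address, std::size_t alignment) {
    return (address & (alignment - 1)) == 0;
}

// Same check for a real pointer (hypothetical helper name).
inline bool is_pointer_aligned(const void* p, std::size_t alignment) {
    return is_aligned(reinterpret_cast<std::uintptr_t>(p), alignment);
}

// The example addresses from the text, checked for even (2 byte) alignment.
static_assert(is_aligned(4, 2) && is_aligned(6, 2) && is_aligned(8, 2),
              "even addresses are fine");
static_assert(is_aligned(100, 2) && is_aligned(4096, 2),
              "even addresses are fine");
static_assert(!is_aligned(5, 2) && !is_aligned(7, 2) && !is_aligned(9, 2),
              "odd addresses are not");
static_assert(!is_aligned(101, 2) && !is_aligned(4097, 2),
              "odd addresses are not");
```

The same helper works for stricter requirements too, for example passing 16 when checking data that is meant to sit on a 16 byte boundary.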

This is quite a well understood bit of low level programming knowledge, so why mention it here? Well, I am currently spending my free time playing with some PC development in C++, using some of the wealth of freely available tech out there to create something. Even in this somewhat higher level language, I have once again fallen foul of the alignment error!

This is not something I would expect in a compiled language. I am quite new to using C++, especially at this level of complexity, so the seemingly random crashes and exception errors completely caught me off guard. I was feeling quite safe, thinking that the compiler would simply “do the right thing(tm)” and all the woes of low level assembly programming would be behind me, but no! It seems the alignment error persists.

In this case it has occurred whilst working with the rather amazing Havok Physics engine (a free binary only version with limited license is available thanks to Intel here). Naturally, processing complex physics maths at the scale that Havok does requires a fair amount of CPU, and as such they have optimised their engine over time. This has led to the use of their own custom memory management code and lots of clever ways to optimise memory and CPU time, some of which require specific memory alignment!!!

I am not one for reading manuals.. I like to jump in and learn by playing with tech. Unfortunately for me, there is a LOT to learn, and through what must be pure luck I managed to achieve some exciting results quickly. As I have refined this knowledge, however, my luck buffer has clearly depleted and I have started hitting issues.. so in typical man style.. now that smoke is pouring out, I am consulting the manual 🙂

Frankly, I’d recommend spending a good chunk of time reading through the Havok user manual. There is a wealth of knowledge and examples in it, and it actually (for me at least) makes for interesting reading!

For anyone else out there battling with their compiler, wondering why their code sometimes works and sometimes doesn’t, with even running the same binary 2-3 times yielding different results and errors.. You may have an alignment issue! I found this article on the Intel forum enlightening, followed by a bit of a read of the “Memory” section of the manual.

I have now adjusted the base class for my physics object to inherit from the hkReferencedObject class, and so far things seem better. If your class cannot inherit from this, then the macros HK_DECLARE_NONVIRTUAL_CLASS_ALLOCATOR and HK_DECLARE_CLASS_ALLOCATOR may be your salvation. These will ensure that the Havok memory allocator and correct alignment are used when a new object is constructed (hopefully 🙂 )
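To get a feel for what a "class allocator" macro can do, here is a rough sketch in plain C++. This is NOT the Havok implementation and I have not checked what the real macros expand to; every name in it (`aligned_alloc_sketch`, `aligned_free_sketch`, `DECLARE_ALIGNED_ALLOCATOR_SKETCH`, `PhysicsObject`) is made up for illustration. The idea is simply that a class can supply its own operator new and operator delete, so that every `new` of that class hands back suitably aligned memory:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

// Hypothetical helper: over-allocate with malloc, round the address up to
// `alignment`, and stash the original pointer just before the aligned block.
// Assumption: `alignment` is a power of two and at least sizeof(void*).
inline void* aligned_alloc_sketch(std::size_t size, std::size_t alignment) {
    // Room for the payload, worst-case padding, and the stashed pointer.
    void* raw = std::malloc(size + alignment - 1 + sizeof(void*));
    if (!raw) throw std::bad_alloc();
    std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
    std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
    std::uintptr_t aligned = (start + mask) & ~mask;
    reinterpret_cast<void**>(aligned)[-1] = raw;  // remembered for free()
    return reinterpret_cast<void*>(aligned);
}

// Hypothetical helper: recover the stashed malloc pointer and free it.
inline void aligned_free_sketch(void* p) {
    if (p) std::free(static_cast<void**>(p)[-1]);
}

// Hypothetical macro in the spirit of a class allocator declaration:
// it gives the class its own operator new / operator delete.
#define DECLARE_ALIGNED_ALLOCATOR_SKETCH(ALIGNMENT)                        \
    static void* operator new(std::size_t size) {                          \
        return aligned_alloc_sketch(size, ALIGNMENT);                      \
    }                                                                      \
    static void operator delete(void* p) { aligned_free_sketch(p); }

// Hypothetical physics object that wants 16 byte alignment.
struct alignas(16) PhysicsObject {
    DECLARE_ALIGNED_ALLOCATOR_SKETCH(16)
    float position[4];
    float velocity[4];
};
```

With this in place, `new PhysicsObject()` goes through the class's own allocator rather than the global one, so the returned address is always a multiple of 16, and `delete` releases it through the matching free function.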

In my case I was seeing errors like “Access violation reading location 0xffffffff”.