Memory alignment bugs, old and new! (fun with Havok)

Writing low level code (assembly) on a system with an .. interesting architecture meant that I soon experienced the joy of memory alignment bugs. These typically present themselves in a completely mysterious manner to the uninitiated. A perfectly functioning piece of code runs happily, then you make a simple non-impacting change somewhere else in the code, possibly even in a section of code that isn’t even being executed, and BOOM! either weird behaviour, or an outright crash!

The first few times I experienced these kinds of bugs it was pretty harrowing, as the changes I was making simply didn’t add up to the output I was seeing. It was almost taking me back to the dark days of computers working on voodoo and black magic! Thankfully, I eventually became able to spot these kinds of errors for what they were, simply alignment errors!

An alignment error is where an instruction (or a piece of data) sits at a memory address the CPU cannot handle. A simple example is the 16bit Motorola 68000, where all instructions must be even aligned. That is, every instruction must start on an even memory address. Addresses of 4, 6, 8, 100 and 4096 are all even, and hence fine for an instruction; 5, 7, 9, 101 and 4097 are not. On this CPU you at least get a nice helpful address error, which helps you realise what is wrong and start working to fix it.
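To make the even-address rule concrete, here is a minimal Python sketch (not 68000 code, and `is_aligned` is just a name made up for this example). It assumes the alignment is a power of two, which covers the even-address case as well as the larger alignments that physics and SIMD code tend to want.

```python
def is_aligned(address: int, alignment: int = 2) -> bool:
    """Return True if address is a multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # For a power of two, the low bits of an aligned address are all zero.
    return address & (alignment - 1) == 0


even_addresses = [4, 6, 8, 100, 4096]
odd_addresses = [5, 7, 9, 101, 4097]

for addr in even_addresses + odd_addresses:
    state = "ok" if is_aligned(addr) else "address error"
    print(f"{addr}: {state}")
```

The same check with `alignment=16` is the kind of test that would flag an object that needs 16 byte alignment but was handed an address that is only a multiple of 8.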

This is quite a well understood bit of low level programming knowledge, so why mention it here? Well, I am currently spending my free time playing with some PC development in C++, using some of the wealth of freely available tech out there to create something. In this somewhat higher level language I have once again fallen foul of the alignment error!

This is not something I would expect in a compiled language. I am quite new to using C++, especially at this level of complexity, so the seemingly random crashes and exception errors completely caught me off guard. I was feeling quite safe, thinking that the compiler would simply “do the right thing(tm)” and all the woes of low level assembly programming would be behind me, but no! It seems the alignment error persists.

In this case it has occurred whilst working with the rather amazing Havok Physics engine (a free binary only version with limited license is available thanks to Intel here). Naturally processing complex physics maths to the scale that Havok does requires a fair amount of CPU, and as such they have optimised their engine over time. This has led to their use of their own custom memory management code and lots of clever ways to optimise memory and CPU time, some of which require specific memory alignment!!!

I am not one for reading manuals.. I like to jump in and learn by playing with tech. Unfortunately for me, there is a LOT to learn, and through what must be pure luck I managed to achieve some exciting results quickly. As I have refined this knowledge however my luck buffer has clearly depleted and I have started hitting issues.. so in typical man style.. now that smoke is pouring out, I am consulting the manual :)

Frankly, I’d recommend spending a good chunk of time reading through the Havok user manual, there is a wealth of knowledge and examples in it, and it actually (for me at least) makes for interesting reading!

For anyone else out there battling with their compiler, wondering why their code sometimes works and sometimes doesn’t (sometimes even running the same binary 2-3 times will yield different results and errors).. you may have an alignment issue! I found this article on the Intel forum enlightening, followed by a bit of a read of the “Memory” section of the manual.

I have now adjusted my base class for my physics object to inherit from the hkReferencedObject class and so far things seem better. If your class cannot inherit from this then the macros HK_DECLARE_NONVIRTUAL_CLASS_ALLOCATOR and HK_DECLARE_CLASS_ALLOCATOR may be your salvation. These will ensure that the Havok memory allocator and correct alignment are used when a new object is constructed (hopefully :) )

In my case I was seeing errors like “Access violation reading location 0xffffffff”

So, cycling

Yeah, I like cycling now.. quite a bit :)

One thing I have noticed a few times on my commutes to and from work is how slow cars actually are. Sure when you are cruising along at 30-40 mph you are going a lot faster than your average cycle commuter.. but.. then comes the traffic…

Tonight for example I noticed a van at a junction heading the same way as myself. It pulled away and disappeared as I merrily pedalled away at my usual ~20MPH pace… but then at the next junction/lights I drew parallel to it. This happened several times along probably a 5 mile stretch of road until we finally went our separate ways.

I am a good cyclist, I stop at all red lights (even empty pedestrian crossings) and stay behind the white line at lights (I like not being fined or squished). So it is logical to say that whilst the van could easily hit speeds of 30MPH+ our overall average speed across that distance was the same, but I probably found it less frustrating as I quite enjoy my rides. Also cheaper, less pollution AND (motorists pay attention) I took up a lot less room on the road!

Now if only more people cycled to work, the roads would be a lot more pleasant and we’d probably all be a lot healthier… (and able to eat LOTS of CAKE! :D )

Curse you CPAN! (LIBXML2_2.6.0 not defined)

If you have seen the error in the title, possibly when starting up Apache2 on your Linux box, and you can’t for the life of you figure out why.. it MIGHT just be CPAN as the ultimate culprit. There is a good chance you have a conflicting version of libxml2 in /usr/local/lib, probably with a bunch of other libs too. As I didn’t install these myself, the only tool I can think of that could have put them there is CPAN, when a package was installed via it.

Thankfully the fix is simple, remove the libs from /usr/local/lib that are conflicting, and don’t use CPAN :)

This may break the Perl module that installed them, but hopefully will restore the usability of your system. My recommendation would be to only use packaged versions of your Perl modules to ensure system integrity and save yourself some headaches later in life.

Hectic!

Wow, life suddenly got VERY hectic for me!

Crazy busy at work with all manner of projects, all far too secret for me to divulge here :)  But those have to come second to what is really taking up my neural ticks in thought process..  My oh so lovely girlfriend… yeah yeah, I know, soppy blergh etc.. but this is a tad different!

How so? well (as I am sure anyone who has suffered being in my presence for the last few months can attest), she is representing Great Britain at the Paralympics!  So proud of my good lady am I! Hope you will join me in cheering her on, I’ll be there myself in person cheering away, for those who are not, I believe all the coverage is going to be on Channel 4.  Her events are Track cycling, Tandem B/Vi 1KM TT on 31st of August at 9:30 and the 3KM Pursuit on the 2nd of September at 9:30 with the final on the same day at 14:00.

Her athlete profile on Channel 4 is here

So proud, I am a very lucky boy!

Shower coding!!

Seems I have some of my best coding breakthroughs in the shower.  Hot on the heels of my recent sound engine release I had a think in the shower today about further optimisations I can probably build into it.  I have quite a significant rewrite planned, but a low level component of this rewrite worked on using 32bit cache buffers for sample data (recent release uses just 16bit ones).

As the samples used are all 8 bits, a 16bit word can obviously hold 2 samples, thus halving the number of times the DSP has to talk to main RAM: it reads the sample data once from main RAM and takes the next read from its own cache.  As the resampling will in some cases simply need the same sample n number of times, this saves a chunk of bus time.  The reason I chose 16bit originally is the DSP only has a 16bit data bus connection to the main RAM (more of an IO port style connection IIRC).

Thinking about it, I am assuming that 1x 32bit read will take less system resource/bus time than 2x 16bit reads separated by several ticks of other instructions.  Especially as whilst the DSP is making its 16bit reads, nothing else can be using the bus and will need to have been paused.  I am most impressed with how little additional code was needed to support this change. The current code has to translate the requested sample address to determine if it has it in cache, then retrieve just the single byte from the cache that is needed, and it all came together very nicely.
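The address translation described above can be sketched roughly like this. It is a Python model rather than DSP code, and the class name, the `bus_reads` counter and the single-word cache layout are all assumptions made for illustration, not the engine's actual structure.

```python
class SampleCache:
    """Hypothetical one-word cache in front of slow main RAM (8 bit samples)."""

    def __init__(self, ram: bytes, word_bytes: int = 4):
        self.ram = ram
        self.word_bytes = word_bytes  # 2 models the 16bit cache, 4 the 32bit one
        self.tag = None               # base address of the word currently cached
        self.word = b""
        self.bus_reads = 0            # how many times we had to go to main RAM

    def read_sample(self, address: int) -> int:
        # Translate the sample address to the base address of its cache word.
        base = address - (address % self.word_bytes)
        if base != self.tag:
            # Cache miss: fetch the whole word from main RAM in one go.
            self.word = self.ram[base:base + self.word_bytes]
            self.tag = base
            self.bus_reads += 1
        # Pull out just the single byte that was asked for.
        return self.word[address - base]


ram = bytes(range(16))
requests = [0, 0, 1, 1, 2, 3, 4]  # resampling often repeats an address

wide = SampleCache(ram, word_bytes=4)
narrow = SampleCache(ram, word_bytes=2)
wide_samples = [wide.read_sample(a) for a in requests]
narrow_samples = [narrow.read_sample(a) for a in requests]
print(wide.bus_reads, narrow.bus_reads)
```

Both caches return the same samples; the wider one simply goes to main RAM less often for the same request pattern, which is the whole point of the change.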

I imagine the overall gain will be quite minimal, but every little helps, I have sent the code off to be tested in a real world environment, hopefully there will be some positive results :)

Next up is a complete rewrite of the render subsystem for the sound engine to effectively invert the way the cache works and hopefully keep the DSP off the bus even more!

SoundEngine new release

Well as my very lovely lady is away, and I seem to have irked my hamstring somehow AND the weather here is utter rubbish at the moment I thought I would crack on with some improvements to my Sound Engine.

Last few days have been spent faffing with the Vibrato effect, this effect takes two parameters, one sets the frequency and the other the amplitude of a pitch distortion of a playing sound.  It always gives me a headache to code, and as my OCD tends to want perfection that irks me further.  The amplitude is based on 8ths of a semitone, of course as these are calculated by a non-linear expression it requires some look up action.. but then if you are coming off the end of a slide, there is a good chance that the current playback period may not actually exist in your lookup table.. so I engineered a less than accurate, but hopefully good enough work around.

So I compute a percentage of the current period which is approximately the same size as an 8th of a semitone between the current note and its neighbour, so whilst not 100% accurate, it works irrespective of the actual playback period and gives similar audible effects.
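For anyone wondering what that percentage looks like, here is a rough Python sketch. It assumes equal temperament, where one semitone is a frequency ratio of 2^(1/12), so an 8th of a semitone is 2^(1/96), which works out at a shade over 0.72% of the current period. The function name and the rounding are made up for this example and are not the engine's actual code.

```python
# One 8th of a semitone as a ratio, assuming equal temperament.
EIGHTH_SEMITONE_RATIO = 2 ** (1 / 96)
# The same step expressed as a fraction of the current period (about 0.0072).
STEP_FRACTION = EIGHTH_SEMITONE_RATIO - 1


def vibrato_offset(period: int, depth_eighths: int) -> int:
    """Approximate period change for a vibrato depth given in 8ths of a semitone.

    This is a linear approximation, so it drifts slightly at larger depths,
    but it needs no lookup table and works for any period, including the
    in-between periods you land on at the end of a slide.
    """
    return round(period * STEP_FRACTION * depth_eighths)


print(vibrato_offset(428, 1))  # one step at a period that is in the note table
print(vibrato_offset(431, 1))  # one step at an in-between period
```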

Overall I am quite pleased with my solution, that combined with the other effects I have added support for and the improved timing code so that non 50Hz timing based modules play correctly make quite a lot of modules that sounded a bit iffy now sound bob-on..

Of course this has meant I have had to rethink a lot of previous ideas, had new ideas and subsequently need to re-write a fair chunk of the code before I progress further.. or I will claw my own eyes out, but it’s all going nicely and I have my most hated effect completed, so the rest should be easy now… should be… :D

Anyone who is interested you can download it from the website here, complete with a changelog

Hooked… I think

Seems that the ride on Monday has had a lasting effect on me.. I enjoyed it so much I have a hankering for more of the same.  To fuel my newfound addiction I went and looked on bike-events.co.uk for more fun things to do… So I have signed up for the Manchester to Blackpool ride, and have intentions of also signing up for the Manchester to Chester and Manchester 100 rides too.. Just need to decide which (if any) charity I am going to do fund raising for…

Whilst these are steps in the right direction, they lacked a certain something, I wanted something a bit more.  So in an attempt to find this I had a mooch around on the British Cycling website.. and have now applied for a racing License! hoping to take part in some national races and see how I do in the rankings :)

My plan is to go to see a race on Tuesday after work, and if possible to take part in it myself the following week :)  I’ll post here following that I think.

Exciting times!

52 miles….

Today was the day of the Great Manchester Cycle.  13 miles of closed roads forming a loop that starts at the Etihad stadium along the Mancunian Way (A57M), through Media City and past Old Trafford before heading back along the same route.

The event has 3 flavours, 13 miles, 26 miles or 52 miles… of course, I went for the 52 mile option :) this also brought an additional caveat in that I was required to maintain a speed of 18MPH or faster on average to complete the course in time!  I thought what the hell and signed up anyway.

The 52 mile ride started this morning, setting off at 8:00AM with rider assembly at 7:30AM, of course I got a whole 5 and a bit hours sleep before hand :/ and probably didn’t eat the best of foods, but meh don’t want to make things easy for myself now do I!  I was hoping to wear just the event Jersey and my shorts, but despite the sun it was a little cold so I ended up needing my cycling jacket too.. so glad I did that!  My toes were frozen and I never really got too sweaty, not too bad a temp for the ride really (BBC Weather stated around 8 degrees C to about 12 degrees, my Garmin was saying about 13 degrees).

Once we set off the ride out was really rather enjoyable, this being my 1st ride of this type and my 1st real experience of bunches and chains.  WOW! how much fun are they!  Riding along the A57M itself is fun, but when you are doing 30MPH easily, slipstreaming off another rider or two.. awesome!  Which reminds me of another cool thing, the noise.. near silence (other than some chatting).. just the faint hum of wheels on tarmac.  I was quite pleased that I managed to get right up to the lead bunch at the start quickly and without having to push myself!  It didn’t last :( after the pinch point at Media City which almost has the riders single file onto a foot bridge I just lost the bunch and ended up riding solo.. and into that bloody wind, and up hill.. :( :( :(  So this really sucked the life and speed out of me.

Thankfully after each lap I learnt a bit more and started looking for wheels to hang off of, actively chasing after a chain to sit in its slip stream.. I was quite pleased to realise I had someone slipstreaming behind me for a fair turn of the route, spurred me on a little to maintain the 25MPH I was doing!

Towards the end of the ride I was pretty tired. This was the fastest, non-stop, constant pedalling ride I have ever done; my Manchester to Blackpool rides of previous years have been slower, and involved stops at lights and feed stations.  Today I stopped after the 1st lap for a loo break and that was it!  I didn’t free wheel too much, just kept spinning away.  Another boost to my speed and drive was switching the screen on my Garmin to show the current time and my ride time.. until that point I had been concentrating on my average speed more than anything, so had no idea of the time.  Switching to see the time revealed it was a lot earlier than I thought and indicated that I was doing better than I had thought! This really helped spur me on!

A brill day out, lots of fun, I will be doing another of these I think.  Which only leaves my stats really…

I completed the full ride in 2:42:36 with splits of 42:08, 40:08, 40:39 and last lap of 39:41  (found a good chain that got me most of the way back :D )

and the stats from my Garmin via Endomondo are:

And my best distances were:

Now, I think I’ll go have a bit of a rest :)

Time flies…

When you are writing code!

Got stuck into working on some long overdue tweaks to my mod player routines these last few days, finetune and BPM based timing.  Both of which I have now solved and implemented (although some additional tidying will be needed in the future!).  Unfortunately both require lookup tables at the moment due to the chunk of maths or brute force needed to compute the tables. It works, and if I figure a nicer solution I can always add it later.

Both of the additions only required less than 10 lines of RISC code each to make use of their LUTs, a little bit of buffer space and a bit of main RAM, and bish bash bosh things sound that little bit better.
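As an illustration of the sort of tables involved, here is a Python sketch of how they could be generated offline. It leans on the usual ProTracker conventions as I understand them (tick rate in Hz = BPM x 2 / 5, so 125 BPM gives the classic 50Hz, and finetune steps of an 8th of a semitone from -8 to +7). The function names, the BPM range and the 22050Hz output rate are assumptions for the example, not what the SoundEngine actually uses.

```python
def tick_rate_hz(bpm: int) -> float:
    """ProTracker style timing: 125 BPM corresponds to 50 ticks per second."""
    return bpm * 2 / 5


def build_bpm_table(sample_rate: int, lo: int = 32, hi: int = 255) -> dict:
    """Output samples to render per tick, for each BPM from lo to hi."""
    return {bpm: round(sample_rate / tick_rate_hz(bpm)) for bpm in range(lo, hi + 1)}


def build_finetune_table() -> dict:
    """Period multiplier per finetune step (positive finetune raises the pitch)."""
    return {ft: 2 ** (-ft / 96) for ft in range(-8, 8)}


bpm_table = build_bpm_table(22050)   # 22050 Hz is just an example output rate
finetune_table = build_finetune_table()
print(bpm_table[125], round(428 * finetune_table[1]))
```

With tables like these precomputed, the player only needs an index and a load (plus a multiply for the finetune case) at run time, which fits with needing just a handful of RISC instructions to use them.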

Still a few more effects I want to get finished off before I roll out this release of the SoundEngine however, but it is being worked on!  I have a chunk of time coming up where I should be doing a fair amount of work on it, so who knows, could be a new SE release in June! watch this space!.. well watch the U-235 website really :D

WHEEE!

I added another vehicle to the list of those I have managed to get ‘the backend out’ on/in :)  Especially as this will be the second one within a week!

Thought it might be fun to blog all the ones I have had go sideways (a little bit) when they were supposed to go straight.  So in no real order they are:

  • Front wheel drive car with use of handbrake..
  • Front wheel drive car (no use of handbrake!)
  • Rear wheel drive car
  • Mountain bike (YAY FUNS!)
  • Road bike
  • Road Tandem bike ( :D )

I think the most scary was the tandem, there was a brief moment of ‘oh crap, this is going over’ which was audibly the thoughts of the stoker from her reactions at the time too :D  It didn’t, we were awesome, and no, it wasn’t intended :)

 

ARGH bloody RISC bugs!

I don’t have much hair, and with the aid of the Atari Jaguar RISC CPUs I am bound to have less with fun bugs like these.  I keep hitting this one, forgetting about it for a few mind numbing minutes/hours/days and then remember it and spend more minutes/hours/days resolving it.  So I thought I would scribble it here…

I am not 100% certain if this is entirely the RISC CPU bug or MADMAC being a bit pants at alignment, but it is possible to generate ‘fun’ alignment issues in your code by simply adding or removing a nop (or other similar 16bit only instruction).  The instruction doesn’t need to be called even! that’s how much fun this is!

From what I can tell it seems that jump instructions are particularly fussy about where they are jumping to, so it is possible that by adding/removing a 16bit instruction you will move a jump destination into one of these ‘un-desired’ addresses and hey presto your RISC code suddenly stops working, or does something weird.  Even though you haven’t changed anything that should cause such behaviour.

So if you are playing with Jaguar RISC code and using jumps and sometimes it randomly seems to stop working but then work again, this could be the cause.  How do you find which jump is being affected? trial and error is my best method, or add a nop just after the code you modified (doesn’t need to be the executed code block), one at a time until your code magically starts working again :/

One day I’ll fully track this down and come up with some fix/work around.. probably :D  Until then, keep hacking!

Improvements

Popped out for an ‘easy’ ride with my lady yesterday.  I say ‘easy’, that’s her definition, her easy is more like my moderate, but I am improving at least.  Decided to try the Fallowfield loop out, to see how tandem friendly it would be, only looking to ride for an hour and a half.  This soon highlighted my own improvements to me, last time I rode this was a couple of months ago and riding its length and back again was pushing my abilities.  However this time, I found the 10 miles out rather easy, the route however was less tandem friendly than I hoped.  Stopping for lights on main roads is a lot more preferable than having to get off the bike to hoist it over gates etc.  Also playing in traffic (even on a tandem) whilst playing the ‘dodge the pothole’ game is a lot more fun and less risky than dodging kids/dogs and muppets wandering along cycle paths :)

After an hour and a half of cycling although a bit tired I still had some left in me, so I am certainly improving :)  I think I am going to have to do a fair bit more on my solo bike to get ready for the Great Manchester Cycle (I have entered myself for the 52 mile time limited version!!).. I think a few 50 mile routes will need to be completed.. I think it should be do-able however.  Will be interesting to see how I do against other riders.

Time flies!

Apparently! well it must, they have clocks on aeroplanes after all…

anyway, I really should try to scribble here more, so much has happened of late in so many different ways in my life and I didn’t blog any of it!! (although some of that would be a conscious decision to not blog it :) )

Coding wise, my work on the Sound Engine has progressed beyond my hopes! It isn’t complete and I have so many exciting ideas to add to it still, but there is a beta release out there!  Most exciting of all is that people are using it!  It’s an awesome buzz to see the logo for a project you have worked on stuck in someone else’s project, reading reviews where people are complimenting the music of that project and thinking “My code is playing that!” Brilliant.

Life wise has also been amazing, I have met an amazing woman (literally!)  and things are going better than I could have hoped in that regard also!  One of the many positives she has had on my life is rekindling my interest in cycling, so much so I am now cycling to work (not enough, but getting there), twice this week, would have been three times, except I was feeling pretty lazy this morning :) (and have a lot to do tonight). As well as cycling, I have started to take Skiing lessons, hoping to make use of my new skills with my new lady later in this year or early next year :)

So, in super summary, interesting stuff on the Jag, amazing woman in my life, lost a load of weight (20+Kg!), getting fitter, learning to ski, oh and embedded electronics and system development fun!  See a lot has happened..

Now I just need to try and manage all this and get some sleep in too!

Brocade fabric switch error (AD VF conflict)

Just had this error on an ISL link between two switches, my initial Googling didn’t yield too much of use so I thought I may as well blog it and the solution (I used) here, hopefully be of help to others.

The scenario: Connecting a new Brocade 5100 switch with no configuration to an existing Brocade 5000 based fabric that uses Administrative Domains (AD).  Both the new 5100 and the 5000 are running the same release of FabOS, the 5100 has nothing but the barest of bare configurations (its IP address and authentication credentials).  With the ports connected, a switchshow on the switches yields the following for the ISL port between them:

5100

LS Attributes:    [FID: 128, Base Switch: No, Default Switch: Yes, Address Mode 0]

Index Port Address Media Speed State     Proto
==============================================
  0   0   0a0000   id    N4   Online      FC  LS E-Port  segmented,(AD VF conflict)

5000

Address Mode:    0

Index Port Address Media Speed State     Proto
==============================================
  0   0   030000   id    N4   Online      FC  LS E-Port  segmented,(AD conflict)

I have included the last line of the information block as it gives a clue, the 5100 has additional lines here and more features, FID relates to Virtual Fabrics, a feature not supported on the 5000 in this case (I suspect not supported at all on the model but am not 100% sure of this).

The big clue for me is the “(AD VF conflict)”. A bit of Googling revealed that it is not permissible to use Virtual Fabrics and Administrative Domains together; I believe VF is the new AD, and hence they are mutually exclusive.  So the fix?

Simple, disable one or the other.  Now as the established fabric in this case is utilising AD, I am not going to disable that (as much as I would love to bin it :) ) so disabling the VF on the new 5100 is the order of the day.  This is thankfully simple enough to do.

Use the command

fosconfig --show

to list all configured VFs and remove all but the default ones.  If you haven’t configured any, you won’t have any.

fosconfig --disable vf

will prompt you to confirm you wish to disable VF, this will require a reboot of the switch you are on, and a full reboot it is too, it will stop passing frames.  Once the reboot has completed, the switch should come back up and happily merge with your existing fabric.

Hopefully this will be of use to someone else, and as with anything I write on here, use this information at your own risk, I am not going to accept responsibility if you wipe out your company’s fabric, or set fire to your gran doing this.

So, what happened to that sound engine?

Oops! yeah I kind of forgot to post anything here didn’t I!  My bad.  Well it certainly didn’t stop, in fact several versions of it have been released and the current revision is well underway.  It has been adopted as the Sound Engine of a rapid game development engine (RAPTOR) written by the group Reboot (which for me is a great honor!), and their continued support and help with its development has pushed it along in leaps and bounds.

The core of the engine has been fully converted to the RISC based DSP on the Jaguar, with only support setup functions being called by the 68000.  Improved code has reduced its size and increased its accuracy, as well as removing the need for any look-up tables at this time (I have a suspicion I may not be able to maintain this for full module playback compatibility.. time will tell).

The latest release being worked on (0.18) has a completely rewritten sound rendering core, reducing code size whilst increasing performance and efficiency and also tolerance for bus latency!  Working in systems as ‘limited’ as these gives you a far greater appreciation for the finite capabilities of the machine.  Modern machines have so much slack available to them in terms of bus speed and memory buffers that these considerations just don’t enter in.  In the case of the Atari Jaguar, with a single memory bus shared between 5 processors running at 25MHz, you have to take into account that what you ask for from main memory may not arrive for quite a while. Adding buffering to absorb these delays is an absolute requirement, unless you want horribly distorted sound.  Of course these buffers are limited to only a few KB as you need this limited RAM for your program code and variables.

So, progressing and with plenty of ideas in the pipeline.

New year, get on that bike

Not a New Year’s resolution, more an organic desire driven by the number of cycling related friends I have and watching the Revolution highlights on Monday.  Revolution wasn’t as bad as I thought but still not my thing. To give an example of what is more appealing to me (I don’t typically do spectator sports) I found two YouTube links, Street Descent in Brazil and Street Descent in Chile (the Chile one is my fave).  Neither of these are my riding style or skill level I must add! :D

But watching those clips stirred the long slumbering desire to ride my bike again!  As she has been unused so much since having my original bike stolen in 2006, which really killed my buzz, I have decided to give her a full service and shock maintenance.  Extra handy as Evans are only across the road and the ONLY authorized Scott shock maintainer in the UK (license to print money? at £95 a shock, I think so :( ).

Last night I started the bike service of 2012 and have removed the shocks ready to take over to Evans tonight after work, during the 2 weeks that should take before I get them back I am hoping to have a good go at cleaning up the rest of the bike, ready for riding in 2012 :D  Going to be uploading pics to the gallery here for anyone interested in pictures of a bike being tinkered with and cleaned :D

I am mostly pleased with my alternative solution to pad spreaders (which I forgot to buy when in Evans).. cone spanners & cable ties! :)

SoundEngine.. complete.. mostly!

It’s been quite a while since my last post and so much has been learnt and changed since then.  So much so that I have now completed the first release of my sound engine.  100% RISC assembly too! It has even received some very very nice compliments from my peers in the community, so I am pretty pleased with myself :)

Anyone interested in having a look (fully documented too! with example code!) it can be downloaded here

100% on test Mod!

At last! 100% of the way through my chosen starting module (Theme tune to Alf by Trash).  Although I am a little sick of hearing it now oddly enough it mostly sounds (to me) as it should do.  Other ears have heard some tuning issues, but only in some places and I suspect this may be due to the rather speedy coding of some of the effects.

As the code to read the actual pattern data was so bodged and crufty and incomplete (anything that used more than 15 samples was never going to work), I decided to completely rip it out and rewrite it, adding in hooks for various bits along the way and knowing now what I didn’t know when I 1st wrote it.

Took me longer than I planned originally, but the final result was certainly worth it!  Adding effects to this new build is super simple.  There are 15 primary effects, with a further 15 sub effects.  For now I am concentrating on the primaries.  I therefore have a list of addresses of the code to produce the desired effect.  To select the correct effect code the effect number is added as an offset to the address of this list, the program counter is then jumped to the address stored there (not before pushing the desired return address on the stack for the RTS at the end of the effect) and that’s it.  When I come to write a new chunk of effect code I simply write it and then update the list of pointers.  Job’s done.
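The same jump table idea, sketched in Python rather than 68000 assembly. The handler names and the channel dictionary are invented for this example; the effect numbers follow the common ProTracker numbering (1 = slide up, 2 = slide down, C = set volume), which is an assumption about the module format rather than a listing of the real player.

```python
def effect_none(channel: dict, param: int) -> None:
    """Placeholder for effects that are not written yet."""


def effect_slide_up(channel: dict, param: int) -> None:
    channel["period"] -= param      # a smaller period means a higher pitch


def effect_slide_down(channel: dict, param: int) -> None:
    channel["period"] += param


def effect_set_volume(channel: dict, param: int) -> None:
    channel["volume"] = min(param, 64)   # module volumes top out at 64


# The "list of addresses": one entry per primary effect number.
EFFECT_TABLE = [effect_none] * 16
EFFECT_TABLE[0x1] = effect_slide_up
EFFECT_TABLE[0x2] = effect_slide_down
EFFECT_TABLE[0xC] = effect_set_volume


def run_effect(channel: dict, effect_number: int, param: int) -> None:
    # Index into the table and "jump" to whatever handler is stored there.
    EFFECT_TABLE[effect_number & 0xF](channel, param)


channel = {"period": 428, "volume": 0}
run_effect(channel, 0x1, 4)
run_effect(channel, 0xC, 0x50)
run_effect(channel, 0x5, 1)   # not written yet, so nothing changes
print(channel)
```

Adding a new effect is then just writing one more handler and dropping it into its slot in the table, exactly as described above.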

I still have around another 21 effects to code, I am quite pleased with the pitch slide effects I have coded tonight despite me being suspicious that these may be the cause of the out of tune behaviour and also being incredibly simple :D

To further the work on the player I have a more complex effect rich module to work through now, one of my old favourites back in the day.  It has already identified and helped me resolve a few bugs I wasn’t aware of.  Really enjoying this project :)

Moar success!

I cannot believe how well this is going!  not only did my plan for volume adjustment work perfectly, I managed to write the whole thing in RISC on the DSP without any screwups!  Even managing to craft a procedure call with appropriate return 1st time!  Rather pleased with myself.

Some more cracks are starting to show in the less well planned code :) It is like building a wall on top of a badly done foundation, the more you build the more the foundation crumbles and needs bodging to keep up.  The re-write to RISC will be fresh at least, and all those lovely extra registers to play with should make this run even sweeter.

In addition to my volume success I finally twigged on the period values being used within the tracker files, and believe I have actually tuned this to play at the correct pitch now, making it sound even better!  I tried a different tune and this highlighted a load more bugs I need to fix (ignoring the additional effects it uses for now), and yet more crumbling code as well as illustrating a possible need for multiple format detection.
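For reference, the usual relationship between a tracker period value and playback pitch looks like the Python sketch below. It assumes the standard Amiga PAL clock constant (3546895), which is what module periods are normally defined against; the function names and the example output rate are made up here, and the player's own conversion on the Jaguar is not shown in this post.

```python
AMIGA_PAL_CLOCK = 3546895  # standard constant for PAL Amiga period values


def period_to_hz(period: int) -> float:
    """Sample playback rate in Hz for a given tracker period value."""
    if period <= 0:
        raise ValueError("period must be positive")
    return AMIGA_PAL_CLOCK / period


def step_per_output_sample(period: int, output_rate: int) -> float:
    """How far to advance through the sample data per output sample."""
    return period_to_hz(period) / output_rate


print(round(period_to_hz(428), 2))                     # period 428 is the usual C-2
print(round(step_per_output_sample(428, 22050), 4))    # example output rate only
```

Halving the period doubles the frequency, so each octave up uses half the period of the one below it.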

Going to finish up early tonight as I want more sleep and I also need to build a VTES deck for playtime tomorrow :)