I am working away on something I have always wanted to write: a tracker mod player. Many years ago I mentioned to Tyr that I wanted to write one that resided entirely within the DSP of the Jaguar, and that is still my plan. Of course, rather than try to learn both the RISC CPU and how to write a tracker player at the same time, I have decided to prototype the design on the 68K. This should be challenging enough. I am very pleased with my initial progress, although I have left it to fester for a while due to frustrations and having other things I need to do.
The main issue seems to be that I simply cannot squeeze sufficient time out of the 68K, which is crazy given it’s faster than the one in an Atari ST, and there are rather quick players on the ST. For some reason I have been struggling to push more than one channel past 8kHz, which is pretty naff (although I am most chuffed with the general code, it just needs more work and optimisation).
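To put a rough number on "sufficient time", here is a minimal Python sketch of the cycle budget between two playback interrupts. The clock figures are my assumptions (roughly 13.3MHz for the Jaguar's 68K and 8MHz for the ST), and the function names are made up for illustration. It ignores interrupt entry and exit overhead and anything else competing for the bus, so treat the result as a ceiling, not a measurement.

```python
# Rough cycle budget for a sample-playback interrupt.
# ASSUMPTION: both clock figures are approximate and are not taken from the post.
JAGUAR_68K_HZ = 13_295_000   # assumed clock of the Jaguar's 68000
ATARI_ST_68K_HZ = 8_000_000  # assumed clock of the Atari ST's 68000


def cycles_per_interrupt(cpu_hz: int, sample_rate_hz: int) -> int:
    """Whole CPU cycles available between two consecutive interrupts."""
    return cpu_hz // sample_rate_hz


def cycles_per_channel(cpu_hz: int, sample_rate_hz: int, channels: int) -> int:
    """Budget per channel if every channel is mixed inside each interrupt."""
    return cycles_per_interrupt(cpu_hz, sample_rate_hz) // channels


if __name__ == "__main__":
    for name, hz in (("Jaguar 68K", JAGUAR_68K_HZ), ("Atari ST", ATARI_ST_68K_HZ)):
        total = cycles_per_interrupt(hz, 8_000)
        each = cycles_per_channel(hz, 8_000, 4)
        print(f"{name}: {total} cycles per interrupt, {each} per channel (4 channels)")
```

On paper that leaves a few hundred cycles per channel for a four-channel mod at 8kHz, which is why falling over at one channel feels so wrong.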
This evening I decided to try out one of my ideas. I wondered whether the object processor, which I had been ignoring, had perhaps gone into a sulk and decided that if it wasn’t going to be played with it would steal some of the bus bandwidth instead. So I gave it something to do.. no difference.. bollocks…
I decided to try some other things before I reached “stop coding o’clock” (if I code past that time, I simply will not sleep! 🙂 ). So I decided to look at how much time my code was taking on the CPU, as well as see if I could fire the interrupt faster than I presently was doing. To achieve this I removed my playback routines from the interrupt handler and simply replaced them with 2 instructions: one to change the background colour to blue at the start and one to change it back to black at the end, with nothing between them. This produced some nice small blue lines on the screen. Perfect, I can now SEE how long it takes the CPU to perform these 2 simple tasks (longer than I thought too!).
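The length of a blue bar can be turned into a ballpark cycle count, since the raster moves at a fixed speed. The Python sketch below does that conversion. Both constants are assumptions on my part (a PAL-style scanline of about 64 microseconds and a 68K clock of about 13.3MHz), and the function names are hypothetical. Note that part of every scanline is blanking and never visible, so a bar measured against the visible picture width will come out as a slight underestimate.

```python
# Convert the length of a coloured raster bar into approximate CPU cycles.
# ASSUMPTIONS (not from the post): PAL-style line time and approximate CPU clock.
LINE_TIME_US = 64.0  # assumed duration of one full scanline, in microseconds
CPU_MHZ = 13.295     # assumed 68000 clock, in MHz (cycles per microsecond)


def bar_to_cycles(scanlines: float) -> float:
    """Cycles elapsed while the raster covers `scanlines` lines (fractions allowed)."""
    return scanlines * LINE_TIME_US * CPU_MHZ


def cycles_to_scanlines(cycles: float) -> float:
    """Inverse of bar_to_cycles: how much raster a given cycle count paints."""
    return cycles / (LINE_TIME_US * CPU_MHZ)


if __name__ == "__main__":
    # A bar a quarter of a line long, and one wrapping over three whole lines.
    for lines in (0.25, 3.0):
        print(f"{lines} scanlines is roughly {bar_to_cycles(lines):.0f} cycles")
```

Under these assumptions, once the blue starts wrapping across whole scanlines the handler is eating the better part of a thousand cycles per line it covers.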
For my next trick I re-introduced my playback code, but crippled it down to just the portion of the routine which checks for new samples to play, and re-ran it. There was now a LOT of blue on the screen, although the amount of code was quite small. Hmmm, clearly the code is less than optimal time-wise; naturally, as I am coding a prototype, I have not been as stringent as I would be for a final version.
My thoughts at this stage are that I am asking the 68K to access main RAM too much. As it has no instruction cache of its own, its instructions and most of the data have to be dragged from main RAM, and in the case of 32-bit data that alone will require 2 fetches. To test the effect of reads from main RAM I removed my playback routine from the interrupt handler again and replaced it with 10 NOP instructions. Running the test again, I got a slightly longer line than with nothing between the colour changes.. OK. I then swapped the NOPs for ten 16-bit memory reads and the amount of blue increased GREATLY; it is hard to tell, but instead of being only about 6cm of blue it now seems to wrap across whole scanlines! Changing these to ten 16-bit moves from one internal data register to another reduces the line back down to a much nicer 8cm or so…
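The reasoning about fetches can be sketched as a simple count of bus accesses per instruction, given that the 68000 moves 16 bits at a time over its external data bus. The Python below is only an illustrative model: the `Instr` class and the example instructions are mine, and I have assumed register-indirect addressing for the memory reads (an absolute address would add extension words to fetch). It counts accesses only and says nothing about how long each one takes on a bus shared with the rest of the Jaguar.

```python
# Count 16-bit bus accesses for a few simple 68000 instructions.
# Illustrative model only: it ignores prefetch and any contention on the bus.
from dataclasses import dataclass

BUS_WIDTH_BITS = 16  # the 68000's external data bus is 16 bits wide


@dataclass(frozen=True)
class Instr:
    name: str
    opcode_words: int    # 16-bit words fetched for the instruction itself
    data_bits_read: int  # operand bits read from memory (0 if register-only)

    def bus_accesses(self) -> int:
        # Ceiling division: a 32-bit operand needs two 16-bit reads.
        data_reads = -(-self.data_bits_read // BUS_WIDTH_BITS)
        return self.opcode_words + data_reads


NOP = Instr("nop", 1, 0)
MOVE_REG = Instr("move.w d0,d1", 1, 0)
MOVE_W_MEM = Instr("move.w (a0),d0", 1, 16)  # assumed addressing mode
MOVE_L_MEM = Instr("move.l (a0),d0", 1, 32)  # assumed addressing mode


def block_accesses(instr: Instr, count: int = 10) -> int:
    """Total bus accesses for `count` copies of the same instruction."""
    return instr.bus_accesses() * count


if __name__ == "__main__":
    for instr in (NOP, MOVE_REG, MOVE_W_MEM, MOVE_L_MEM):
        print(f"10 x {instr.name}: {block_accesses(instr)} bus accesses")
```

By this count alone ten word reads should only double the bus traffic of ten NOPs, so the size of the jump I saw suggests each access to main RAM may be costing far more than the count implies.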
So.. it looks like I am going to have to figure out a way of doing this without the simplicity and comfort main RAM was giving me. Good job I do this for the fun of the challenge really 🙂