This is an idea I had and wanted to test with a game, but my hands are full with ProjectABE and the Bitsy port, so I’ll document it here in case somebody else wants to give it a shot.
According to the M0+ TRM, the following regions of the address map are marked as executable:
0x60000000 - 0x9FFFFFFF // nothing?
0x20000000 - 0x3FFFFFFF // 2 KB at SRAM1 + 2 KB at USB SRAM
0x00000000 - 0x1FFFFFFF // flash + 32 KB at SRAM0
In theory, one can simply write Thumb code into SRAM0 and call it through a function pointer. The problem is that SRAM0 is where the heap, the framebuffer and the stack live, so there’s not much space left for code there.
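To make the function pointer part concrete, here is a minimal sketch. The names `RamFunction`, `thumbEntryAddress` and `installInRam` are made up for illustration, and it assumes the copied code is position-independent. The one detail that is easy to miss is that a Thumb-only core like the M0+ needs bit 0 of the branch target set.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical signature for a routine that lives in RAM.
using RamFunction = int (*)(int);

// Example payload (Thumb, little-endian): lsls r0, r0, #1 ; bx lr
// i.e. a function that doubles its argument.
const std::uint8_t kDoubleIt[4] = {0x40, 0x00, 0x70, 0x47};

// A Thumb-only core requires bit 0 of an indirect branch target to be set.
// Returns the address to branch to for code whose first instruction is at `base`.
inline std::uintptr_t thumbEntryAddress(std::uintptr_t base) {
    return base | 1u;
}

// Copies `size` bytes of machine code into `dest` (which must be at least
// 2-byte aligned) and returns a pointer that can be called on the device.
inline RamFunction installInRam(void* dest, const void* code, std::size_t size) {
    std::memcpy(dest, code, size);
    const std::uintptr_t entry =
        thumbEntryAddress(reinterpret_cast<std::uintptr_t>(dest));
    return reinterpret_cast<RamFunction>(entry);
}
```

On the device, the pointer returned by `installInRam(buffer, kDoubleIt, sizeof kDoubleIt)` would be called like any other function. On a PC the copy and the address arithmetic still work, but the pointer must not be called, since the bytes are Thumb instructions.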
But what about SRAM1 and USB SRAM? As far as I can tell, both are disabled by the PokittoLib in begin(), but they can be enabled easily enough.
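As a rough sketch of what "enabling" would look like: I'm assuming here that the chip is the LPC11U68 and that the two RAM blocks are clock-gated by bits 26 (RAM1) and 27 (USB SRAM) of the SYSAHBCLKCTRL register. Those bit positions are my reading of the LPC11U6x headers and should be checked against the user manual. The helper name `enableExtraSram` is made up.

```cpp
#include <cstdint>

// Assumed bit positions in SYSAHBCLKCTRL on the LPC11U6x.
// Verify against the user manual before relying on them.
constexpr std::uint32_t kClockRam1   = 1u << 26;  // SRAM1 (assumption)
constexpr std::uint32_t kClockUsbRam = 1u << 27;  // USB SRAM (assumption)

// Sets both clock-enable bits and leaves every other bit untouched.
// The register is passed in by reference so the logic can be tried on a PC.
inline void enableExtraSram(volatile std::uint32_t& sysAhbClkCtrl) {
    sysAhbClkCtrl |= kClockRam1 | kClockUsbRam;
}
```

On the device this would presumably be called as `enableExtraSram(LPC_SYSCON->SYSAHBCLKCTRL);` after begin(), assuming the usual CMSIS-style header is in scope.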
What if we were to compile small modules for things like AI and swap them into SRAM1 and USB SRAM as needed? By patching the mbed linker script, I think it’d be possible to make GCC emit code that runs in those regions while sharing the main program’s stack and heap. To avoid code duplication, we’d pass each module a struct of function pointers to things provided by the game’s engine, plus malloc/free/etc.
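The struct of function pointers could look something like the sketch below. Everything in it (`EngineApi`, `ModuleEntry`, `wanderModule` and the individual fields) is a hypothetical example, not an existing PokittoLib interface. The module never links against the engine; it only calls through the table it is handed.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical table of services the engine hands to every module.
struct EngineApi {
    void* (*alloc)(std::size_t size);   // backed by the engine's malloc
    void  (*release)(void* ptr);        // backed by the engine's free
    int   (*random)(int maxExclusive);  // maxExclusive must be > 0
};

// Hypothetical entry point that every module would expose.
using ModuleEntry = int (*)(const EngineApi* api, void* npcState);

// Engine-side implementations that end up in the table.
inline void* engineAlloc(std::size_t size) { return std::malloc(size); }
inline void  engineRelease(void* ptr)      { std::free(ptr); }
inline int   engineRandom(int maxExclusive) { return std::rand() % maxExclusive; }

inline EngineApi makeEngineApi() {
    return EngineApi{engineAlloc, engineRelease, engineRandom};
}

// Stand-in for a module body: picks one of four directions and stores it
// in the NPC state. It touches the heap only through the table.
inline int wanderModule(const EngineApi* api, void* npcState) {
    int* scratch = static_cast<int*>(api->alloc(sizeof(int)));
    if (scratch == nullptr) return -1;
    *scratch = api->random(4);
    const int direction = *scratch;
    api->release(scratch);
    *static_cast<int*>(npcState) = direction;
    return direction;
}
```

Here `wanderModule` is compiled into the same program just to show the calling convention. In the real scheme it would be built separately and reached through a `ModuleEntry` pointer into SRAM1.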
For an RPG, the main engine would go in flash and the AI for each NPC could go in a separate DLL loaded into SRAM1. NPC states could be stored in USB SRAM. That leaves all of SRAM0 to the game engine. If necessary, this can be broken down further: each state of a particular NPC can be put into a separate DLL, allowing for any number of NPCs with arbitrarily complex behaviors. The engine would provide support functions (like, say, A*), and an NPC’s AI would only need to be loaded and executed when a decision has to be made (done walking, what now?). On average, swapping would occur less than once per frame, even with multiple NPCs on-screen.
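The swapping logic itself can stay tiny. Below is a sketch of a single module slot that only copies when a different module is requested, which is what keeps the swap rate low. `ModuleImage` and `ModuleSlot` are hypothetical names, the 2 KB capacity mirrors the size of SRAM1, and a plain array stands in for the real memory region.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical description of one module image, e.g. read from the SD card.
struct ModuleImage {
    int id;                     // unique per module, never -1
    const std::uint8_t* bytes;  // raw code to copy into the slot
    std::size_t size;
};

// One slot standing in for SRAM1. At most one module is resident at a time.
class ModuleSlot {
public:
    static constexpr std::size_t kCapacity = 2048;

    // Makes sure `image` is resident. Copies only if a different module is
    // currently loaded. Returns false if the image does not fit.
    bool ensureLoaded(const ModuleImage& image) {
        if (image.size > kCapacity) return false;
        if (image.id == residentId_) return true;  // already there, no swap
        std::memcpy(storage_, image.bytes, image.size);
        residentId_ = image.id;
        ++swapCount_;
        return true;
    }

    int residentId() const { return residentId_; }
    unsigned swapCount() const { return swapCount_; }
    const std::uint8_t* data() const { return storage_; }

private:
    alignas(4) std::uint8_t storage_[kCapacity] = {};
    int residentId_ = -1;      // -1 means the slot is empty
    unsigned swapCount_ = 0;   // number of copies actually performed
};
```

The engine would call `ensureLoaded` right before asking an NPC for a decision. Consecutive decisions by NPCs that share a module then cost no copy at all.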
Compared to having an interpreter, this approach has a few advantages: loading a DLL is simply a matter of reading a file into memory, and a module might actually run faster than code in flash (not sure; I vaguely recall an article saying that executing from flash is slower than executing from SRAM).