Extensible Loader

As could be seen in the previous thread, there are many different and useful things that the Loader could do: Flash python scripts, flash GB ROMs directly, show thumbnails, long file names, author names, licences, etc.
The only problem is space. Whatever space the loader takes, it takes away from games.
So a custom loader should…

  • take up less space
  • do more stuff
  • be fully compatible with the current bins and loader
  • be easy to install and revert to the vanilla loader
  • be quick, simple, fancy, look good, bake a cake…

This is still just a bunch of concepts I’m experimenting with, but here’s what I’ve been thinking about.

When activated, the loader will execute “boot.pex” from the SD card.
PEX being a “Pokitto EXecutable”. That’s about it, really.

Now for a slightly more detailed overview…

Pokitto Executable/DLL File Format - (.POK? .PEX? .PDL?)

The file format is composed of a series of sections. In general, they are all optional, can be in any order, and must start with a 4-byte type id and a 4-byte length.

Section Types:

  • 0 - Padding. Used for aligning the following section to a 512-byte boundary.
  • 1 - Index. Acts as a directory for quickly finding other sections without having to scan the entire file. Should be the first section, ideally.
  • 2 - Code Offset Address. Defines what address the following Code section should be copied to.
  • 0x10008000 - Code. Only section that has no length after the type id, therefore must be last.
  • 3 - Offset & Code. Contains offset address and code. Can appear multiple times.
  • 4 - Library Entry Point. Libraries have no NVIC, the Entry Point should return a struct with function pointers.
  • 5 - Code CRC. Used to detect if the game is already in Flash.
  • 6 - HAT compatibility [hat name][flag]. Where flag indicates: 0-incompatible, 1-compatible/optional, 2-required (won’t work without the hat)
  • 7 - long name (text)
  • 8 - author (text)
  • 9 - description (text)
  • 10 - screenshot image, 110x88, 2bpp
  • 11 - title logo image, 100x20, 2bpp
  • 12 - icon image, 24x24, 2bpp
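
To make the layout above concrete, here is a rough sketch of how a parser could walk the sections. It is only an illustration: the names (`PexSection`, `pex_next_section`, `pex_find`) are made up, and it assumes the fields are little-endian and that the 4-byte length counts payload bytes only, neither of which has been decided. The Code id 0x10008000 presumably matches the first word of a regular bin, which would explain why that section carries no length.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Section type ids taken from the list above (only the ones used here). */
enum {
    PEX_PADDING   = 0,
    PEX_INDEX     = 1,
    PEX_LONG_NAME = 7,
    PEX_CODE      = 0x10008000  /* no length field, runs to end of file */
};

typedef struct {
    uint32_t type;
    uint32_t length;      /* payload size in bytes */
    const uint8_t *data;  /* points into the file buffer */
} PexSection;

/* ASSUMPTION: fields are stored little-endian. */
static uint32_t pex_read_u32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Reads the section at *offset and advances *offset past it.
   Returns 1 on success, 0 at end of file or on a truncated section. */
static int pex_next_section(const uint8_t *file, size_t size,
                            size_t *offset, PexSection *out)
{
    assert(file != NULL && offset != NULL && out != NULL);
    size_t pos = *offset;
    if (pos > size || size - pos < 4) return 0;

    out->type = pex_read_u32(file + pos);
    pos += 4;

    if (out->type == PEX_CODE) {          /* everything left is code */
        out->length = (uint32_t)(size - pos);
        out->data = file + pos;
        *offset = size;
        return 1;
    }

    if (size - pos < 4) return 0;
    out->length = pex_read_u32(file + pos);
    pos += 4;
    if (size - pos < out->length) return 0;
    out->data = file + pos;
    *offset = pos + out->length;
    return 1;
}

/* Linear scan for the first section of a given type
   (what a parser would do when there is no Index section). */
static int pex_find(const uint8_t *file, size_t size,
                    uint32_t type, PexSection *out)
{
    size_t offset = 0;
    while (pex_next_section(file, size, &offset, out))
        if (out->type == type) return 1;
    return 0;
}
```

With an Index section present, `pex_find` would jump straight to the right offset instead of scanning.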

The Loader

In order to free up flash, it is desirable that the loader be as small as possible. I’d like to reduce it to the bare minimum: system initialization, FS API, PEX parser API. Much like the ROM APIs built-in to the processor, the FS and PEX APIs will be made available to games.
This means we won’t need to have two PFFS implementations (the loader’s and the game’s) in flash all the time. Games in the PEX format can safely assume the FS API will be available, since they can’t be loaded using the Flash button.

After basic initialization, the loader is capable of only one thing: loading boot.pex. If you rename your game to that it will get copied to flash as soon as you “Press C for Loader”, but that’s not very useful. Unlike BINs, PEX files can specify exactly where code should be loaded. Not just flash, code can be loaded to RAM (almost instantly). If you enter the loader then reset the Pokitto, the last game will still be there. This also allows for code to be loaded in run-time like a DLL (.PDL?).
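
As a sketch of the "load anywhere" idea, the loader could classify each section's target address before deciding whether to program flash or just copy into RAM. The memory map below is an assumption based on my reading of the LPC11U68 (256 KB flash at 0, 32 KB SRAM0, 2 KB SRAM1, 2 KB SRAM2) and should be checked against the user manual; `classify_load` and the other names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { REGION_INVALID, REGION_FLASH, REGION_RAM } Region;

typedef struct {
    uint32_t start;
    uint32_t size;
    Region kind;
} MemRange;

/* ASSUMPTION: LPC11U68 memory map, not taken from the thread. */
static const MemRange memory_map[] = {
    { 0x00000000u, 256u * 1024u, REGION_FLASH }, /* on-chip flash */
    { 0x10000000u,  32u * 1024u, REGION_RAM   }, /* SRAM0 */
    { 0x20000000u,   2u * 1024u, REGION_RAM   }, /* SRAM1 */
    { 0x20004000u,   2u * 1024u, REGION_RAM   }, /* SRAM2 */
};

/* Returns where a block of `length` bytes at `address` would land,
   or REGION_INVALID if it does not fit entirely inside one region. */
static Region classify_load(uint32_t address, uint32_t length)
{
    assert(length > 0); /* an empty section has nothing to load */
    for (size_t i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        const MemRange *r = &memory_map[i];
        if (address < r->start) continue;
        uint32_t offset = address - r->start;
        if (offset < r->size && length <= r->size - offset)
            return r->kind;
    }
    return REGION_INVALID;
}
```

A RAM result means a plain copy is enough, which is why loading there is almost instant; a flash result means going through the slower flash programming path.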

Boot.pex will use the FS API present in flash to scan (a directory in) the SD card to find plugins in the form of PDLs. These would do the actual loading. It would then be possible to open various different file types: bmp, txt, bin, hex, py, gb, movies…
In the case of GB files, for example, the parser would first copy the emulator (if it isn’t already present) and then copy the GB ROM into it. No online converter needed. I imagine something similar would work with Python. A BMP/Movie loader wouldn’t actually flash anything, it would just show the BMP.

Some details are intentionally vague since it’s too early to know for sure. Others, because I’m still trying to figure them out, like how to draw the UI in a plugin or the bit about baking cake.


cool name but wouldn’t that be confusing with the PEX extension header (IO pins)?

anyway the thing you want to do seems like a single file to parse and load
I know PFFS doesn’t support it but wouldn’t an “app container” (a .pok folder with the bin, images, save files, and your descriptor file) be easier?

that descriptor could just have a flag for which utility it wants to use if it’s not a binary
it would be cleaner if it also had a utility folder

this loader you suggest does mean you have to have an SD card to boot the device

Possibly… Nothing is set in stone yet, it’s just a “working name” until somebody thinks of something better.

Heh, sounds like something Apple would do. :stuck_out_tongue:
It’s certainly possible to use a folder to store everything. It’s just a different kind of container. I feel that using a single file would be cleaner, easier to distribute and less error-prone (it’s harder for the end user to mess up a single file than a directory).

So, for GB ROMs, each one would need to be in an individual folder with its own descriptor, each pointing to the emulator in the utility folder. Wouldn’t it be easier for the end user if the loader could be extended to support custom files?

You’d need an sd card to run the loader, which only makes sense with a card anyway. You’d be able to boot the game normally without a card.

EDIT: Another thing I want to do with this idea is to move all the UI code out of the loader to free up space and to allow different UIs (skins).

yeah, I mean I made pokitto.app for the Mac so don’t sue me XD it works and is a clean look for the user, and it also has the potential to contain the file IO a little, since the developer owns that folder and can create whatever “save.txt” file they want without accidentally overwriting someone else’s

the thing then is the loader should know all the file types and that may grow in time, just having the name of the binary that needs to be preloaded would make more sense in my opinion, you might have multiple visual novel engines and the game code might both be a txt file or something, this could

I definitely agree with that, I would even like to see the Pokitto logo just be loaded from the SD card rather than always being on flash, though I think jonne won’t like that much

one thing that comes to mind for me would be a Pokitto Interpreted Language that’s super lightweight to parse.
I have talked about this a couple of times before and had been experimenting with it (I was calling it .PIL) but I never got anything great working, though @Pharap always said stuff about bytecode interpreters
it could all be streamed in with PFFS, and with maybe only one draw command (directdraw) it could make the file navigator and more

That’s something I’m trying to avoid. I want to put loaders for all these extra file types on the SD card as separate plugins. The result could be something like this:

Each icon in the horizontal line would be a loader plugin. You can have an arbitrary amount of plugins. Each expands into a list of files that it can open. You can have more than one loader capable of reading a certain file type.

This could be done by an interpreter, as you suggested, but it would take up far more space in flash than simply loading native code.


do you know of a way to execute from RAM?
my concern is if each time you start the device you’re overwriting the flash multiple times (overwriting the bootloader, overwriting with the file system, overwriting with the emulator or game)


I think having a different extension (.pex is fine by me or .pop or .pog - short for pokitto program or pokitto game, or whatever) is the right way to go

As soon as I saw the new topic I was kind of expecting to see a working bin here … but I know we have all been spoilt by your skillz so far :wink:


This sounds great!

A good idea.

I am not sure if I understand your main idea fully, but this is how I think it works:

  1. Boot.pex shows the list of files. It can show icon, screenshot, etc. too if the file is *.pex, right(?). For other files, e.g. *.py, *.bmp, it just shows the default icon, if anything.

  2. When you select a list item that is a pex file, Boot.pex flashes it and does a reset(?). After that the pex file can be run. Normal Pokitto binary files can be converted to pex files on a PC, and put on the SD card.

  3. When you select a list item that is e.g. a bmp file, Boot.pex searches for the PDL that can handle the file, flashes it and the bmp file to ROM (no reset needed?), and gives control to the PDL program. The PDL then shows the bmp image.

Does this sound about correct?

MicroPython can either load a py or mpy(bytecode) file to ram, or use frozen rom file (bytecode). The frozen rom file can be executed in-place, no need to load it to ram.

In theory, just copy code to RAM and call it. I’m still experimenting with this, the tricky part is making useful code that fits strict size limits and doesn’t clash with other libraries (bss sections can’t overlap). Right now I’m making a screen mode library that fits in SRAM1 (2kb).

The flash will be written to less than it is now. The only time it writes to flash is once you pick a game. If you select a GB ROM, for example, it will check if the emulator is already in flash before copying it. In that case it would only copy the ROM itself. Once that is done, it will reset and it won’t write to flash again.
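
The "is it already in flash?" check could look something like the sketch below. The thread does not say which CRC the Code CRC section uses, so this assumes the common CRC-32 (reflected polynomial 0xEDB88320, as in zip/PNG); `already_flashed` is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320.
   ASSUMPTION: the actual CRC variant for PEX files is undecided. */
static uint32_t crc32_bytes(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return ~crc;
}

/* Returns 1 when the bytes currently at `flash_region` already match the
   CRC stored in the file, so the flash write can be skipped. */
static int already_flashed(const uint8_t *flash_region, size_t image_len,
                           uint32_t expected_crc)
{
    return crc32_bytes(flash_region, image_len) == expected_crc;
}
```

A table-driven CRC would be faster, but the bitwise version costs almost nothing in flash, which is the scarce resource here.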

Hehe, I think I’ll go with .pop.

It sure is weird making a topic with a bunch of unproven concepts and no bin to show. I just wanted to make sure this is something the community would like before investing too much effort into it.


  • boot.pex will search for the pdls it can use and it lets the user select one. So the first thing you see will be a menu like this: [Python] [Gameboy] [Games] [Text] [Images]. Each being a separate pdl.
  • Once you select a loader, boot.pex will load that PDL into RAM and search for files that it can read.
  • For each file, boot.pex will request an icon/name/screenshot from the selected loader. If there is none, the loader can return a default.
  • Once you select a file, boot.pex will request the loader PDL to load it.
    – In the case of a BMP, I was thinking of not writing anything to flash here. The Image.pdl would simply display the image instead.
    – when you select a PEX file, game.pdl flashes it and resets. There is no special treatment for pex files in boot.pex, they’re loaded by a pdl just like the other files. Yes, pex files are made in a PC and copied to the SD. A regular bin is a valid pex, though the inverse might not be true.
    – when you select an MPY, both it and micropython are copied to flash, then it resets.
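
The steps above could map onto a plugin interface along these lines: the PDL's entry point returns a struct of function pointers and boot.pex only ever talks to that struct. Every name here (`LoaderPlugin`, `PluginEntryPoint`, `list_openable`, the image plugin) is made up for illustration, and the real interface would also need icon and screenshot callbacks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* What a loader PDL hands back from its Library Entry Point (hypothetical). */
typedef struct {
    const char *menu_name;                              /* e.g. "Images" */
    int (*can_open)(const char *filename);              /* 1 if supported */
    const char *(*display_name)(const char *filename);  /* may be a default */
    int (*load)(const char *filename);                  /* 0 on success */
} LoaderPlugin;

typedef const LoaderPlugin *(*PluginEntryPoint)(void);

/* Case-sensitive extension check, kept simple for the sketch. */
static int has_extension(const char *filename, const char *ext)
{
    const char *dot = strrchr(filename, '.');
    return dot != NULL && strcmp(dot + 1, ext) == 0;
}

/* Example plugin: an image viewer that never writes to flash. */
static int image_can_open(const char *f) { return has_extension(f, "bmp"); }
static const char *image_display_name(const char *f) { return f; }
static int image_load(const char *f)
{
    (void)f;   /* a real plugin would draw the BMP to the screen here */
    return 0;
}

static const LoaderPlugin image_plugin = {
    "Images", image_can_open, image_display_name, image_load
};

static const LoaderPlugin *image_entry(void) { return &image_plugin; }

/* boot.pex side: ask the selected plugin which files it can open and
   collect their display names. Returns the number of entries written. */
static size_t list_openable(PluginEntryPoint entry,
                            const char *const *files, size_t file_count,
                            const char **names_out)
{
    assert(entry != NULL);
    const LoaderPlugin *plugin = entry();
    size_t count = 0;
    for (size_t i = 0; i < file_count; i++)
        if (plugin->can_open(files[i]))
            names_out[count++] = plugin->display_name(files[i]);
    return count;
}
```

Since boot.pex never looks inside the files itself, adding a new file type means dropping one more PDL on the card.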

MicroPython can’t run source code in rom? :thinking:

can you clarify what you mean by that? in theory you hardly need any RAM for a screen mode if you draw directly to the screen from the SD card (direct 4-color bitmap for fonts and sprites), OK maybe a small amount for the PFFS buffering stuff
(or just go with 565 sprites (16 bits per pixel), which would be slower but more colorful since it could show all preview images in their games’ palettes)

maybe unrelated, but on the interpreted side could you do anything practical with something like a brainfuck thing? that would be very small on the flash?

   #include <unistd.h>   /* read() and write() */
    char m[9999], *n[99], *r = m, *p = m + 5000, **s = n, d, c;
    int main(void)
    {
       for (read(0, r, 4000); c = *r; r++)
              c - ']' || (d > 1 || 
              (r = *p ? *s : (--s, r)), !d || d--), c - '[' || d++ ||
              (*++s = r), d || (*p += c == '+', *p -= c == '-', p += c == '>', 
              p -= c == '<', c - '.' || write(2, p, 1), c - ',' || read(2, p, 1));
       return 0;
    }

Like I said, I’m experimenting and nothing is set in stone yet. I haven’t decided on which screen mode we’ll actually use. We can use direct mode, but that limits what we can do with animations. I’ll let someone better with art propose a UI mockup before deciding anything.

I'm actually testing a bytecode interpreter, but it's not meant to be turing-complete.
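.code 16
.syntax unified

.global PVCOPY
.func PVCOPY
PVCOPY:				// r0 = pointer to the bytecode
	push {r4, lr}
	ldr r4, =jmptbl
	movs r2, r0
next:
	movs r3, #3
	ldm r2!, {r0, r1}	// r0 = target + opcode, r1 = value
	cmp r0, #0
	beq exit		// a zero word ends the program
	ands r3, r0		// opcode = low two bits of r0
	lsls r3, #2
	ldr r3, [r4, r3]
	bx r3
exit:
	pop {r4, pc}
PVCOPYSET:			// opcode 0: *r0 = r1
	str r1, [r0]
	b next
PVCOPYFUN:			// opcode 1: call r0(r1), bit 0 doubles as the Thumb bit
	movs r3, r0
	movs r0, r1
	push {r2, r4}
	blx r3
	pop {r2, r4}
	b next

PVCOPYOR:			// opcode 2: *r0 |= r1
	lsrs r0, #2
	lsls r0, #2
	ldr r3, [r0]
	orrs r3, r1
	str r3, [r0]
	b next
PVCOPYAND:			// opcode 3: *r0 &= r1
	lsrs r0, #2
	lsls r0, #2
	ldr r3, [r0]
	ands r3, r1
	str r3, [r0]
	b next

jmptbl:
	.word PVCOPYSET+1
	.word PVCOPYFUN+1
	.word PVCOPYOR+1
	.word PVCOPYAND+1
.endfunc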
It runs code that looks like assembly.
.macro SET REG:req, VAL:req
.word \REG
.word \VAL
.endm

.macro CALL REG:req, VAL:req
.word (\REG)+1
.word \VAL
.endm

.macro OR REG:req, VAL:req
.word \REG+2
.word \VAL
.endm

.macro AND REG:req, VAL:req
.word \REG+3
.word \VAL
.endm

.macro END
.word 0
.endm
.align 4
	// system_LPC11U6x.c: 487

	// 488
	// 494, 495
	SET PIO2_0, 1
	SET PIO2_1, 1

	// 497 - Redundant? Default is already 0.

	// 498

	WAIT_MS 250

	// 507

	// 508

	// 511

	// 522

	// 523

	// 522

	// 525

	// 528

	// 529
	// 532

	// 534

	// 556, 559
	AND LPC_PDRUNCFG, ~(1<<10) & ~(1<<8)

	// 560

	// 561

	// 564

	// 566

	// 567

	// 569

	// 573

	// spi_api.c:77
	// spi_api.c:79

	// spi_api.c:89
	SET PIO0_9, 0x81
	SET PIO0_8, 0x81
	SET PIO0_6, 0x82

// SPI.cpp
	// disable ssp, set lbm, ms, sod, divider to 0
	AND LPC_SPI0_CR1, ~(1<<1) & ~0xD & ~0xFF00

	SET LPC_SPI0_CR0, 0x7

	// set prescaler to 0. spi_frequency:151
	OR LPC_SPI0_CR1, (1<<1) // enable ssp
// end SPI.cpp
	SET PIO1_31, 0x80

	// #define LCD_CD_PORT           0
	// #define LCD_CD_PIN            2
	// GPIO_DIR[0]:
	SET LPC_GPIO_PORT_DIR0, 0x00000004

	// #define LCD_WR_PORT           1
	// #define LCD_WR_PIN            12
	// #define LCD_RD_PORT           1
	// #define LCD_RD_PIN            24
	// #define LCD_RES_PORT          1
	// #define LCD_RES_PIN           0	
	// GPIO_DIR[1]:
	SET LPC_GPIO_PORT_DIR1, 0x01001001
	// LPC_GPIO_PORT->DIR[2] |= (1  << 2 );
	// LPC_GPIO_PORT->DIR[2] |= (0xFFFF  << 3);  // P2_3...P2_18 as output
	// GPIO_DIR[2]:

	// #define LCD_RES_PORT          1
	// #define LCD_RES_PIN           0
	// LPC_GPIO_PORT->SET[2] = 1 << 2; // backlight
	// LCD initialization

	// Reset LCD
	// (LCD_RD high = write)
	// driver output control, this also affects direction
	LCD_CMD 0x01,0x11C

	// originally: 0x11C 100011100 SS,NL4,NL3,NL2
        // NL4...0 is the number of scan lines to drive the screen !!!
        // so 11100 is 1c = 220 lines, correct
        // test 1: 0x1C 11100 SS=0,NL4,NL3,NL2 -> no effect
        // test 2: 0x31C 1100011100 GS=1,SS=1,NL4,NL3,NL2 -> no effect
        // test 3: 0x51C 10100011100 SM=1,GS=0,SS=1,NL4,NL3,NL2 -> no effect
        // test 4: 0x71C SM=1,GS=1,SS=1,NL4,NL3,NL2
        // test 5: 0x
        // seems to have no effect... is this perhaps only for RGB mode ?

	// LCD driving control
	LCD_CMD 0x02,0x0100
	// INV = 1
	// Entry mode... lets try if this affects the direction
	LCD_CMD 0x03,0x1038
	// originally 0x1030 1000000110000 BGR,ID1,ID0
        // test 1: 0x1038 1000000111000 BGR,ID1,ID0,AM=1 ->drawing DRAM horizontally
        // test 4: am=1, id0=0, id1=0, 1000000001000,0x1008 -> same as above, but flipped on long
        // test 2: am=0, id0=0, 1000000100000, 0x1020 -> flipped on long axis
        // test 3: am=0, id1=0, 1000000010000, 0x1010 -> picture flowed over back to screen

	// Display control 2
	LCD_CMD 0x08,0x0808 // 100000001000 FP2,BP2
	// RGB display interface
	LCD_CMD 0x0C,0x0000 // all off
	// Frame marker position
	LCD_CMD 0x0F,0x0001 // OSC_EN
	// Horizontal DRAM Address
	LCD_CMD 0x20,0x0000

	// Vertical DRAM Address
	LCD_CMD 0x21,0x0000
// *************Power On sequence ****************
	LCD_CMD 0x10,0x0000
	LCD_CMD 0x11,0x1000
//------------------------ Set GRAM area --------------------------------
	// Gate scan position
	LCD_CMD 0x30,0x0000 // if GS=0, 00h=G1, else 00h=G220
	// Vertical scroll control
	LCD_CMD 0x31,0x00DB // scroll start line 11011011 = 219
	// Vertical scroll control
	LCD_CMD 0x32,0x0000 // scroll end line 0
	// Vertical scroll control
	LCD_CMD 0x33,0x0000 // 0=vertical scroll disabled
	// Partial screen driving control
	LCD_CMD 0x34,0x00DB // db = full screen (end)
	// partial screen
	LCD_CMD 0x35,0x0000 // 0 = start
	// Horizontal and vertical RAM position
	LCD_CMD 0x36,0x00AF //end address 175
	LCD_CMD 0x37,0x0000
	LCD_CMD 0x38,0x00DB //end address 219

	LCD_CMD 0x39,0x0000 // start address 0
	// start gamma register control
	LCD_CMD 0xff,0x0003
// ----------- Adjust the Gamma  Curve ----------//
	LCD_CMD 0x50,0x0203	
	LCD_CMD 0x51,0x0A09	
	LCD_CMD 0x52,0x0005	
	LCD_CMD 0x53,0x1021	
	LCD_CMD 0x54,0x0602	
	LCD_CMD 0x55,0x0003	
	LCD_CMD 0x56,0x0703	
	LCD_CMD 0x57,0x0507	
	LCD_CMD 0x58,0x1021	
	LCD_CMD 0x59,0x0703	
	LCD_CMD 0xB0,0x2501	
	LCD_CMD 0xFF,0x0000	
	LCD_CMD 0x07,0x1017
	SET LCD_MPIN,   (0x22)<<3
Since it's not turing-complete, it calls functions to do things it can't do itself.
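.func poll
.align 4
poll:				// r0 = address to poll
	movs r2, #1
1:
	ldr r1, [r0]
	ands r1, r2
	beq 1b			// loop while bit 0 is clear
	bx lr
.endfunc

.func wait
.align 4
wait:				// r0 = outer loop count
	movs r1, #10
1:
	subs r1, #1
	bne 1b
	subs r0, #1
	bne wait
	bx lr
.endfunc

.macro WAIT_MS MS:req
	CALL wait, \MS
.endm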
It also calls functions for things that would take up more space in bytecode than in native code.
.func lcdCmd
.align 4
lcdCmd: 				// r0 = cmd<<16 + arg
	lsrs r1, r0, #16 		// r1 = cmd
	uxth r0, r0 	 		// remove cmd from r0

	lcd_clr_cd r3, r4		// CLR_CD. r3 = pin, r4 = CLR0

	// MPIN[2] = CMD<<3
	lsls r1, r1, #3
	str r1, [r2, #8]
	movs r1, r2			// r1 = MPIN
	ldr r3, =(1<<LCD_WR_PIN)	// CLR_WR
	str r3, [r4, #4]		// [r4, 4] is CLR1. LCD_WR_PORT

	lsls r0, r0, #3			// data = data<<3
					// SET_WR
	ldr r2, =LPC_GPIO_PORT_SET0
	str r3, [r2, #4]		// [r2, 4] is SET1. LCD_WR_PORT
	movs r3, (1<<LCD_CD_PIN)
	str r3, [r2, #0]		// SET_CD. [r2, 0] is SET0.
	str r0, [r1, #8]		// MPIN[2] = data (r0)
	ldr r3, =(1<<LCD_WR_PIN)	// CLR_WR
	str r3, [r4, #4]		// [r4, 4] is CLR1. LCD_WR_PORT

					// SET_WR
	ldr r2, =LPC_GPIO_PORT_SET0
	str r3, [r2, #4]		// [r2, 4] is SET1. LCD_WR_PORT

	bx lr

.macro LCD_CMD CMD:req, ARG:req
	.word lcdCmd+1
	.word (\CMD<<16) | (\ARG)
.endm

That bytecode does all the system and LCD setup, leaving things in a ready-to-go state for the rest of the app. Still not sure if it’s a good idea, I haven’t compared it to the current C++ implementation to see if it actually saves space. All I know is: it works. The final bin does everything in system_11U6x.cpp, initializes SPI, initializes the LCD, then writes to the screen in mode 1 (code not included above) while taking up about 1.5kb in total.
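
For anyone who finds the assembly hard to follow, here is a portable C model of the same interpreter idea: pairs of words, with the low two bits of the first word selecting SET, CALL, OR or AND, and a zero word ending the program. It is a simulation, not the code on the device: real addresses are replaced by indexes into a `regs` array and a `functions` table, and names like `Machine`, `run_bytecode` and `fake_wait` are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { OP_SET = 0, OP_CALL = 1, OP_OR = 2, OP_AND = 3 };

typedef void (*NativeFn)(uint32_t arg);

/* Simulated address space: byte address 4*i maps to regs[i] for data
   and to functions[i] for calls. On the real chip these are plain
   addresses and the CALL opcode bit doubles as the Thumb bit. */
typedef struct {
    uint32_t *regs;
    const NativeFn *functions;
} Machine;

/* Helpers mirroring the assembler macros. */
#define BC_SET(addr, val)   ((addr) | OP_SET), (val)
#define BC_CALL(index, val) (((index) << 2) | OP_CALL), (val)
#define BC_OR(addr, val)    ((addr) | OP_OR), (val)
#define BC_AND(addr, val)   ((addr) | OP_AND), (val)
#define BC_END              0

static void run_bytecode(Machine *m, const uint32_t *program)
{
    assert(m != NULL && program != NULL);
    for (;;) {
        uint32_t target = *program++;
        if (target == 0) return;        /* END */
        uint32_t value = *program++;
        uint32_t index = target >> 2;   /* strip the opcode bits */
        switch (target & 3u) {
        case OP_SET:  m->regs[index] = value;      break;
        case OP_CALL: m->functions[index](value);  break;
        case OP_OR:   m->regs[index] |= value;     break;
        case OP_AND:  m->regs[index] &= value;     break;
        }
    }
}

/* Stand-in for the native wait routine: it only records the request. */
static uint32_t wait_total;
static void fake_wait(uint32_t ms) { wait_total += ms; }
```

Note that a SET to address 0 would be indistinguishable from END; in this model that just means `regs[0]` cannot be written with SET.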

That explanation makes the concept much clearer!

That’s handy!

Absolutely. I cannot figure out any downsides compared to a current implementation, but just huge advantages :slight_smile:

IIRC it is not currently supported, but I don’t think it would be hard to add. The whole Python program is just a huge string. However, in that case MP needs to compile the script to bytecode and save it to RAM (including bitmaps etc.), so we are quite limited by the free RAM size (the MP interpreter needs quite a lot of stack too).

I’ve got the recipe for that.

  • 1 (18.25-ounce) package chocolate cake mix
  • 1 can prepared coconut–pecan frosting
  • 3/4 cup vegetable oil
  • 4 large eggs
  • 1 cup semi-sweet chocolate chips
  • 3/4 cup butter or margarine
  • 1 2/3 cup granulated sugar
  • 2 cups all-purpose flour
  • 1 tsp. vanilla extract
  • 2/3 cup cocoa powder
  • 1 1/4 tsp. baking soda
  • 1 tsp. salt
  • 1/4 tsp. baking powder
  • 1 to 2 (16 ounces each) cans vanilla frosting
  • A 20-foot thick impermeable clay layer

I agree.

I think .pbe would be better - Pokitto Binary Executable.

Call it ‘app’ anything and I’m suing.
‘app’ is a horribly overused word.

Recommended reading:

I think it depends whether you want plugins to be able to do arbitrary execution or not.
If you wanted to limit plugins to a subset of commands and speed wasn’t a concern then bytecode might be a viable approach.

I think .plp - Pokitto Loader Plugin.

The interpreter would be tiny but the programs would be unrealistically huge.
If an esoteric language was going to be used, something stack-based like FALSE would be a better choice.

Do you think a flavor of FALSE would be the way to go? Looking into it, it’s 1kb for the compiler but apparently it’s all assembly

Could we do a bytecode version of that?

That’s 1KB for the original x86 compiler, which also does zero error handling, so you can’t go by that.
The size of the code ported to ARM would be different, especially if we added a safety net to prevent it doing dangerous things.

A bytecode variation of FALSE would be pretty easy though, I write them for fun when I get bored :P
Technically FALSE is already bytecode-based,
and the instructions it uses are conveniently also printable characters.

I actually already have a basic stack-based bytecode language implemented because I was going to use it as part of a Pokitto project I didn’t get round to doing because I never got round to ordering that 3.3v-5v bridge.
It would need some modifying to be applicable to this loader though.

That recipe can’t be right, there’s nothing fish-shaped in it. :confused:

The list of acceptable ‘garnishes’ is twice the size of the actual recipe.
I thought I’d save people the scrolling.

(Personally my favourite garnish is ‘1 cup lemon juice’ - preferably from a combustible lemon.)

My intent in the design is, in order to minimize flash usage, whenever possible, everything on flash should serve more than one purpose:

  • The FS API will be used by the loader and games that support it.
  • The initialization routine, which is in a sort of bytecode, could also be used by the loader and games. The interpreter for that is just 32 thumb instructions.
  • The PEX API will be used by the loader to load plugins and to load the actual games. It can also be used by games that want dynamic code loading. If no games want that, at least the parser can be trivially small.

What sort of idea do people have at this point? Something similar to the original loader mockups, or something completely different?