[Suggestion]Future of PokittoLib - v2.0

crait · July 1, 2018, 6:19am

The PSP has a folder for saved data and a folder for games. Each game has its own folder inside of each. Applying that to this, maybe the saved data folder is public, but game folders are private? The author of the game can have it save to either the game folder or to the public savedata area.

Pharap · July 2, 2018, 4:10am

I’ll make a note to look into it.

Having a separate shared folder and private folder per game sort of makes sense.

If we went with that though, I think it would make sense to store metadata about a game in the shared folder and make it so both the shared and private folders are only writable by the game that owns them, but the shared folder for each game is visible to all games.

That would mean the shared folder could function as a way of signalling the presence of a game so people can do cool tricks like adding special ‘DLC’ if you’ve played both games.
(Like how TF2 gives you cool hats if you preorder other games.)

Pharap · July 2, 2018, 12:31pm

I attempted to get the PokittoLib compiling in Code::Blocks today.

Via manual editing I managed to get within 50 errors of compiling.
Via @fmanga’s makefile I got “recipe for target all failed”.

It’s a start I guess.

jonne · July 2, 2018, 12:32pm

The ARM hardware target you mean ?

Pharap · July 2, 2018, 12:33pm

Yep, trying to compile the actual ARM-targetting code through Code Blocks using the same compiler EmBitz uses.

Pharap · July 2, 2018, 12:41pm

Down to just one error: “unrecognized command lin option ‘-mwindows’”.

Never mind, solved that one.

FManga · July 15, 2018, 2:19pm

Instead of doing v2.0 as a clean-slate rewrite that would take a lot of unavailable time and energy… how about we simply introduce the concept of “interfaces” to v1.0 and gradually move the code out of the core?
The screen modes would still be in core, but if reduced to just 4 (Modes 0, 1, 2, 13) it’s not so bad.
Interfaces would then be responsible for providing adaptors for Gamebuino/Arduboy compatability, and PokittoStarter.
You might have noticed that tiled mode is absent from the list of modes I mentioned above. Instead of trying to figure out what an ideal tile mode is supposed to be, I propose we rename what we have now to better reflect what it is: a mode with no framebuffer. Following the advice that “easy things should be easy, and hard things should be possible”, mode zero becomes simply the low-level mode that allows more direct access to the LCD. Tile-based engines of differing styles and restrictions can build on top of that in an interface or a separate library. This way it also becomes the logical mode to use for things that don’t need a built-in framebuffer (Gameboy/SegaVMU/Bitsy).

One problem left to solve is text: how can something that uses mode 0 provide its own text-rendering so that the existing print functions continue working? And how do we fix the FPS printer in all the modes without breaking optimizations in the process?

Hanski · July 15, 2018, 2:39pm

I have already fixed the FPS printer for the optimized mode 2. Same way it can be done for all the buffered modes, i.e. skip copying top 8 scanlines. I did not notice any difference in real FPS value, with or without the FPS printer ( because the value is rounded to an integer).

spinal · July 15, 2018, 4:04pm

I was just thinking about the screen modes today. I did a quick test and although I can’t quite get a recreation of mode1 working correctly, the idea is there.
Basically, merge all of the buffered screen modes into 1 single mode.
Like this
mixMode.bin (67.2 KB)
Press ‘A’ to cycle through standard mode 13, mode 1 (not fully working) and a new mode based on the c64 wide pixel mode, with 110x176x16.

That way, people can change screen mode at run time to have say, a hi-res lo-colour title screen and lo-res hi-colour gameplay or whatever.

jonne · July 15, 2018, 4:09pm

I love, absolutely love the wide-pixel idea

jonne · July 15, 2018, 4:28pm

@spinal

Dynamic screen mode brings a host of problems, biggest of them are:

memory management. How do you do it? You would always have to prepare for the worst case (19kB). Allocating dynamically is slower & can bring runtime stack overflow issues that are hard to debug
mode detection overhead. Every setpixel and other operation needs to check which mode we currently are using

spinal · July 15, 2018, 4:34pm

I was just thinking that at least some of the buffered modes all use the same size buffer, which is why I chose the modes in the demo for testing. so that wouldn’t be an issue. But for the mode detection, that would really only be done for rendering to lcd. Ideally the bitmap functions should be redone so that the image itself would know its own bitdepth (as well as its width/height) so that any image can be drawn to any buffered screen mode, even if that needed to be done manually by the coder, drawBitmap(x,y,image,bitdepth); for example.

jonne · July 15, 2018, 4:39pm

I understand your idea. But for example the setPixel function is used alot. How do you get away from the problem, that if bit depth is variable, setPixel needs to know, per pixel, the current bit depth. With a pre-processor setting there is zero overhead.

Edit: this was a design choice from the beginning. I didn’t know someone would be able to speed up the screen by a factor of ten.

Pharap · July 15, 2018, 5:09pm

I like the idea of a specific mode with no framebuffer that the other modes could perhaps be built on top of.

I think printing should be done through separate classes that in turn rely on different screenmode functions.

We could have a common intreface that all printing classes have, similar to Arduino’s Print class.

This I’m not so on-board with.

I think the different screen modes can and should be separated.

If we went down the ‘try to adapt the existing library’ approach then I think the easiest way would be to have a setup like:

#pragma once

#include <cstdint>
#include <cstdlib>

enum class ScreenModeId
{
	Hires,
	Lores,
};

template< ScreenModeId Mode > class ScreenMode;

// Possibly in a separate file

class ScreenMode<ScreenModeId::Hires>
{
public:
	constexpr static size_t Width = 220;
	constexpr static size_t Height = 176;
	constexpr static size_t BitsPerPixel = 2;
	constexpr static size_t PaletteSize = 4;

private:
	static char buffer[(Width * Height / 8) * BitsPerPixel];
	
	static uint16_t palette[PaletteSize];
	
public:

	static void setPixel(size_t x, size_t y, size_t colourIndex);
	
	static void setPaletteColour(size_t colourIndex, uint16_t colourRgb565);
};

// Possibly in a separate file

class ScreenMode<ScreenModeId::Lores>
{
public:
	constexpr static size_t Width = 110;
	constexpr static size_t Height = 88;
	constexpr static size_t BitsPerPixel = 4;
	constexpr static size_t PaletteSize = 16;

private:
	static char buffer[(Width * Height / 8) * BitsPerPixel];

	static uint16_t palette[PaletteSize];
	
public:

	static void setPixel(size_t x, size_t y, size_t colourIndex);
	
	static void setPaletteColour(size_t colourIndex, uint16_t colourRgb565);
};

And then the conditional code for Display is just:

#if !defined(POKITTO_SCREEN_MODE)
#define POKITTO_SCREEN_MODE (POKITTO_SCREEN_MODE_HIRES)
#endif

#if (POKITTO_SCREEN_MODE == POKITTO_SCREEN_MODE_HIRES)
using Display = ScreenMode<ScreenModeId::Hires>;
#elif (POKITTO_SCREEN_MODE == POKITTO_SCREEN_MODE_LORES)
using Display = ScreenMode<ScreenModeId::Lores>;
#endif

I’d have to double check but I’m reasonably sure that by doing it that way the static variables that aren’t used won’t be allocated.

The important benefit of doing is this way is that what the end user gets is a simple single unified interface for all the buffered raster screenmodes:

Display::setPaletteColour(0, 0x0000);
Display::fillScreen(0);
Display::setPaletteColour(1, 0xFFFF);
Display::drawPixel(12, 14, 1);
Display::present();

Obviously this is only extensible when you have access to the source though, so it wouldn’t be good for library writers.

It would be possible to make it more easy for library writers to add more screen modes by templating the Pokitto class implementation, but it would mean more templates so I’m not sure how people will react to that.

In general I get the idea that templates scare people and I have no idea why :P

Everything @jonne said and more.

Computers usually achieve this with dynamic allocation or allocating the worst-case-scenario sized buffer (or by having memory built into the GPU).
The Pokitto doesn’t have the luxury of gigabytes of RAM, so heap fragmentation is a real problem.

If the bitdepth isn’t precomputed, that adds overhead for calculating the indices and shifts.
Possibly not much if done right, but also possibly enough to be noticable.

If you did go down that route then the best approach would be to use inheritance and put up with the overhead of virtual functions.

(Which reminds me of an alternate way to tackle the screenmode API that I mentioned earlier.)

jonne · July 15, 2018, 5:20pm

I’d like to hear in particular Felipe’s opinion in this.

There are two different issues at stake.

Dynamic screenbuffer allocation. My opinion is absolutely not. This introduces the possibility of such difficult memory bugs, that users with no hardware debugger will never know what hit them when the program hangs. With a hardware debugger, you can see the type of hard fault generated.
Bit depth switch overhead for drawing primitives. Functions like print will be hit hard by this, because for example in the case of the tracker program, there are hundreds of characters, all using the setPixel function. This would require rewriting all graphics primitives to be independent of setPixel. @FManga, opinions?

FManga · July 15, 2018, 5:39pm

I wouldn’t use any dynamic allocation at all on the Pokitto. But, as @spinal said, if all of the merged modes need a same-sized buffer, there wouldn’t be any need to put it in the heap.

The second issue is a bit harder. Ideally, from the performance standpoint, the drawing primitives shouldn’t be using setPixel. There should be a “setPixelRaw” that has no ifs inside. Functions like drawRectangle would do the screen-bounds and bpp checking before entering their loops and setPixelRaw would trust everything to be resolved by that point. In this scenario, setPixelRaw would be a function pointer to setPixelRaw1bpp, setPixelRaw2bpp, etc. and it would actually be faster than what we have now. On the other hand, it brings two problems: implementing this would take a lot of work, and it takes up a lot more flash space, even if you only use one of the merged modes.

Honestly, I don’t think the feature would be used enough to justify the cost.

Edit: The above only applies to the current lib. For a rewrite, it’s going to be a lot of work anyway, so might as well.

Pharap · July 15, 2018, 5:52pm

Even if they didn’t need a buffer of the same size, they could all make use of the same buffer, but the buffer would have to be the largest possible.

The alternate way is:

// Hires.h
#pragma once

#include <cstdint>
#include <cstdlib>

class HiresScreenMode
{
public:
	constexpr static size_t Width = 220;
	constexpr static size_t Height = 176;
	constexpr static size_t BitsPerPixel = 2;
	constexpr static size_t PaletteSize = 4;

private:
	uint16_t palette[PaletteSize];
	char buffer[(Width * Height / 8) * BitsPerPixel];	
	
public:

	void setPixel(size_t x, size_t y, size_t colourIndex);
	
	void setPaletteColour(size_t colourIndex, uint16_t colourRgb565);
};

// Lores.h
#pragma once

#include <cstdint>
#include <cstdlib>

class LoresScreenMode
{
public:
	constexpr static size_t Width = 110;
	constexpr static size_t Height = 88;
	constexpr static size_t BitsPerPixel = 4;
	constexpr static size_t PaletteSize = 16;

private:
	uint16_t palette[PaletteSize];
	char buffer[(Width * Height / 8) * BitsPerPixel];
	
public:

	void setPixel(size_t x, size_t y, size_t colourIndex);
	
	void setPaletteColour(size_t colourIndex, uint16_t colourRgb565);
};

// StaticScreenMode.h

template<typename Implementation>
class StaticScreenMode
{
private:
	static Implementation implementation;
	
public:
	static void setPixel(size_t x, size_t y, size_t colourIndex)
	{
		this->implementation.setPixel(x, y, colourIndex);
	}
	
	static void setPaletteColour(size_t colourIndex, uint16_t colourRgb565)
	{
		this->implementation.setPaletteColour(colourIndex, colourRgb565);
	}	
};

// Pokitto/Display.h

#if (POKITTO_SCREEN_MODE == POKITTO_SCREEN_MODE_HIRES)
using Display = StaticScreenMode<HiresScreenMode>;
#elif (POKITTO_SCREEN_MODE == POKITTO_SCREEN_MODE_LORES)
using Display = StaticScreenMode<LoresScreenMode>;
#else
using Display = void;
#endif

And then if people really wanted to:

// AbstractScreenMode.h
class AbsstractScreenMode
{
public:
	virtual ~AbsstractScreenMode(void) {}

	virtual void setPixel(size_t x, size_t y, size_t colourIndex) = 0;
	
	virtual void setPaletteColour(size_t colourIndex, uint16_t colourRgb565) = 0;
};

// DynamicScreenMode.h

template<typename Implementation>
class DynamicScreenMode
{
private:
	static Implementation implementation;
	
public:
	
	void setPixel(size_t x, size_t y, size_t colourIndex) override
	{
		this->implementation.setPixel(x, y, colourIndex);
	}
	
	void setPaletteColour(size_t colourIndex, uint16_t colourRgb565) override
	{
		this->implementation.setPaletteColour(colourIndex, colourRgb565);
	}	
};

So then it would be possible to do things like:

AbstractScreenMode currentScreenMode;

currentScreenMode = new DynamicScreenMode<HiresScreenMode>();

currentScreenMode->drawPixel(0, 0, 0);
currentScreenMode->present();

delete currentScreenMode;

currentScreenMode = new DynamicScreenMode<LoresScreenMode>();

currentScreenMode->drawPixel(0, 0, 0);
currentScreenMode->present();

delete currentScreenMode;

And all the expensive stuff would be opt-in.

FManga · July 15, 2018, 6:06pm

I meant short-term. At the very least we could move the mode-specific code into separate cpp files and use a single #ifdef per file instead of functions with a bunch of ifdefs inside. This could be done right now without breaking anything.

I’d prefer the template version for a clean-slate v2.0. Both ways allow for a merged mode (move the buffer out of the class and into the ifdef).

FManga · July 18, 2018, 1:48pm

I sent a PR with a modified implementation of bufferChar. It is related to my previous post above, but is useful even if we don’t implement the merged mode. For those who are used to other processors it might be a bit counter-intuitive that it is faster this way. The concept can be applied to other drawing functions.