Improving FPS

That still shows a little noise. I’m sure most people won’t mind though.

I’m assuming the SD library is my main bottleneck in the following experiment…
mode13_stream.bin (63.3 KB)

movie.zip (2.5 MB)

Just unzip movie.dat to the SD card root, load up mode13_stream.bin, and have your socks blown off by the incredible 8fps silent movie!

PFFS is much faster than SDFS. Which one are you using?

Whichever one is default in the current pokittolib?

Depends on the API you are using:

  • fopen, fclose etc. are using SDFS.
  • PokittoDisk.h (FileOpen(), FileClose()) is using PFFS.

Currently fopen etc.

I previously tested the speed of both file systems by reading a 200 KB file in 1 KB blocks:

  • PetitFatFS: 264 KB/s
  • SDFileSystem: 80 KB/s

I am using a Samsung 1 GB microSD card in the Pokitto. The speed might also depend on the SD card.
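A block-read benchmark like the one described above could be sketched roughly as follows. This is a host-side sketch using only the standard C and C++ libraries, not the code that produced the numbers in this thread. The names `ReadResult`, `timeBlockReads` and `kilobytesPerSecond` are made up for illustration, and on the Pokitto itself the clock would probably need to be swapped for whatever timer the platform provides.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical result type: how much was read and how long it took.
struct ReadResult {
    std::size_t bytesRead;  // total bytes actually read
    double seconds;         // wall-clock time spent in the read loop
};

// Reads the whole file in fixed-size blocks and times the loop.
// Returns {0, 0.0} if the file cannot be opened.
ReadResult timeBlockReads(const char* path, std::size_t blockSize) {
    assert(blockSize > 0);
    ReadResult result{0, 0.0};

    std::FILE* file = std::fopen(path, "rb");
    if (!file) return result;

    std::vector<unsigned char> block(blockSize);
    const auto start = std::chrono::steady_clock::now();
    std::size_t got = 0;
    while ((got = std::fread(block.data(), 1, block.size(), file)) > 0) {
        result.bytesRead += got;
    }
    const auto stop = std::chrono::steady_clock::now();
    std::fclose(file);

    result.seconds = std::chrono::duration<double>(stop - start).count();
    return result;
}

// Throughput in KB/s, taking 1 KB as 1024 bytes; 0 if no time was measured.
double kilobytesPerSecond(const ReadResult& r) {
    return r.seconds > 0.0 ? (r.bytesRead / 1024.0) / r.seconds : 0.0;
}
```

Calling `timeBlockReads("movie.dat", 1024)` and passing the result to `kilobytesPerSecond` gives a figure comparable in spirit to the ones above, although host numbers say nothing about the card or the file system on the device.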

I’ve never (knowingly) used PFFS before. Have I converted the following correctly? It doesn’t seem to be working…

[code]
int main(){
    game.begin();
    game.display.persistence=1;
    game.setFrameRate(999);

    int temp;
    pokInitSD(); // Call init always.

    while (game.isRunning()) {

        // FILE *handle = fopen("/sd/movie.dat", "rb");
        // if (handle){
        if (fileOpen("/sd/movie.dat",FILE_MODE_BINARY)) {

            unsigned char col[3];
            uint16_t tempPal[256];
            for(temp=0; temp<256; temp++){
                // fread(&col[0], sizeof(char), 3, handle);
                fileReadBytes(&col[0], 3);
                pal[temp] = (col[0]>>3) | ((col[1] >> 2) << 5) | ((col[2] >> 3) << 11);
            }
            game.display.load565Palette(&pal[0]); // load a palette the same way as any other palette in any other screen mode
            bool stillGoing=1;
            while(stillGoing==1){
                if(game.update()){
                    // if(!fread(&game.display.screenbuffer[0], 1, 110*88, handle))stillGoing=0;
                    if(!fileReadBytes(&game.display.screenbuffer[0], 9680))stillGoing=0;
                }
            }
            fileClose();
            // } // if handle

        } // file open

    }

    return 1;
}[/code]
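The palette loop above packs three 8-bit channel bytes into one 16-bit 5-6-5 value. A standalone sketch of that step, kept free of any Pokitto calls so it can be checked on a PC, might look like this. The names `pack565` and `convertPalette` are hypothetical, and the channel order (first byte in the low bits, third byte in the high bits) is simply copied from the expression in the posted code.

```cpp
#include <cassert>
#include <cstdint>

// Packs three 8-bit channels into one 16-bit 5-6-5 value using the same
// layout as the posted loop: first byte in bits 0-4, second byte in
// bits 5-10, third byte in bits 11-15.
constexpr std::uint16_t pack565(std::uint8_t first, std::uint8_t second, std::uint8_t third) {
    return static_cast<std::uint16_t>((first >> 3) | ((second >> 2) << 5) | ((third >> 3) << 11));
}

// Converts a palette stored as consecutive 3-byte entries into 565 values.
// src must hold entries * 3 bytes and dst must hold entries values.
void convertPalette(const std::uint8_t* src, std::uint16_t* dst, int entries) {
    assert(src != nullptr && dst != nullptr);
    for (int i = 0; i < entries; ++i) {
        dst[i] = pack565(src[3 * i], src[3 * i + 1], src[3 * i + 2]);
    }
}
```

With the conversion separated like this, the 768 palette bytes could be fetched with a single read into a buffer instead of 256 three-byte reads, assuming the file API in use accepts a read of that size.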

… to make this a useful idea, I should probably think about reading less data. Sound might be an issue also.

Looks ok to me.

you have sound?

weird, I’m just getting a green screen at 5fps…

sound… Nope, wouldn’t have a clue where to start on the pokitto. My DS version of this used jpg for the images and raw wav for sound (interleaved with the images). It worked quite well.

[edit] got it working, now at 21fps! Time to think about sound I guess.
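For a rough sense of how the frame rates relate to the read speeds quoted earlier: each frame is 110 × 88 = 9680 bytes, so 8 fps needs 77,440 bytes per second and 21 fps needs 203,280 bytes per second. Those figures sit just under the roughly 80 and 264 kilobytes per second measured for SDFileSystem and PetitFatFS respectively. The sketch below only restates that arithmetic. The constant and function names are made up for illustration, and it ignores the one-off palette read and the time spent pushing pixels to the LCD.

```cpp
#include <cstdint>

constexpr std::uint32_t kFrameWidth  = 110;
constexpr std::uint32_t kFrameHeight = 88;
// One byte per pixel, as in the screen buffer read in the posted code.
constexpr std::uint32_t kFrameBytes  = kFrameWidth * kFrameHeight;

static_assert(kFrameBytes == 9680, "matches the read size used in the posted code");

// Bytes per second the card must deliver to sustain a given frame rate.
constexpr std::uint32_t bytesPerSecond(std::uint32_t fps) {
    return fps * kFrameBytes;
}

// Highest whole frame rate that a given throughput (in bytes/s) can feed.
constexpr std::uint32_t maxFps(std::uint32_t throughputBytesPerSecond) {
    return throughputBytesPerSecond / kFrameBytes;
}
```

By this estimate, file throughput alone would cap playback at about 8 fps on the slower figure and about 27 fps on the faster one, which is in line with the 8 fps and 21 fps reported in this thread.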

https://t.co/NDrlTQbO0s

I tried two different cards, got ~19fps with your demo. Looks good enough for cutscenes in games!
To get rid of the noise completely, I think you need to replace the s in TGL_WR(s) with something that would cost a cycle. I’m not sure if an inline asm NOP is best. Maybe something like this?
*LCD = *s; TGL_WR(s+=2);TGL_WR(s--);
I haven’t tested it, you might have to pick between noise and speed.

I think I’d prefer an inline nop since the intent is clearer.
Trying to abuse arithmetic is just going to confuse people.

I would too, but it seems the inline asm confuses the compiler, which is worse. I haven’t actually checked the disassembly, but using a NOP gives an unexpected hit to the FPS.

Hrm, possibly because it acts as a sort of memory barrier preventing the rearranging of instructions?

Does the pause have to be just one cycle?
I’m wondering if calling an empty function marked with __attribute__((noinline)) would work or if a function call would be too many cycles.

Or perhaps doing:

volatile int nop __attribute__((unused)) = 0;
++nop; // Optional

Would force GCC to create and use the variable?
It might cause similar issues though if GCC ends up having to dump a register it’s using.

Or perhaps:

// global
static int nopVar = 0;
static inline void __attribute__((always_inline)) nop(void)
{
  ++nopVar;
}

I think the fact it’s a global variable should mean that the compiler can’t optimise the operation away,
but it could probably still reorder it.
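One caveat on the variable-based ideas: a plain `static` variable that is only ever incremented and never read elsewhere may be removed entirely by the optimiser, whereas a `volatile` one forces each access to be emitted. The sketch below puts the two delay styles side by side for comparison. The helper names are hypothetical and not part of PokittoLib, the asm syntax is GCC/Clang specific, and neither variant is guaranteed to cost exactly one cycle (the volatile version needs a load, an add and a store).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only.

// Option 1: a volatile counter. Every access must be emitted by the
// compiler, so the increment cannot be dropped or merged with another one.
static volatile std::uint32_t delayCounter = 0;

static inline void delayViaVolatile() {
    delayCounter = delayCounter + 1;
}

// Option 2: an inline asm nop. 'volatile' asks the compiler to keep the
// statement even though it produces no outputs.
static inline void delayViaNop() {
    __asm__ volatile("nop");
}

// Usage sketch: in a pixel loop either helper would sit between the
// data write and the write-strobe toggle.
static void delayExample() {
    const std::uint32_t before = delayCounter;
    delayViaVolatile();
    delayViaNop();
    delayViaVolatile();
    assert(delayCounter == before + 2);
}
```

Which of the two disturbs the surrounding loop less is something only the disassembly and an FPS measurement can settle.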

I haven’t looked into the rules regarding GCC and inline asm on ARM yet, so I have no idea. Maybe it considers a set of registers to get clobbered and has to reload them?
It doesn’t have to be one cycle, but it adds up.
We’re getting to a point where the high-level language is getting in the way and writing these routines in assembly would be best, both in terms of performance and legibility.

Alignment is another possibility. I remember reading that if the function/loop is not aligned you get padding.

The fundamental rules of inline asm on GCC are the same for all chips, what changes is the registers, the available instructions and the register aliases, e.g. on AVR r means any register and d means an upper register.

It shouldn’t clobber a register unless you explicitly state it to be an output operand or to be clobbered.
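To make the operand and clobber rules concrete, here is a small sketch of three forms of GCC inline asm. It is only an illustration under the assumption of a GCC-compatible compiler, and the function names are invented. GCC documents that an asm statement with no output operands is implicitly volatile, and that a "memory" clobber tells the compiler that the statement may read or write memory, so values cannot be kept cached in registers across it.

```cpp
#include <cassert>

// Basic asm: no operand or clobber lists at all.
static inline void plainNop() {
    __asm__("nop");
}

// Extended asm with no operands and an explicit "memory" clobber.
// This also acts as a compiler-level barrier for memory accesses,
// which can change how the surrounding loop is scheduled.
static inline void nopWithCompilerBarrier() {
    __asm__ volatile("nop" : : : "memory");
}

// Extended asm with an empty template and one read-write register operand.
// No instruction is emitted, but the compiler must materialise 'value'
// in a register at this point and treat it as possibly modified.
static inline int keepInRegister(int value) {
    __asm__ volatile("" : "+r"(value));
    return value;
}

// Usage sketch.
static void asmExample() {
    plainNop();
    nopWithCompilerBarrier();
    assert(keepInRegister(42) == 42);
}
```

Comparing the generated code for the first two variants is one way to check whether the FPS drop comes from the nop itself or from the constraints the asm statement places on the optimiser.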

Considering this is rendering code, I’m inclined to agree.
As long as the external interface is C++ friendly,
the implementation can be pure assembly for speed’s sake,
providing it’s well-commented.

Functions do have a memory alignment, but if it pads I would assume it’s just padding with more nops,
which shouldn’t be a major issue.

But it would be noticeable in code where execution speed is this tight.

ARM instructions are 32 bits wide, but I can’t find any information about function alignment.

I found a thing discussing the alignment of structs which said that the largest field being int causes 4-byte alignment and the largest field being long long causes 8-byte alignment, but nothing beyond that.

I’m erring on the side of memory barriers/formal no-tampering rules though; see this SO question, which demonstrates how something as simple as asm("# im in ur loop"); can have an impact because of the side effects of the presence of an asm block (note that the asm block here is implicitly asm volatile).

Oh yeah, did I already mention that inline asm cannot be used with the mbed online IDE :wink: ?

The Pokitto uses Thumb, not the full 32-bit ARM IS, so it’s just 16 bits wide. I vaguely remember the TRM mentioning 4-byte instruction alignment for certain ops.

What do you mean? You’re using inline asm in the PokittoLib:

#define CLR_WR { LPC_GPIO_PORT->CLR[LCD_WR_PORT] = 1 << LCD_WR_PIN;__asm("nop");}//__asm("nop");}//