Improving FPS

The fundamental rules of inline asm on GCC are the same for all chips. What changes is the registers, the available instructions and the register constraint letters, e.g. on AVR r means any register and d means an upper register (r16 to r31).

It shouldn’t clobber a register unless you explicitly state it to be an output operand or to be clobbered.
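As a rough illustration of the operand rule, here is a minimal sketch assuming GCC or Clang extended asm. The template string is left empty so it builds on any target, and `passThroughRegister` is a made-up name, not something from the PokittoLib.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only.
// The empty template emits no instructions. The "+r" constraint declares
// `value` as a read-write operand held in a general-purpose register, so the
// compiler knows this statement may change `value`. Registers that are not
// listed as operands or clobbers are assumed to be left untouched.
inline std::uint32_t passThroughRegister(std::uint32_t value) {
    asm volatile("" : "+r"(value));
    return value;
}
```

With real instructions in the template, anything the asm modifies beyond its declared outputs would need to go in the clobber list (the part after the third colon).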

Considering this is rendering code, I’m inclined to agree.
As long as the external interface is C++ friendly,
the implementation can be pure assembly for speed's sake,
provided it's well-commented.

Functions do have a memory alignment, but if it pads I would assume it’s just padding with more nops,
which shouldn’t be a major issue.

But it would be noticeable where execution speed is tight.

ARM instructions are 32 bits wide, but I can’t find any information about function alignment.

I found a thing discussing the alignment of structs which said that the largest field being int causes 4-byte alignment and the largest field being long long causes 8-byte alignment, but nothing beyond that.
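The struct rule can be checked at compile time with `alignof`. This is only a sketch with made-up struct names, and it says nothing about function alignment, which is a separate question.

```cpp
#include <cstddef>

// Hypothetical structs, for illustration only.
struct WithInt      { char tag; int value; };
struct WithLongLong { char tag; long long value; };

// A struct takes the alignment of its most strictly aligned member.
static_assert(alignof(WithInt) == alignof(int), "int member sets the alignment");
static_assert(alignof(WithLongLong) == alignof(long long), "long long member sets the alignment");

// The size is padded up to a multiple of the alignment so arrays stay aligned.
static_assert(sizeof(WithInt) % alignof(WithInt) == 0, "size is a multiple of alignment");
static_assert(sizeof(WithLongLong) % alignof(WithLongLong) == 0, "size is a multiple of alignment");
```

The asserts compare against `alignof(int)` and `alignof(long long)` rather than hard-coding 4 and 8, since the exact numbers depend on the target ABI.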

I’m erring on the side of memory barriers/formal no-tampering rules though, see this SO question that demonstrates how something as simple as asm("# im in ur loop"); can have an impact because of the side effects of the presence of an asm block (note that the asm block here is implicitly asm volatile).
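For reference, the usual GCC/Clang way to spell a compiler-level memory barrier is an empty asm statement with a "memory" clobber. A minimal sketch follows; the names are invented for the example, and this only constrains the compiler, it is not a hardware barrier instruction.

```cpp
// Hypothetical names, for illustration only.
int frameCounter = 0;

// No instruction is emitted, but the "memory" clobber tells the compiler that
// any memory may have been read or written here, so it cannot keep
// `frameCounter` cached in a register or move the store across this point.
inline void compilerBarrier() {
    asm volatile("" ::: "memory");
}

inline int bumpAndReload() {
    frameCounter = frameCounter + 1;
    compilerBarrier();
    return frameCounter;  // reloaded from memory after the barrier
}
```

Even though the asm body is empty, its presence changes what the optimizer is allowed to do, which is the effect the SO question is pointing at.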

Oh yeah, did I already mention that inline asm cannot be used with the mbed online IDE :wink:?

The Pokitto uses Thumb, not the full 32-bit ARM instruction set, so instructions are just 16 bits wide. I vaguely remember the TRM mentioning 4-byte instruction alignment for certain ops.

What do you mean? You’re using inline asm in the PokittoLib:

#define CLR_WR { LPC_GPIO_PORT->CLR[LCD_WR_PORT] = 1 << LCD_WR_PIN;__asm("nop");}//__asm("nop");}//

Right, I didn’t know that. I’ve added that to the technical spec thread since that’s probably important.

If @jonne tried asm volatile and it didn’t work, maybe it’s just __asm that works on mbed online?

Apparently:

When writing code that can be compiled with -ansi and the various -std options, use __asm__ instead of asm (see Alternate Keywords).

Though __asm__ isn’t the same as __asm so maybe __asm is an mbed thing?

How did you fix it?

It seems fileOpen returns false if it works, not true. So I was only attempting to render if the file didn’t open.

fileOpen doesn’t return bool, it returns a uint8_t, and signals success with 0 and failure with 1.

This is one of the many reasons I hate implicit conversion between integer types and bool. If the conversion were explicit, you’d have got a compiler error telling you that you were trying to put an integer expression in an if instead of a boolean expression.

At least that seems to be the case.
On the simulator it all seems to work fine:

But the hardware code is more than a little confusing.
It’s full of comments and empty functions:

https://github.com/pokitto/PokittoLib/blob/master/Pokitto/POKITTO_LIBS/FileIO/FileIO_HW.cpp#L21
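One way to get that compiler error today is to wrap the status code in a scoped enum, which has no implicit conversion to bool. This is only a sketch: `fakeFileOpen` is a stand-in I made up that mimics the 0-for-success, 1-for-failure convention described above, and `OpenResult` and `openChecked` are hypothetical names, not part of the PokittoLib.

```cpp
#include <cstdint>

// Hypothetical stand-in for the library call: 0 = success, 1 = failure.
inline std::uint8_t fakeFileOpen(const char* path) {
    return (path != nullptr && path[0] != '\0') ? 0 : 1;
}

// A scoped enum cannot be used directly as an `if` condition,
// so the caller has to spell out which outcome is being tested.
enum class OpenResult : std::uint8_t { Ok, Failed };

inline OpenResult openChecked(const char* path) {
    return fakeFileOpen(path) == 0 ? OpenResult::Ok : OpenResult::Failed;
}
```

With this, `if (openChecked("video.bin"))` fails to compile, while `if (openChecked("video.bin") == OpenResult::Ok)` reads the way it behaves.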

more reason for that big rewrite effort

so how is that video working? is it like interlaced or compressed in some way? or just full bitmaps?

Uncompressed bitmap data using a single 8-bit palette. I’m quite amazed that the SD read and LCD display are fast enough. (Explanation link.)

I hadn’t noticed before, but:

What’s wrong with doing this?

bool stillGoing = true;
while(stillGoing){

Definitely.

At one point I’d written a nice object-oriented wrapper for some of the file functions, but I lost it somewhere (and may have accidentally deleted it).
I was aiming to design it so you didn’t have to remember to call close, because the object’s destructor would call it when the object fell out of scope.
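Roughly, the shape of such a wrapper would be something like the sketch below. The `fakefs` functions are placeholders I invented so the example is self-contained; they are not the real file functions, and `ScopedFile` is not the lost wrapper itself.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the file functions: open returns 0 on success.
namespace fakefs {
    int openFiles = 0;
    inline std::uint8_t open(const char* path) {
        if (path == nullptr) return 1;
        ++openFiles;
        return 0;
    }
    inline void close() { --openFiles; }
}

// Opens in the constructor and closes in the destructor, so the file is
// released whenever the object goes out of scope, including on early return.
class ScopedFile {
public:
    explicit ScopedFile(const char* path) : open_(fakefs::open(path) == 0) {}
    ~ScopedFile() { if (open_) fakefs::close(); }

    // Non-copyable: two owners would close the same file twice.
    ScopedFile(const ScopedFile&) = delete;
    ScopedFile& operator=(const ScopedFile&) = delete;

    bool isOpen() const { return open_; }

private:
    bool open_;
};
```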

If it’s only supposed to play one way, I’d hope it was at least using delta compression.
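To be clear about what I mean by delta compression, here is about the simplest possible form, an XOR against the previous frame. It is a sketch that assumes both frames are the same size, and it has nothing to do with how the video demo actually stores its data; the function names are made up.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pixels that did not change between frames come out as 0.
// Assumes previous.size() == current.size().
inline std::vector<std::uint8_t> encodeDelta(const std::vector<std::uint8_t>& previous,
                                             const std::vector<std::uint8_t>& current) {
    std::vector<std::uint8_t> delta(current.size());
    for (std::size_t i = 0; i < current.size(); ++i) {
        delta[i] = static_cast<std::uint8_t>(current[i] ^ previous[i]);
    }
    return delta;
}

// Turns the previous frame into the next one in place.
// XOR is its own inverse, which is also why this only plays forwards cheaply.
inline void applyDelta(std::vector<std::uint8_t>& frame,
                       const std::vector<std::uint8_t>& delta) {
    for (std::size_t i = 0; i < frame.size(); ++i) {
        frame[i] ^= delta[i];
    }
}
```

On its own this doesn't shrink anything. The gain comes from the long runs of zeros, which a later run-length step could pack down.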

Nothing I guess, just how I wrote it.

On a related note, is there any example of a Pokitto program playing a sound sample from data in an array?

.wav playing demo from the lib examples:

If ROM files are read-only, would it not be possible to leave them on the SD and stream the data from there? Or would seeking slow things down too much?

It would be really slow, and getting it to full speed is proving challenging enough. Besides, doing it this way I can use defines to optimize for the specific mapper in the cartridge. Also, the Pokitto loader already acts as a menu for picking a game, so putting another one in the emulator would be redundant.

@spinal: I think I found a way to get rid of the noise without wasting cycles. Try this:

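// Pulse the LCD WR line: write the WR bit to one register, run OP, then write it to the other.
// OP takes the place of the nop padding, so the wait between the two writes does useful work.
// (0xA0002284 / 0xA0002204 look like the GPIO CLR / SET registers for the WR port.)
#define TGL_WR(OP) \
  *reinterpret_cast< volatile uint32_t *>(0xA0002284) = 1 << LCD_WR_PIN; \
  OP; \
  *reinterpret_cast< volatile uint32_t *>(0xA0002204) = 1 << LCD_WR_PIN;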

 void Pokitto::lcdRefreshMode13(uint8_t * scrbuf, uint16_t* paletteptr, uint8_t offset){
   uint16_t x,y;
   uint32_t scanline[110]; // read two nibbles = pixels at a time
   uint8_t *d;
   uint32_t *s;

   write_command_16(0x03); write_data_16(0x1038);
   write_command(0x20); write_data(0);
   write_command(0x21); write_data(0);
   write_command(0x22);
   CLR_CS_SET_CD_RD_WR;
   SET_MASK_P2;

   volatile uint32_t *LCD = reinterpret_cast< volatile uint32_t * >(0xA0002188);

   d = scrbuf;// point to beginning of line in data
   for(y=0;y<88;y++){

     s = scanline;

     for(x=0;x<110;x+=10){
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
       *LCD = *s = paletteptr[(*d + offset)&255]<<3; TGL_WR(s++);TGL_WR(d++);	
     }

     s = scanline;
     uint32_t c = *s;
     for(x=0;x<110;x+=10){
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
       *LCD = c; TGL_WR(s++);TGL_WR(c=*s);
     }
     
   }
   
 }

Still a line of dots at the very top and across the middle.

mode13.bin (37.0 KB)

What’s in “My_settings.h”?

Forgot to mention that the 2D scroller is named “Examples\Scroll2Layers”. Jonne has merged that into PokittoLib.
