Display bugs?


#1

The circumstances that brought these things to my attention might be my fault (bugs in the emulator and/or test cases and/or me being a noob and/or low on caffeine), so I’m not sure if they really are bugs in the library.


PROJ_TILEDMODE 1 doesn’t define POK_BITFRAME

I had to add #define POK_BITFRAME 0 to My_settings.h for SoftVMS to compile.
It wasn’t necessary before?


Pokitto::lcdRectangle(1,1,5,5) doesn’t draw a square

Expected: Either a 4x4 or 5x5 square, not a 5x4 rectangle.
Is it really supposed to have an exclusive upper bound for Y, but inclusive for X?

void Pokitto::lcdRectangle(int16_t x0, int16_t y0, int16_t x1, int16_t y1, uint16_t color) {
    	int16_t temp;
	if (x0>x1) {temp=x0;x0=x1;x1=temp;}
	if (y0>y1) {temp=y0;y0=y1;y1=temp;}
	if (x0 > POK_LCD_W) return;
	if (y0 > POK_LCD_H) return;
    if (x1 > POK_LCD_W) x1=POK_LCD_W;
	if (y1 > POK_LCD_H) y1=POK_LCD_H;
	if (x0 < 0) x0=0;
	if (y0 < 0) y0=0;

	int32_t max=(x1-x0)*(y1-y0);
	
	if( max ){
		setWindow( y0, x0, y1-1, x1-1 );
	
	    write_command(0x22);    
	    setup_data_16(color); // setup the data (flat color = no change between pixels)
	    CLR_CS_SET_CD_RD_WR; // go to vram write mode
	    while(max--){
	        CLR_WR;SET_WR; //CLR_WR;SET_WR;//toggle writeline, pokitto screen writes a column up to down
	    }
	    
		setWindow(0, 0, 175, 219);
	}
}

drawRow with x0<0 and x1>width draws nothing.

Expected: draw the visible portion of the line. drawColumn has the same issue.
Not sure if bounds should be exclusive/inclusive, suggested implementation might need adjustment.

void Display::drawRow(int16_t x0, int16_t x1, int16_t y){
    if ((uint16_t)y>=height) return; //completely out of bounds
    if (x0>x1) {
            int x=x0;
            x0=x1;
            x1=x; // swap around so that x0 is less than x1
    }
    if (x0<0) x0=0;
    if (x1>=width ) x1=width-1;
    if (x0>x1) return; //completely out of bounds

    for (int x=x0; x <= x1; x++) {
        drawPixel(x,y);
    }
}


I also find it odd that there’s int16_t everywhere. Is there any reason for this?
ARM can only pass 32-bits around, so it has to add extra instructions to sign-extend / truncate and I can’t think of any advantage to using smaller types on it. It just results in marginally larger/slower code.

edit: fix typo


#2

I have noticed something similar for the function Display::fillRectangle(int16_t x0,int16_t y0, int16_t w, int16_t h): there is an inclusive upperbound for Y and an exclusive upperbound for X


#3

#4

I hadn’t spotted the W instead of H typo. Looks like your PR is unrelated to the issue I reported.


#5

Merged just now. Thanks!


#6

I’ll look into the rest later if I get chance or nobody beats me to it.

I agree that some of the types are a bit of an odd choice.
It also strikes me that std::swap could be taken advantage of, as well as std::min and std::max.

e.g.

void Display::drawRow(int32_t x0, int32_t x1, int32_t y)
{
	if (y < 0 || y >= height)
		return; //completely out of bounds
		
	if (x0 > x1)
		std::swap(x0, x1);
		
	std::max(x0, 0);
	std::min(x1, width - 1);
	
	if (x0 > x1)
		return; //completely out of bounds

	for (int32_t x = x0; x <= x1; ++x)
		drawPixel(x,y);
}

I’m half and half about whether int or int32_t is the better option here, but I’m convinced that only one should be chosen.


#7

I only just realised the code you posted was suggested fixes rather than what the code already looked like.

That explains why I wasn’t seeing the bugs you were describing when I looked at the code (which in turn lead me to discovering that other bug).

Have they both been tested on actual hardware?


I can see at least one possible bug in lcdRectangle.
If someone did this:

lcdRectangle(0, -5, 5, -1, 0);

The if(y0>y1) causes -1 and -5 to trade places, making y0 = -1 and y1 = -5.
if (y0 > POK_LCD_H) return; fails.
if (y1 > POK_LCD_H) y1=POK_LCD_H; fails.
if (y0 < 0) y0=0; sets y0 = 0`.

Then the code hits int32_t max=(x1-x0)*(y1-y0)
x1 is 5, x0 is 0, so x1 - x0 is 5
y1 is -5, y0 is 0, so y1 - y0 is -5
Then of course 5 * -5 is -25, so max is -25.

Then if( max ) would evaluate to true since if( max ) is equivalent to if( max != 0 ).
And then while(max--) would run for 4294967271 iterations, which suffice to say is a bad thing.

I think the simplest solution would be to make if( max ) become if (max > 0).

Personally though I’d rather exit early with if (y1 < 0 || x1 < 0) return; after the swap and then change the latter part to use unsigned numbers, like so:

void Pokitto::lcdRectangle(int16_t x0, int16_t y0, int16_t x1, int16_t y1, uint16_t color) {
	int16_t temp;
	if (x0 > x1) { temp = x0; x0 = x1; x1 = temp; }
	if (y0 > y1) { temp = y0; y0 = y1; y1 = temp; }
	
	if (y1 < 0 || x1 < 0) return;
	if (x0 > POK_LCD_W) return;
	if (y0 > POK_LCD_H) return;
	
	if (x0 < 0) x0 = 0;
	if (y0 < 0) y0 = 0;
	if (x1 > POK_LCD_W) x1 = POK_LCD_W;
	if (y1 > POK_LCD_H) y1 = POK_LCD_H;

	uint8_t ux0 = static_cast<uint8_t>(x0);
	uint8_t ux1 = static_cast<uint8_t>(x1);
	uint8_t uy0 = static_cast<uint8_t>(y0);
	uint8_t uy1 = static_cast<uint8_t>(y1);
	
	uint32_t max = (ux1 - ux0) * (uy1 - uy0);
	
	if (max > 0) {
		setWindow(uy0, ux0, uy1 - 1, ux1 - 1);
	
		write_command(0x22);    
		setup_data_16(color); // setup the data (flat color = no change between pixels)
		CLR_CS_SET_CD_RD_WR; // go to vram write mode
		for(uint32_t i = max; i > 0; --i) {
			CLR_WR;SET_WR; //toggle writeline, pokitto screen writes a column up to down
		}
		
		setWindow(0, 0, POK_LCD_W - 1, POK_LCD_H - 1);
	}
}

Of course there’s still the slight issue that this technically isn’t necessarily setting setWindow to what it previously was, but that’s a problem with other code.

Also this would have to be changed if the Pokitto ever got a significantly larger screen.


#8

Sorry, I should’ve explained better: These are things I’ve stumbled across while looking for bugs in the emulator and I haven’t tested on hardware yet. The loop was based on the one in the clear screen function, so there shouldn’t be a problem.
I was mostly worried that the upper bounds were like that intentionally and that I’d be breaking something else, so the suggestions are along the lines of “shouldn’t it be something like this?” instead of a proper PR.

Why the cast to uint8? Personally I’d switch everything to int32.


#9

In that case yes, we’d need @jonne to confirm that.

Because setWindow expects uint8_t so there has to be a narrowing conversion anyway.
Earlier code restricts the values to be in the uint8_t range, so it should be safe.

Though the max line might need some extra casts to be safe,
I sometimes forget what the promotion rules are.

```
uint32_t max = static_cast<uint32_t>(ux1 - ux0) * static_cast<uint32_t>(uy1 - uy0);

Edit: I double checked and [it seems this isn't necessary](https://stackoverflow.com/a/6770275).
Arithmetic operators cause 'usual arithmetic conversions' which mean anything smaller than `int` is converted to `int` for calculation purposes.
(Evidently I've spent too long using an 8-bit AVR processor. `:P`)

<s>`uint32_t` may work better, I'd have to double check.</s>
Edit: I checked. The code produced is the same size exactly so the compiler doesn't care, it most likely optimises to the best case either way.

I haven't actually got `lcdRectangle` working on hardware either way yet.
I believe there need to be a few things in place before the lcd- functions work

[quote="FManga, post:8, topic:1020"]
Personally I'd switch everything to int32.
[/quote]

Depends whether we're allowed to break the public facing API or not.

I'd certainly err on the side of `uint32_t` in the latter parts either way because I find unsigned operations sometimes produce smaller code (e.g. `> 0` only has to check the zero bit instead of the zero bit and the sign bit), and they're certainly easier to contemplate because they have less rules to worry about (like the `5 * -5` case).

Really I'd prefer an `x,y,w,h` function to an `x0,y0,x1,y1` function.
It would even be possible to forward the latter to the former.

#10

Letting the call do the narrowing sounds safer, to me. If the function signature is ever changed to int32, there would be no surprises.
On the other hand, I’ve been writing/reading too much javascript and assembly lately and the weird mix is affecting my C++ (the bool is a lie!)

I’d be surprised if sign/unsigned checks makes a size difference on ARM.
One is a CMP+BGT and the other is CMP+BHI.


#11

Perhaps, but only if it changes.
I’m used to public APIs being fairly set in stone, or overloads being offered rather than deprecating overloads that still work, so I’d rather wait until the verdict is out on which practice the Pokitto lib opts for.

I really have spent too long working with AVR chips.
I’m not used to having all these luxuries.


#12

No, other than the fact that the ancestor of PokittoLib ran on an atmega328

You’re 100% right about the 32-bitness

Fortunately, moving from int/uint16 to 32 should be fairly easy to test/fix.


#13

Ah, I didn’t know about that. Makes sense now. :slight_smile:


#14

One possible advantage came to mind when using 16-bit parameters or other variables. The optimizer could use a 32-bit register to store two 16-bit values.(or 4 8-bit values) and avoid using ram . Without a data cache in the processor it is quite costly to read/write to ram. Thought, I do not have any evidence that optimizer actually would do it…


#15

I’m not sure how well that would work with the ARM ISA.

It would depend if it has a means of extracting the high and low 2-bytes of a word without using shifting and masking.

(I’m fairly certain most of its operations only affect the full 32 bits, but I haven’t looked into it in depth so I could be wrong.)

Either way, seems like a very specific compiler optimisation, so I expect it isn’t really a ‘done thing’ and you’d have to use inline assembly to achieve it.


#16

Doing this would break the ABI, which I haven’t seen the compiler do (not saying it doesn’t happen, I haven’t payed attention).

This doesn’t seem to be true for the Cortex M0+ line. The CPU and SRAM run at the same clock speed, so reads/writes only take a cycle (ish), iirc. Besides, if there are few parameters, the ABI passes them all in registers, not in RAM.

Off the top of my head, a PUSH/POP only takes 1 cycle per register in the list, so it doesn’t really make that much of a difference (this is a really nice feature of the ARM ISA, a single PUSH/POP for all the registers at once).
Also, the Cortex M0+ only has Thumb2, not the full ARM IS… so no PKHBT (PacK Halfword Bottom+Top). At first glance, packing (UXTH, UXTH, LSL, ADD) + unpacking (AND, LSR) sounds like it’d be far more costly than simply PUSH/POP.


#17

Cool. So we basically have 32 kb of data cache, but that is also all ram we have ;-)[quote=“FManga, post:16, topic:1020”]
Also, the Cortex M0+ only has Thumb2, not the full ARM IS… so no PKHBT (PacK Halfword Bottom+Top). At first glance, packing (UXTH, UXTH, LSL, ADD) + unpacking (AND, LSR) sounds like it’d be far more costly than simply PUSH/POP.
[/quote]

Ok, looks like there would be no gain using smaller than native data types.

I just got an impression in my tests that using the scanline buffer uint16_t scanline[2][88] is faster than using uint32_t scanline[2][88] in the function: Pokitto::lcdRefreshMode2(). I have to recheck that and look in the assembly what is causing the performance difference.


#18

Odd… is the code online somewhere?


#19

Not yet, but I can send it to you today.