Scroller in 3D

Hanski · June 26, 2018, 9:32pm

Turning the scroller code to 3D at about 39 fps.

jonne · June 26, 2018, 10:01pm

Cool stuff!!!

Pharap · June 26, 2018, 11:19pm

Floats or fixed points?

spinal · June 27, 2018, 6:38am

Now just add that rotation code and you’ve got mode7

Hanski · June 27, 2018, 6:54am

Exactly

Direct drawing would be faster, but I wanted to use buffered mode so that I can mix it with the normal bitmaps and drawing functions.

spinal · June 27, 2018, 7:10am

If it can be kept at or above 25fps I think it will be fine for games.

Hanski · June 27, 2018, 8:18am

I am using the fixed-point package in /PokittoLib/Pokitto/POKITTO_LIBS/FixMath/

adekto · June 27, 2018, 8:39am

so f-zero clone or floors for raycasted first person shooter?
would love to have this for rpg overworld would not even need rotation for that one

Pharap · June 27, 2018, 4:25pm

Damn, I was hoping someone was using my FixedPoints library.

I’m guessing you need some of the trigonometry or square root functions though.

Hanski · June 27, 2018, 6:09pm

I could use it if it contained trigonometric functions, which I will need once I try rotations. For now I am still optimizing the current demo.

Is your implementation optimized for speed?

Pharap · June 27, 2018, 6:33pm

I’ve never tested the speed; it’s more focused on genericity and ease of use.

It should be reasonably fast though. Most operations are constexpr, so if they can be done at compile time they will be, for example the conversion of a floating-point literal (e.g. SQ15x16 a = 0.25;).

Given that FixMath is oriented towards a specific size I would think it might be faster,
but looking at the code there seems to be a lot more to each operation than I have in FixedPoints,
so I’d have no idea without testing.

HomineLudens · June 27, 2018, 8:51pm

How does it work? Is it handy?

spinal · June 27, 2018, 9:04pm

Just needs a serial link cable hat and you’re half way to a mario kart game!

Pharap · June 27, 2018, 9:49pm

FixMath is written in C so it’s not very pretty/easy to use (e.g. you have to do fix16 c = fix16_add(a, b) instead of being able to do fix16 c = a + b).
Compared to most C libraries though, it’s quite well commented and fairly neatly written.
The code can be seen here.

Hanski · June 28, 2018, 7:45pm

With the same kind of inner loop optimization as in the 2D version, I now have 50 fps.

Hanski · June 28, 2018, 7:48pm

Next I will try to implement rotation.

Hanski · July 5, 2018, 11:09am

Rotation finally works with the perspective scaling, but it is still too slow.

However, I just realized that even though integer addition and multiplication are equally fast on the LPC11U68 chip (1 cycle), multiplication of fixed-point numbers is significantly slower. For that I have to multiply two 32-bit numbers, store the 64-bit result, and scale (shift) it back to a correctly rounded 32-bit number (with even more cycles for the “overflow detection”). That is why I have to get rid of fixed-point multiplications in the inner loop as much as possible.
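To make the cost described above concrete, here is a minimal sketch of a Q16.16 multiply. It is not FixMath's fix16_mul(); the names q16_t, Q16_ONE and q16_mul are made up for illustration, and it assumes the compiler implements right shift of a negative value as an arithmetic shift (GCC does).

```c
#include <stdint.h>

/* Hypothetical Q16.16 type for this sketch: 16 integer bits, 16 fraction bits. */
typedef int32_t q16_t;
#define Q16_ONE ((q16_t)1 << 16)

/* Multiply two Q16.16 values through a 64-bit intermediate.
 * Steps: widen, multiply, add half an LSB for rounding, shift back.
 * No overflow detection is done here. */
static inline q16_t q16_mul(q16_t a, q16_t b)
{
    int64_t wide = (int64_t)a * (int64_t)b;  /* Q32.32 product */
    wide += (int64_t)1 << 15;                /* round half up */
    return (q16_t)(wide >> 16);              /* assumes arithmetic right shift */
}

/* Addition needs none of that: it is a plain integer add. */
static inline q16_t q16_add(q16_t a, q16_t b)
{
    return a + b;
}
```

On a 32-bit core without a 32x32 to 64-bit multiply instruction, the widening multiply is typically expanded into several instructions or a helper call, which is where the extra cycles come from compared with q16_add.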

FManga · July 5, 2018, 11:37am

Are you sure GCC is actually emitting a MUL? I’m under the impression that it doesn’t know that MULs are just one cycle and avoids them.
Also, try removing the lower word using a cast or a union+struct, instead of a shift:

union {
    uint64_t w;
    struct {
        uint32_t l, h;
    };
} u;
u.w = ...
return u.h;

I’ve seen GCC being silly when shifts are larger than the machine’s word size.
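A hedged sketch of how the word-splitting idea could look for a 16.16 format. Returning only the high word corresponds to a shift by 32, whereas a Q16.16 product needs a shift by 16, so this version combines both halves with 32-bit shifts. The function name q16_mul_split is hypothetical, the struct member is named so the code stays valid C99, and the field order assumes a little-endian target. The result is truncated, not rounded, and the final cast relies on GCC's wrapping conversion from uint32_t to int32_t.

```c
#include <stdint.h>

/* Hypothetical helper, not part of FixMath: Q16.16 multiply that reads the
 * 64-bit product as two 32-bit words instead of shifting a 64-bit value. */
static inline int32_t q16_mul_split(int32_t a, int32_t b)
{
    union {
        uint64_t w;
        struct {
            uint32_t l, h;   /* low word first: little-endian layout assumed */
        } part;
    } u;

    u.w = (uint64_t)((int64_t)a * (int64_t)b);

    /* Bits 16..47 of the product form the Q16.16 result (truncated). */
    return (int32_t)((u.part.h << 16) | (u.part.l >> 16));
}
```

Whether this beats a plain 64-bit shift depends on the compiler and flags, so it is worth comparing the generated assembly for both variants.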

Hanski · July 5, 2018, 11:49am

I am using this (fix16_mul()):

github.com/pokitto/PokittoLib/blob/master/Pokitto/POKITTO_LIBS/FixMath/fix16.c

#include "fix16.h"
#include "int64.h"


/* Subtraction and addition with overflow detection.
 * The versions without overflow detection are inlined in the header.
 */
#ifndef FIXMATH_NO_OVERFLOW
fix16_t fix16_add(fix16_t a, fix16_t b)
{
  // Use unsigned integers because overflow with signed integers is
  // an undefined operation (http://www.airs.com/blog/archives/120).
  uint32_t _a = a, _b = b;
  uint32_t sum = _a + _b;

  // Overflow can only happen if sign of a == sign of b, and then
  // it causes sign of sum != sign of a.
  if (!((_a ^ _b) & 0x80000000) && ((_a ^ sum) & 0x80000000))
    return fix16_overflow;

This file has been truncated.

It also does overflow checks and rounding.

I have not looked at assembly at this point. I think I can just use adds in the inner loop (scanline). I might need to check the assembly later when making upper loop optimizations. Thanks for the tip!
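The "just use adds in the inner loop" idea can be sketched as follows. This is an illustration under assumptions, not the demo's actual code: it supposes that along one scanline the texture coordinate changes by a constant step per pixel, and the names q16_t, scanline_mul and scanline_add are hypothetical. The first function pays one fixed-point multiply per pixel; the second computes the same values with one addition per pixel.

```c
#include <stddef.h>
#include <stdint.h>

typedef int32_t q16_t;  /* hypothetical Q16.16 type for this sketch */

/* Truncating Q16.16 multiply; assumes arithmetic right shift. */
static inline q16_t q16_mul(q16_t a, q16_t b)
{
    return (q16_t)(((int64_t)a * (int64_t)b) >> 16);
}

/* Per-pixel version: one fixed-point multiply for every pixel. */
static inline void scanline_mul(q16_t u0, q16_t step, uint8_t *out, size_t width)
{
    for (size_t x = 0; x < width; ++x) {
        q16_t u = u0 + q16_mul((q16_t)x << 16, step);
        out[x] = (uint8_t)(u >> 16);  /* integer part as texel index, wraps at 256 */
    }
}

/* Incremental version: the inner loop only shifts and adds. */
static inline void scanline_add(q16_t u0, q16_t step, uint8_t *out, size_t width)
{
    q16_t u = u0;
    for (size_t x = 0; x < width; ++x) {
        out[x] = (uint8_t)(u >> 16);
        u += step;                    /* replaces the multiply */
    }
}
```

With perspective scaling, the start value and the step would still need to be computed once per scanline in the outer loop; only the per-pixel work is reduced to additions.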

Pharap · July 5, 2018, 12:47pm

The overflow checks themselves don’t seem too bad, but branching is almost guaranteed to cause some slowdown so I’d recommend removing that.

The library allows you to define FIXMATH_NO_OVERFLOW and FIXMATH_NO_ROUNDING to disable some of the unnecessary stuff, but it misses the parts of the addition and subtraction that could also be scrapped.
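To illustrate the branching point, here is a small sketch contrasting a checked add, written after the pattern in the fix16.c excerpt quoted earlier in the thread, with an unchecked one. The names q16_t, Q16_OVERFLOW, q16_add_checked and q16_add_wrap are hypothetical, the sentinel value is an assumption, and the casts back to a signed type rely on GCC's wrapping conversion.

```c
#include <stdint.h>

typedef int32_t q16_t;            /* hypothetical Q16.16 type for this sketch */
#define Q16_OVERFLOW INT32_MIN    /* assumed sentinel returned on overflow */

/* Checked add: overflow can only occur when both operands share a sign
 * and the sum's sign differs from theirs. Costs a test and a branch. */
static inline q16_t q16_add_checked(q16_t a, q16_t b)
{
    uint32_t ua = (uint32_t)a, ub = (uint32_t)b;
    uint32_t sum = ua + ub;
    if (!((ua ^ ub) & 0x80000000u) && ((ua ^ sum) & 0x80000000u))
        return Q16_OVERFLOW;
    return (q16_t)sum;
}

/* Unchecked add: wraps silently on overflow, no branch at all. */
static inline q16_t q16_add_wrap(q16_t a, q16_t b)
{
    return (q16_t)((uint32_t)a + (uint32_t)b);
}
```

The unchecked form is only safe where the value range is known in advance, for example texture coordinates that are expected to wrap anyway.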