# Directmode Test

#22

Hi, @Pharap. When I started this thread, I only wanted to do some simple tests, but then I realized I could go a bit further and test what could be done without relying on a full framebuffer. IMHO they are still very basic tests, maybe code snippets at most.
I don’t feel comfortable licensing the code while it still includes some of the “placeholder” images (those found in the HAM8 loader and the rotozoomer). I just needed some colourful pictures for a quick-and-dirty Proof of Concept. Has somebody got some good replacements for these pics?
Btw, is your profile pic licensed? I would like to use it in another test, if you don’t mind. Maybe it could also be used in a short spooky game…
About Math.abs(): you’re right, I suppose I tried it without naming the class it belongs to (Math). I’m not used to typing that much to code simple things. I wrote some helper functions at the end. Most of them are never used and some of them may contain bugs.

#23

Next: Moiré (interference patterns). Wikipedia: Moiré Pattern
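For readers who have not met the effect, here is a minimal sketch of one common way to get a moiré pattern: XOR two sets of concentric rings whose centres move independently. It is plain desktop Java, not the code from the attached demo, and the names `MoireSketch`, `ring` and `pixel` are made up for illustration.

```java
// Sketch only: a common moiré construction, not the demo's actual code.
final class MoireSketch {
    // Returns 1 inside odd-numbered rings around (cx, cy) and 0 inside even ones.
    static int ring(int x, int y, int cx, int cy, int ringWidth) {
        int dx = x - cx;
        int dy = y - cy;
        int dist = (int) Math.sqrt(dx * dx + dy * dy);
        return (dist / ringWidth) & 1;
    }

    // XOR of two ring sets; moving the two centres each frame animates the interference.
    static int pixel(int x, int y, int cx1, int cy1, int cx2, int cy2, int ringWidth) {
        return ring(x, y, cx1, cy1, ringWidth) ^ ring(x, y, cx2, cy2, ringWidth);
    }

    public static void main(String[] args) {
        // Print a small ASCII preview with centres at (10, 6) and (28, 9).
        for (int y = 0; y < 16; y++) {
            StringBuilder row = new StringBuilder();
            for (int x = 0; x < 40; x++) {
                row.append(pixel(x, y, 10, 6, 28, 9, 3) == 1 ? '#' : '.');
            }
            System.out.println(row);
        }
    }
}
```

On a real device the square root would normally come from a precomputed distance table rather than being evaluated per pixel.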

Directmode Moire.bin (167.1 KB)

Directmode Moire.zip (468.0 KB)

#24

GitHub’s ‘choose a licence’ website is quite a good starting place.

They’ve got an article about what not having a licence means and why having a licence is important:

As well as a list of commonly used licences, and a more complete list of licences.
Along with the licence text they’ve got the key implications of the licences bullet pointed.

Speaking of GitHub, if you haven’t already joined GitHub (or maybe GitLab) I recommend considering joining one of them (or both if you really want to).

I don’t know about GitLab but GitHub makes it easy to specify a licence for your code.

I prefer Apache 2.0, but I also like MIT and BSD 3-clause.
I have no strong feelings about CC0 but wouldn’t really recommend it as such.
I would recommend against the GPL.

The most commonly used ones for games on consoles like the Pokitto are:

• MIT
• BSD 3-clause
• Apache 2.0
• CC0
• GNU GPL 3.0

A simple (and somewhat rough) summary of what each licence allows and requires:

• MIT
  • Requires the user to retain the licence info and any copyright notices
  • Does not require the user to do anything else (which includes disclosing their source code)
• BSD 3-clause
  • Pretty much the same as MIT, but explicitly requires that the name of the copyright holder and any contributors to the original code must not be used to endorse derivatives without prior written permission. (This is probably implied by most licences anyway, but BSD-3 makes it explicit.)
• Apache 2.0
  • Pretty much the same as MIT, but it requires that any modified versions of the software include prominent notices stating that the code was modified and by whom. (Specifically “You must cause any modified files to carry prominent notices stating that You changed the files” - so it’s a fairly loose restriction.)
• CC0
  • As close to the public domain as legally possible
  • Doesn’t even require the user to give credit
• GNU GPL 3.0
  • Requires the user to retain the licence info and any copyright notices
  • Requires the user to provide notices stating that they have modified the code including a date for when the code was modified. (Specifically “The work must carry prominent notices stating that you modified it, and giving a relevant date”.)
  • Requires that any modified versions of the code use the same licence
  • Requires that if the program is published anywhere then the source code must be publicly accessible (in one or more ways specified by the licence)

(‘User’ in this sense means someone wanting to copy or modify the code.)

All of them waive any warranty or liability - that’s standard for pretty much any licence (unless you’re actually providing warranties, but those are usually sold by companies).

Personally I almost always go for either Apache 2.0 or MIT, depending on my intent.

Usually I prefer Apache 2.0 because I want anyone who modifies the code to leave a notice stating that they modified it.
I like this partly because it helps differentiate between an unchanged copy of the original code and a copy that someone has modified, but also because it encourages people to take credit for their modifications.

Personally I tend to only use MIT when I don’t care too much about what happens to some code afterwards, which is quite rare. Other people just use MIT all the time because it’s the most hassle-free licence that still gives them credit.
BSD 3-clause is mostly equivalent to MIT, so it’s pretty much the same situation.

CC0 I don’t have any particularly strong feelings for or against. I’ll happily use CC0 code written by other people if I have the need to do so, but I’ve never used it myself. Partly because I don’t like the idea of giving up credit, partly because I don’t like that it was written as a ‘general’ licence rather than being software specific.

GPL v3 I avoid whenever possible. I dislike its ‘viral’ nature - if you use even one tiny function from a piece of GPL code, you have to release your whole program under the GPL.
(Technically called ‘copyleft’.)

None of the other four licences I’ve discussed have that restriction. For example, you could take a single function out of some MIT code, dump it somewhere within your code and as long as you had the right copyright notices in place and it was clear that function was licenced as MIT, you would be free to licence the rest of your code however you pleased (even closed source if you really wanted to).

Some people like that copyleft aspect precisely because of its viral nature - they want to spread ‘free software’ as much as possible. I don’t like that - I want to give my users the freedom to make their own judgements about how they distribute their code, I don’t want to make that decision for them.

Tangents happen.

We probably should have one big ‘choosing a licence’ thread,
but I’d be worried it might derail into arguments (or at least really long debates over which licences are superior).

You could have a quick look over at Open Game Art to see if you can find a suitable replacement:
https://opengameart.org/

As far as I’m aware all the art there has specific licences.
(Art licences, not code licences, but they’re mostly all Creative Commons licences, which are really easy to understand.)

The majority are either CC0 or CC-BY.
CC0 is basically “do what you like”/“public domain”,
and CC-BY is “do what you like, but you must credit the original author”.

I’m afraid it’s one thing I’d like to keep copyright on since I don’t want copycats popping up and causing trouble.

(I’m not implying you would, but if I licenced it in a way that it could be publicly used then I wouldn’t have a leg to stand on if someone else started using it as their icon.)

I could offer you some other sprites if you’ve got something specific in mind though.

Though if you specifically want a skull, there’s a few decent ones available on that site I linked to before:

(CC-BY-SA)

(CC0)

Though I suppose it depends what kind of size sprite you want.
A lot of the sprites I’ve made (though I’m not much of an artist) are either 8x8, 16x16 or 32x32.
I’ve got a few generic skulls lying around.

Technically speaking the abs contains a bug, but the chance of it actually affecting anything is vanishingly small.
Basically if you put -0.0 in you’ll get -0.0 out,
and the output of abs is of course never supposed to be negative.

But 99% of the time that won’t matter because -0.0 compares equal to 0.0, adding any non-zero value to a zero ignores the sign of the zero, and multiplying by either zero still gives you a zero.
The only time I can think that would matter is if you tried to copy the sign (or divide by the result).
Regular Java does provide that facility (through Math.copySign),
as does C++ (through std::copysign).
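To make the signed-zero point concrete, here is a small sketch in standard desktop Java (not the Pokitto Java subset). `naiveAbs` is a hypothetical comparison-based helper written for this illustration; it is an assumption that the helper in the demo sources behaves the same way.

```java
// Sketch only: naiveAbs is a hypothetical stand-in, not the helper from the demo.
final class SignedZeroDemo {
    // -0.0 < 0 is false, so -0.0 falls through and is returned unchanged.
    static double naiveAbs(double x) {
        return x < 0 ? -x : x;
    }

    // True when the IEEE 754 sign bit is set, which also detects -0.0.
    static boolean hasSignBit(double x) {
        return (Double.doubleToRawLongBits(x) >>> 63) != 0;
    }

    public static void main(String[] args) {
        double z = naiveAbs(-0.0);
        System.out.println("sign bit after naiveAbs: " + hasSignBit(z));
        System.out.println("sign bit after Math.abs: " + hasSignBit(Math.abs(-0.0)));
        System.out.println("equal to +0.0:           " + (z == 0.0));
        System.out.println("1.0 / result:            " + (1.0 / z));
    }
}
```

The result still compares equal to 0.0, so ordinary arithmetic and comparisons are unaffected; the sign only shows up when something inspects the sign bit or divides by the value.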

Blimey, what language do you normally use?

Not that Java’s particularly terse, but that’s far from what I’d consider a lot of typing.

#25

#26

Next: 2d Metaballs Wikipedia: Metaballs. This is one of my all-time favourite old-school demo effects
The color gradients were picked in a hurry, so the effect doesn’t look as good as it could.
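For anyone unfamiliar with the effect, this is a minimal sketch of the classic 2D metaball field: each ball contributes r²/d² at a pixel, and the pixel counts as "inside" when the sum reaches a threshold. It is plain desktop Java, not the code in the attached binary, and `MetaballField`, `fieldAt` and `inside` are illustrative names.

```java
// Sketch only: the textbook metaball field, not the demo's implementation.
final class MetaballField {
    // Sum of r^2 / d^2 over all balls at pixel (px, py).
    // Each ball is {centreX, centreY, radius}.
    static float fieldAt(float px, float py, float[][] balls) {
        float sum = 0f;
        for (float[] b : balls) {
            float dx = px - b[0];
            float dy = py - b[1];
            float d2 = dx * dx + dy * dy;
            if (d2 == 0f) {
                return Float.MAX_VALUE; // exactly on a centre: treat as saturated
            }
            sum += (b[2] * b[2]) / d2;
        }
        return sum;
    }

    // A pixel belongs to the blob when the summed field reaches 1.
    static boolean inside(float px, float py, float[][] balls) {
        return fieldAt(px, py, balls) >= 1f;
    }

    public static void main(String[] args) {
        float[][] balls = { {12f, 6f, 4f}, {22f, 6f, 4f} };
        for (int y = 0; y < 12; y++) {
            StringBuilder row = new StringBuilder();
            for (int x = 0; x < 34; x++) {
                row.append(inside(x, y, balls) ? '#' : '.');
            }
            System.out.println(row);
        }
    }
}
```

In a gradient version the field value would index a colour ramp instead of being thresholded, and the per-ball term is a natural candidate for a lookup table.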

Directmode Metaballs.bin (172.7 KB)

@FManga: is there an easy way to disable out-of-bounds checking for arrays? I’m afraid it’s killing the framerate. It’s good for debugging but not for high performance release.

#27

Your demos are really amazing! The only problem is that the fps is a bit low.

#28

I don’t remember if it’s in the current release or still in github only, but compiling in release mode (Ctrl+B or Ctrl+G) has no bounds checking. Try cloning the repository into the IDE’s directory.

#29

@Hanski: thank you! They have been coded in a hurry, and I’m not sure why I feel the need to move so fast. I hope they run much faster without out-of-bounds checking enabled. Either that or switching to C++ with good old arrays. Btw, the things you have done with the Pokitto are really awesome.

@FManga: thank you, I’ll try cloning over.

There are still lots of cool demo effects awaiting their turn…

#30

By the way, are these with overclocking on or off?

#31

Off. Java’s overclocking support will come in the next release.

#32

@FManga: I cloned the repository (to a different folder, just in case it didn’t work), rebuilt, got some weird errors about idiv, panicked!, winmerged the folders (version 0.1.0 “stable” vs the cloned one), found lots of changes, panicked even more!, entered “I don’t know what I’m doing but let’s comment out some suspicious chunks of code and see what happens” mode:

At javacompiler\pokitto\begin.cpp:

/*
extern "C" unsigned __aeabi_uidiv(unsigned numerator, unsigned denominator){
    return __UDIV__(numerator, denominator);
}

extern "C" signed __aeabi_idiv(signed numerator, signed denominator){
    return __IDIV__(numerator, denominator);
}
*/


I had a lot of luck and it worked! These are the new “faster” binaries:

Directmode Test 1.bin (36.5 KB)

Directmode Test 2.bin (40.5 KB)

Directmode Test 3.bin (49.2 KB)

Directmode Ham8.bin (224.7 KB)

Directmode BitmapRotation.bin (162.8 KB)

Directmode Plasma.bin (35.7 KB)

Directmode Moire.bin (165.4 KB)

Directmode Metaballs.bin (169.2 KB)

Now most binaries run at about twice the speed with no array bounds checking, reaching ~20fps in the emulator. I think that’s more than good enough for a Proof of Concept kind of code.

Possible speed and/or size improvements: porting to C++, optimizing the drawScanline() function, overclocking, shrinking look up tables, optimizing the algorithms and inner loops, reducing the screen area, doubling pixels, …, and as a last resort: translating time-critical sections to ARM ASM.

Long story

After getting it compiling, I thought: ok, about x2 speed improvement, but still far from 30fps… let’s optimize it a bit to get there. I spent some minutes hacking and making even uglier code… when suddenly I realized: what am I doing here? I’m optimizing small brute-force chunks of Java code written in a hurry that get translated into C++ code and compiled by a very complex tool I don’t understand that takes about 1 Gbyte on my hard drive… the result is a binary smaller than 200kb that has to run at max speed on a sw emulator of a hw system with only 36kb RAM available. At that moment, my poor brain short-circuited and halted.

#33

Why would that make a difference?

(If you’re thinking “CPU cache” - there is no cache.)

#34

@Pharap: Ok, that’s size optimization (remember I said possible speed and/or size improvements)… IMHO, size optimizations go in the same bag of “optimizations”. A full “demo” is made up of many different effects. I’m using almost all available flash memory (~200kb) in just one or two effects. There are things like symmetry that can be exploited to get a smaller memory footprint in look up tables. Obviously, I could have zero flash memory used for LUTs if I just had more RAM available, because I could compute them on the fly while you watch a short fake “loading” screen, but I’m afraid 32kb RAM is too tight for serious look up tables (and you also have to add runtime stack, dynamic memory, etc.).
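As an illustration of the symmetry idea, here is a minimal sketch of a quarter-wave sine table: only the first quadrant is stored (65 bytes instead of 256) and the other three are derived by mirroring and negating. It is plain desktop Java; the class name, the 256-step angle unit and the 0..127 scaling are assumptions made for the example, not values taken from the demos.

```java
// Sketch only: quarter-wave symmetry for a sine look up table.
final class QuarterSine {
    static final int STEPS = 256;          // angle units per full circle (assumed)
    static final int QUARTER = STEPS / 4;  // 64 units per quadrant

    // sin() for angles 0..QUARTER inclusive, scaled to 0..127.
    static final byte[] TABLE = new byte[QUARTER + 1];

    static {
        for (int i = 0; i <= QUARTER; i++) {
            TABLE[i] = (byte) Math.round(Math.sin(i * 2.0 * Math.PI / STEPS) * 127.0);
        }
    }

    // Returns sin(angle) scaled to -127..127 for any integer angle.
    static int sin(int angle) {
        int a = angle & (STEPS - 1);        // wrap into 0..255, negatives included
        int quadrant = a >> 6;              // 0..3
        int offset = a & (QUARTER - 1);     // position inside the quadrant
        switch (quadrant) {
            case 0:  return TABLE[offset];             // rising, positive
            case 1:  return TABLE[QUARTER - offset];   // mirrored, positive
            case 2:  return -TABLE[offset];            // rising, negated
            default: return -TABLE[QUARTER - offset];  // mirrored, negated
        }
    }

    public static void main(String[] args) {
        for (int a = 0; a < STEPS; a += 32) {
            System.out.println(a + " -> " + sin(a));
        }
    }
}
```

The trade-off is a few extra instructions per lookup in exchange for roughly a quarter of the table size; on the device the table contents would be precomputed rather than built with Math.sin.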
I don’t really want to ask this: is there a bootloader API function to load another binary into flash and switch execution to it? I know, I know, that’s not good for flash memory, as it shortens its lifespan by one cycle every time you load something into flash.

#35

Glad you managed to figure that out!
Those two functions you commented actually got removed in a later commit. In testing they performed worse than expected so I dropped them.
As you can see, there were lots of changes under the hood, many of which were performance-oriented. One thing that will probably help the most is writing drawScanline in thumb asm. The compiler accepts __inline_asm__ for this kind of thing, I still need to document how to use it.

You might want to join the discord channel. It’s faster to get help there when I break stuff.

Another thing that will help is the CPU Profiler that will be available on the next release:

#36

@FManga: excellent! It looks really good! What you’re doing is serious highest level wizardry.
Aaahhh, inline asm inside Java, I can’t resist temptation to use it… but I must be strong…

#37

Lot of laugh!

#38

Update the IDE again and try this:

drawScanline...
    public static void drawScanline( int x, int y, int w, short[] s, int xs, int xf ) {
        // Clip a negative source offset.
        if( xs < 0 ) {
            x -= xs;
            w += xs;
            xs = 0;
        }

        // Clip against the left edge of the screen.
        if( x < 0 ) {
            xs -= x;
            w += x;
            x = 0;
        }

        // Do not read past the end of the source range.
        if( xs + w >= xf ) {
            w = xf - xs;
        }

        // Clip against the right edge of the screen (220 pixels wide).
        if( x + w >= 220 )
            w = 220 - x;

        // Nothing left to draw.
        if( y < 0 || y >= 176 || x >= 220 || w <= 0 )
            return;

        ST7775.setX(x);
        ST7775.setY(y);
        ST7775.beginStream();
        /* */
        int tmp = 0;
        int WRBit = 1<<12;
        int CLR = 252;
        pointer LCD = ST7775.MPIN;
        pointer buffer;
        __inline_cpp__("buffer = s->elements");

        __inline_asm__("
        @output tmp:+l
        @output buffer:+l
        @output w:+l
        @input LCD:l
        @input WRBit:l
        @input CLR:l
        @clobber cc

        ldrh @tmp, [@buffer]
        loop%=:
        lsls @tmp, 3
        str  @tmp, [@LCD]
        str  @WRBit, [@LCD, @CLR]
        ldrh @tmp, [@buffer]
        subs @w, 1
        str  @WRBit, [@LCD, 124]
        bne loop%=
        ");
        /*/
        while( w-- > 0 ) {
            ST7775.writeData( s[xs++] );
        }
        ST7775.setX(0);
        ST7775.setY(0);
        /* */
    }



#39

@FManga: WOW, flawless!!! now running at ~30fps with 2 metaballs on screen without even touching anything from the heavy inner loop.

Directmode Metaballs.bin (169.4 KB)

Directmode Metaballs v1.zip (96.5 KB)

Meanwhile, I’m still messing with the repo, the license and the likes.

#40

Next: effects based on Polar coordinate system
Neat math trick that produces nice tunnels and spirals very easily, especially when combined with texture mapping or color gradients.
This time, look up tables (~20kb) are calculated on the fly and stored in RAM. The Garbage Collector seems to work pretty well, as I’m inefficiently freeing/allocating memory and recalculating every time an effect switch happens.
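A minimal sketch of the polar look up table idea, assuming one angle byte and one distance byte per pixel. It is plain desktop Java with made-up names (`PolarTables`, `angleAt`, `distanceAt`, `spiral`), not the demo's code. Note that two full-screen byte tables at 220x176 would need about 77 KB, so a ~20kb budget implies smaller tables; how the demo gets there is not stated in the thread.

```java
// Sketch only: per-pixel polar coordinates precomputed into byte tables.
final class PolarTables {
    private final int width;
    private final byte[] angle;     // 0..255 around the centre, stored as a byte
    private final byte[] distance;  // distance from the centre, clamped to 0..255

    PolarTables(int width, int height) {
        this.width = width;
        this.angle = new byte[width * height];
        this.distance = new byte[width * height];
        for (int y = 0; y < height; y++) {
            for (int x = 0; x < width; x++) {
                double dx = x - width / 2.0;
                double dy = y - height / 2.0;
                // Map atan2's -pi..pi range onto 0..255, wrapping at the seam.
                int a = (int) Math.round((Math.atan2(dy, dx) + Math.PI) / (2.0 * Math.PI) * 256.0) & 255;
                int d = (int) Math.min(255.0, Math.sqrt(dx * dx + dy * dy));
                angle[y * width + x] = (byte) a;
                distance[y * width + x] = (byte) d;
            }
        }
    }

    int angleAt(int x, int y)    { return angle[y * width + x] & 255; }
    int distanceAt(int x, int y) { return distance[y * width + x] & 255; }

    // One possible spiral: a gradient index that depends on angle, distance and time.
    int spiral(int x, int y, int frame) {
        return (angleAt(x, y) + distanceAt(x, y) * 4 + frame) & 255;
    }

    public static void main(String[] args) {
        PolarTables t = new PolarTables(16, 16);
        for (int y = 0; y < 16; y++) {
            StringBuilder row = new StringBuilder();
            for (int x = 0; x < 16; x++) {
                row.append((t.spiral(x, y, 0) & 64) != 0 ? '#' : '.');
            }
            System.out.println(row);
        }
    }
}
```

With the tables in place the per-frame work is two byte reads and some integer arithmetic per pixel; using the angle and distance as texture coordinates instead of a gradient index gives the tunnel variant.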

Directmode Polar.bin (38.8 KB)