[Tool]Gameboy Emulator

FPS counting (with unlimited framerate).

Why? A cycle is a cycle. On my i5 laptop I get about 1300 FPS running Marioland, and 2500 FPS with video output disabled. Whatever change I make to the code - if it increases the FPS count, it’s good.
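For readers who want to try the same measurement, here is a minimal sketch of an uncapped FPS counter. It is not zBoy's actual code: the `fps_counter` type and the `fps_init`/`fps_tick` names are made up for illustration, and the caller is assumed to supply a millisecond timestamp from whatever clock the platform offers.

```c
#include <stdint.h>

/* Hypothetical FPS counter (illustration only, not taken from zBoy). */
typedef struct {
    uint32_t frames;        /* frames emulated in the current window */
    uint64_t window_start;  /* timestamp (ms) at which the window began */
    double   last_fps;      /* FPS computed for the last completed window */
} fps_counter;

static void fps_init(fps_counter *c, uint64_t now_ms) {
    c->frames = 0;
    c->window_start = now_ms;
    c->last_fps = 0.0;
}

/* Call once per emulated frame, with no frame limiter in the loop.
 * Returns 1 when at least one second has elapsed and last_fps was updated. */
static int fps_tick(fps_counter *c, uint64_t now_ms) {
    uint64_t elapsed;
    c->frames++;
    elapsed = now_ms - c->window_start;
    if (elapsed < 1000) return 0;
    c->last_fps = (double)c->frames * 1000.0 / (double)elapsed;
    c->frames = 0;
    c->window_start = now_ms;
    return 1;
}
```

Calling `fps_tick()` at the end of every frame, with and without the video backend enabled, gives the two figures being compared above.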

zBoy spends about half of its time outputting pixels to the screen. That’s what gprof says, and it’s confirmed by FPS measurements with and without video output, which is why I disable video output when trying to measure the impact of changes on the CPU/MMU modules.

Anyway, my branchless MMU version runs at exactly the same speed on my i5… And on my PII 300 MHz, it’s even slower than the branch-rich original (110 FPS with branching, 100 FPS with function pointer calls replacing the if/else chains). This is why I was surprised you got different results - but I understand you did not get that much of a speed-up either, and the little you got is due to the specifics of Cortex CPUs (no cache to thrash, no branch prediction to miss).

Ultimately, after the new tests I did during the last 48h, I doubt that fiddling with the MMU can bring any significant improvement, on x86 at least. Nonetheless, I will continue looking for ways to make things faster.

True, inlining (within reasonable limits to avoid cache thrashing) can bring very good performance results. But it troubles me that you mention it while at the same time doing function calls through pointers (through the writeHandlers[] array), since such calls cannot be inlined and, moreover, are likely to end up performing long jumps (much slower than short jumps, at least on x86). I am sure that this method is somehow beneficial on the Cortex arch, since you applied it again in the CPU emulation through the OP[] function pointer array. Unfortunately it seems to be an optimization that works only for this kind of MCU.
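To make the two approaches concrete, here is a rough sketch of both dispatch styles. It is heavily simplified and is not the code from either emulator: the memory map is reduced to two regions, every write below 0x8000 is treated as a bank select, and all names (`write_handlers`, `mmu_write_branch`, `mmu_write_table`, and so on) are assumptions for illustration.

```c
#include <stdint.h>

static uint8_t rom_bank_select;  /* hypothetical bank-switch register */
static uint8_t ram[0x10000];     /* flat 64 KiB array, simplified on purpose */

/* Simplification: any write to 0x0000-0x7FFF is treated as a bank select. */
static void write_rom_area(uint16_t addr, uint8_t v) { (void)addr; rom_bank_select = v; }
static void write_ram(uint16_t addr, uint8_t v) { ram[addr] = v; }

/* Variant A: if/else chain. The compiler may inline both targets. */
static void mmu_write_branch(uint16_t addr, uint8_t v) {
    if (addr < 0x8000) write_rom_area(addr, v);
    else write_ram(addr, v);
}

/* Variant B: table of handlers indexed by the top 4 address bits.
 * No conditional branch, but an indirect call that cannot be inlined. */
typedef void (*write_handler)(uint16_t, uint8_t);
static write_handler write_handlers[16];

static void mmu_init_handlers(void) {
    for (int i = 0; i < 16; i++)
        write_handlers[i] = (i < 8) ? write_rom_area : write_ram;
}

static void mmu_write_table(uint16_t addr, uint8_t v) {
    write_handlers[addr >> 12](addr, v);
}
```

Both variants produce the same result; which one is faster depends on the target, which is the whole point of the disagreement above.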

I also noticed that you perform more CPU instructions within a single emulation cycle (8 instructions, to be precise). While this certainly makes things many times faster overall, because the rest of the emulation modules need to be called 8 times less often, I worry about the emulation accuracy… Didn’t it lead to regressions? I’d expect at least some games (esp. those with very tight timing loops) to break. It could also cause some games to be notified of an H-blank too late, and generate glitches. I’d be curious to know how often such troubles really occur with your approach (if at all).
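As a rough illustration of the trade-off, here is a sketch of per-instruction versus batched synchronization. The CPU and peripheral functions are hypothetical stand-ins (every opcode is pretended to cost 4 cycles), not the real emulator modules; only the batch size of 8 comes from the discussion.

```c
#include <stdint.h>

#define BATCH 8  /* instructions per emulation step, the figure quoted above */

/* Hypothetical stand-ins for the real CPU core and timer/video/interrupt code. */
static unsigned long cpu_steps, periph_updates;
static uint32_t total_cycles;

static int cpu_exec_one(void) { cpu_steps++; return 4; }  /* pretend: 4 cycles per opcode */
static void peripherals_update(int cycles) { periph_updates++; total_cycles += (uint32_t)cycles; }

/* Accurate variant: peripherals are synchronized after every instruction. */
static void run_accurate(int n) {
    for (int i = 0; i < n; i++)
        peripherals_update(cpu_exec_one());
}

/* Batched variant: run up to BATCH instructions, then synchronize once.
 * Events such as H-blank can be seen up to BATCH-1 instructions late. */
static void run_batched(int n) {
    for (int i = 0; i < n; i += BATCH) {
        int cycles = 0;
        for (int j = 0; j < BATCH && i + j < n; j++)
            cycles += cpu_exec_one();
        peripherals_update(cycles);
    }
}
```

The total cycle count is identical in both variants; what changes is how often the rest of the machine gets to observe it.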

If a >1 GHz processor has trouble emulating a 4 MHz one, it’s likely that a micro-optimization involving cycle counting wouldn’t be significant and there are high-level optimizations to be made instead. Diminishing returns.
It seems the machines you’re benchmarking on don’t need optimization anyway, they already run much faster than they need to, unless you’re trying to prolong laptop battery life.

They have the same cost here. Are they still that much slower on x86? I haven’t benchmarked x86 code in a long time.

8 instructions seems to be the upper limit, more than that will start breaking things, IIRC. Yes, it’s trading accuracy for performance, but, given the restrictions, the priorities are a bit different. I’d rather have a good amount of playable games than have a whole lot of really slow ones. Since I can’t test them all, I have some games I test with, then I find out if I’ve gone too far by releasing it and letting others use it.

There is a bug regarding some games not accepting input. I guess it could be related to this, I’ll have to check.

Battery life is a possible concern, yes, but more generally - anything that wins some extra FPS on my i5 is usually just as beneficial on slower machines. I test zBoy periodically on my PII PC, and the percentage gain is usually similar on both machines. Now of course both are x86 architectures. Results on different things (like the Pokitto) may vary wildly. For instance, on both my x86 machines, 50% of CPU time is spent drawing pixels, even though the video backends are completely different (SDL2 vs VGA mode 13h). But apparently on the Pokitto this activity is negligible, since I noticed you commented out my “detect screen areas that need to be refreshed” code in the Pokitto build (hence I assume it was costing more CPU cycles than it was saving, so pixels are probably almost free there).
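For context, one common way to detect screen areas that need refreshing is to compare each scanline against a copy of the previous frame. The sketch below shows that general idea only; it is an assumption about the technique, not zBoy's actual implementation, and the names `prev_frame` and `find_dirty_lines` are invented for the example.

```c
#include <stdint.h>
#include <string.h>

#define SCREEN_W 160
#define SCREEN_H 144

/* Copy of the last frame that was actually sent to the display. */
static uint8_t prev_frame[SCREEN_W * SCREEN_H];

/* Compares 'frame' (one byte per pixel) with the previous frame, line by line.
 * Sets dirty[y] to 1 for each changed line, refreshes the stored copy of that
 * line, and returns the number of lines that need to be redrawn. */
static int find_dirty_lines(const uint8_t *frame, uint8_t dirty[SCREEN_H]) {
    int count = 0;
    for (int y = 0; y < SCREEN_H; y++) {
        const uint8_t *cur = frame + y * SCREEN_W;
        uint8_t *old = prev_frame + y * SCREEN_W;
        dirty[y] = (uint8_t)(memcmp(cur, old, SCREEN_W) != 0);
        if (dirty[y]) {
            memcpy(old, cur, SCREEN_W);
            count++;
        }
    }
    return count;
}
```

The comparison itself costs CPU time and a second frame's worth of RAM, so it only pays off when pushing pixels is expensive, which matches the x86 vs Pokitto difference described above.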

Of course they are, I don’t question this choice at all. I’m only sad that I can’t reuse your clever bits upstream, since they are either Pokitto-specific or implying possible regressions :slight_smile:

Possibly. But it could also be due to a zBoy bug… I fixed a major joypad emulation bug in r211 the other day. This fixed the controls at least for Galaga & Galaxians (the game where I first noticed the problem, the day after I released v0.70). It’s a miracle that almost every title worked at all until now.

This was mostly a RAM optimization. Instead of storing a full framebuffer, I switched to a scanline buffer. I wrote the code that copies the scanline to the LCD in assembly so it wouldn’t become a bottleneck.
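A minimal sketch of the scanline-buffer idea, written in plain C rather than assembly: at one byte per pixel, a full 160x144 framebuffer takes 23040 bytes, while a single scanline takes 160. The `render_scanline` and `lcd_push_line` functions are hypothetical placeholders, not the Pokitto port's real routines.

```c
#include <stdint.h>

#define GB_W 160
#define GB_H 144

/* One line of pixels instead of a GB_W * GB_H framebuffer. */
static uint8_t scanline[GB_W];
static unsigned long pixels_sent;

/* Hypothetical renderer: fills one line with a test pattern (shades 0..3). */
static void render_scanline(int y, uint8_t *line) {
    for (int x = 0; x < GB_W; x++)
        line[x] = (uint8_t)((x + y) & 3);
}

/* Hypothetical LCD transfer: a real port would write to the display here. */
static void lcd_push_line(const uint8_t *line, int n) {
    (void)line;
    pixels_sent += (unsigned long)n;
}

/* Each line is rendered and pushed immediately, then the buffer is reused. */
static void draw_frame(void) {
    for (int y = 0; y < GB_H; y++) {
        render_scanline(y, scanline);
        lcd_push_line(scanline, GB_W);
    }
}
```

Because every line goes to the display as soon as it is rendered, there is no previous frame left to compare against, so the copy routine has to be fast enough on its own.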

Ah, I’ll have to check that fix then. My dad sent me a bug report about those two games. :stuck_out_tongue:

Did you count the GameBoy Color ROMs that also run on the DMG? Okay, there might not be a lot of such ROMs within Pokitto’s 128K limit :slight_smile: Roadsters '98 does come to mind, though (and there are probably a few more).

Awesome. Played Super Mario Land on the Pokitto today.
The latest version of zBoy (0.7) has sound. Is it possible to bring this version to the Pokitto?
With sound everything is better :wink:

It should be possible, and I intend to do so, but I’ve got some other things I need to finish first.
:coffee::coffee::coffee: :sleepy:

Hmmmm… Does the emulator work with GB Studio games? :slight_smile:

GB Studio is based on GBDK, which is what was used for the other open-source GB games… so… maybe? Have to try and see. The only GB Studio game I could find a ROM for was too big to fit in the Pokitto.

Yeah, I just made a ROM of a blank project and it was 512 KB … It’s a nice tool, but the result seems to be too “bloated” for the Pokitto emulator. :wink:

I was wondering if there have been any updates recently?

I just rolled back the last release, as it was breaking games like Tetris.
Work has been underway towards freeing up space so I can pull in the audio support from upstream, but there is still a lot to do.

Sounds great! Please let me know before you pull upstream stuff into Pokitto - I fixed lots of sound-related issues, so I will release a new version so you can benefit from it.

Any new release at 72 MHz?

@FManga was one of the crafty ones who spotted the trickery in the LPC startup files

As a result, the GB emu has been running overclocked already for some time :wink:

Hey. Any zBoy updates…sound ^^

Thx

Hey, sorry, no updates on this. All of the attention is going to the Java Jam, at the moment.

Hello, folks. This short message is just to let you know that today I published zBoy v0.71. This new version comes with a few optimizations, improved sound support and one joypad-emulation bugfix for a bug that was making Galaxians unplayable. http://zboy.sourceforge.net

@FManga Maybe you could update your page with the zBoy v0.71? :wink: :relaxed:

It’s not that easy:
some changes had to be made to the Pokitto edition to make it possible,
and zBoy is released as whole versions, with no commit history that would make it easy to see or backport each change.

If you want something fresher, you may have a look at my own fork of FManga’s zBoy revision, which ships a full rewrite of the GB GPU, or at my port of gnuboy.
But both will require you to set up a build environment; for now there is no web backend to convert your ROMs into executables, and everything is frozen until a hardware fix and some free time.
