Direct Memory Access Discussion

A discussion about the possibility of using DMA to improve the Pokitto’s drawing speed.

Migrated from here:

@adekto Guess what? We can do that with DMA. Yeeees…


So it can be even faster?

Well, the point of DMA is that it does not tie up the CPU for memory-to-memory transfers.

Potentially yes.

EDIT: and I have done this already actually, just on a test basis

EDIT2: we are nowhere near the performance we eventually can get out of the Pokitto


If DMA works, that could allocate more memory to the CPU.
Though wouldn’t that mean some sort of mask or something, since I think you need to invert the drawing order (top sprite first)?

Someone will have to explain the theory to me here. The only experience I have of DMA is an old gfx card in the 90’s that would randomly leave magenta squares on my screen where video was meant to be…

Direct memory access gives a peripheral device access to main memory (i.e. RAM).
So basically the device can pull data in as it pleases instead of relying on the CPU to push data to it.

There is a downside though: it can lead to data races in some cases.

Essentially, the CPU could start writing the next frame while the screen is still reading the old one, and if the CPU overtakes the screen, the screen ends up displaying part of one frame and part of the next.
(That’s possibly why you got the magenta squares back in the day.)

That’s also what sometimes leads to tearing even in modern graphics cards, and why vsync and double/triple buffering are used.

That can be circumvented with a decent concurrency algorithm/mechanism.

I’m not sure how this is going to help us. It still has to grab data from CPU memory and move it to display RAM.

That’s why it’s nice to be able to double buffer the screen buffer. You write to one buffer, then start the DMA controller to write that buffer to the display. The CPU can then start writing to the second buffer while DMA is updating the display from the first. You can start DMA on the second buffer as soon as the previous DMA and the rendering of the second buffer are both complete, or when it’s time to display the next frame. You alternate this way between both buffers.

This means the CPU doesn’t lose available render time by having to spend cycles updating the display on each frame.
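
The alternation described above can be sketched roughly as follows. This is only an illustration: `FakeDma`, `DoubleBuffer` and `kBufferBytes` are made-up names, the DMA channel is simulated in software, and nothing here is the Pokitto library’s actual API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a DMA channel. It only records which buffer
// is "in flight"; real hardware would copy the bytes to the display.
struct FakeDma {
    const std::uint8_t* source = nullptr;
    bool busy = false;

    void start(const std::uint8_t* src) {
        assert(!busy);  // never start a transfer over a running one
        source = src;
        busy = true;
    }

    // On real hardware this would happen in the DMA-complete interrupt.
    void complete() { busy = false; }
};

// Tiny buffer size, chosen only to keep the sketch small.
constexpr std::size_t kBufferBytes = 64;

struct DoubleBuffer {
    std::array<std::array<std::uint8_t, kBufferBytes>, 2> buffers{};
    int drawIndex = 0;  // buffer the CPU is currently rendering into

    std::uint8_t* drawBuffer() { return buffers[drawIndex].data(); }

    // Hands the freshly rendered buffer to DMA and switches rendering to
    // the other buffer. Returns false (and changes nothing) if the
    // previous transfer is still running.
    bool present(FakeDma& dma) {
        if (dma.busy) {
            return false;
        }
        dma.start(buffers[drawIndex].data());
        drawIndex ^= 1;
        return true;
    }
};
```

The render loop would draw into `drawBuffer()`, call `present()`, and carry on drawing the next frame into the other buffer while the transfer runs.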

You don’t know when the display is finished, and swapping mid-draw leads to screen tearing.
Also, 2 buffers seems like an unnecessary amount of memory use.

Usually the DMA controller raises an interrupt when it has finished writing all the bytes it was set up to transfer, or the CPU can test a flag in a register that indicates DMA complete. That’s why I said you wait for the previous DMA to complete, the new render to finish, or the desired frame time to arrive (whichever takes longest) before starting a new DMA cycle on the new buffer.
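
As a rough sketch of that "whichever takes longest" rule, the check before starting the next transfer might look like this. The function name and parameters are hypothetical; on real hardware `dmaDone` would come from the completion interrupt or the status flag.

```cpp
#include <cstdint>

// Returns true only once all three conditions hold, which is the same as
// waiting for the slowest of them:
//   - the previous DMA transfer has finished,
//   - the CPU has finished rendering the next buffer,
//   - the scheduled time for the next frame has arrived.
// Timer wrap-around is ignored to keep the sketch short.
bool canStartNextDma(bool dmaDone, bool renderDone,
                     std::uint32_t nowMs, std::uint32_t frameDueMs) {
    return dmaDone && renderDone && nowMs >= frameDueMs;
}
```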

Assuming the display pulls super fast, we could do a tiny timeout.
Or maybe let the CPU focus on sound output for that amount of time.

Yes, even with single buffering you can still do anything that doesn’t disturb the display buffer while DMA is happening. Though sound is usually handled by interrupts, so there isn’t much additional foreground time the CPU needs to generate it.

Hang on, I think I see a problem: the display wants 565 data, when our buffers are compressed palette indexes due to memory.

Going from 4bpp to 565 is like a 4x increase in memory
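
To put numbers on that, here is a small compile-time calculation. The 220x176 resolution is an assumption about the full-resolution screen mode, not something stated in this thread, and `framebufferBytes` is just an illustrative helper.

```cpp
#include <cstddef>

// Bytes needed for a packed framebuffer, rounded up to whole bytes.
constexpr std::size_t framebufferBytes(std::size_t width, std::size_t height,
                                       std::size_t bitsPerPixel) {
    return (width * height * bitsPerPixel + 7) / 8;
}

// Assumed screen size (not taken from the discussion above).
constexpr std::size_t kWidth = 220;
constexpr std::size_t kHeight = 176;

constexpr std::size_t kIndexedBytes = framebufferBytes(kWidth, kHeight, 4);
constexpr std::size_t kRgb565Bytes = framebufferBytes(kWidth, kHeight, 16);

// 16 bits per pixel versus 4 bits per pixel: exactly four times the memory.
static_assert(kRgb565Bytes == 4 * kIndexedBytes, "565 is 4x the size of 4bpp");
```

Under that assumption the 4bpp buffer is 19,360 bytes and a full 565 buffer would be 77,440 bytes.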

Yes, indeed. There is no RAM for a full 16-bit buffer. I do not know if doing a DMA transfer one scanline at a time would be very efficient (?)

Maybe not one scanline at a time, but possibly a section of the screen (a quarter).

It’s all a matter of balancing the timing with the amount of memory available.

There may even be features of the screen and/or screen commands that can help.
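
A minimal sketch of the section-at-a-time idea: expand a strip of packed 4bpp palette indexes into a small 565 buffer, which DMA could then send while the CPU expands the next strip. `expandStrip` is a hypothetical helper, and the nibble order (high nibble is the left pixel) is an assumption, not a confirmed detail of the existing buffer format.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Expands packed 4bpp palette indexes into RGB565 pixels.
// indexed:    pixelCount / 2 bytes, two pixels per byte
// palette:    16 RGB565 entries
// out:        room for pixelCount 16-bit pixels
// Assumption: the high nibble holds the left-hand pixel of each pair.
void expandStrip(const std::uint8_t* indexed, std::size_t pixelCount,
                 const std::uint16_t* palette, std::uint16_t* out) {
    assert(pixelCount % 2 == 0);  // the sketch only handles whole bytes
    for (std::size_t i = 0; i < pixelCount / 2; ++i) {
        const std::uint8_t packed = indexed[i];
        out[2 * i] = palette[packed >> 4];
        out[2 * i + 1] = palette[packed & 0x0F];
    }
}
```

Assuming a 220-pixel-wide screen, one scanline of 565 output is 440 bytes, while a quarter of a 176-line screen (44 lines) is 19,360 bytes, so the strip size is where the memory and timing trade-off gets decided.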

Having these sections will surely cause screen tearing in very obvious chunks.
The CPU still has to do a lot of work expanding the indexed sprites to 16 bits. I assume the display is going to pull faster than the CPU.
I’m just going to stick with the current mode15 unless someone can show me a more efficient way.