Neocaron wrote: ↑Fri Jan 28, 2022 2:25 am
32X was capable of doing 50,000 polygons per second. It should be noted that since the 32X was an add-on for the Genesis, the graphics hardware of the 32X was typically split between the 32-bit VDP (which provided the really fancy stuff, for example the 3D polygon characters in Virtua Fighter 32X and other 3D elements on screen) and the 16-bit Genesis VDP, which provided stuff like the backgrounds and some minor stuff (depending on the game).
Saturn was capable of doing 200,000 texture mapped polygons per second OR 500,000 flat shaded polygons per second. Since the Saturn was its own system (and not an add-on for the Genesis) it had two 32-bit VDPs (one for characters and other main stuff, and the other for backgrounds).

Keep in mind the 32X has no hardware features for 3D graphics, polygons, etc. Nor does it have any dedicated hardware for sprite scaling. It's all software rendered, unlike the Saturn.
The 32X games can vary a lot in how they set up the hardware usage, but according to people such as Joseph Fenton (a.k.a. Chilly Willy; Wolf 32X homebrew port, 3D 32X demos, etc.) and Victor Luchitz (Doom 32X Resurrection), the best approach for maximum 3D performance is to do all graphics on the 32X and use the Mega Drive side like an old PC sound card, handling just music and controller input.
To maximize performance you have to split both logic and rendering between the two CPUs.
These are quotes from both guys, who are hands down the most knowledgeable people when it comes to the 32X's hardware and development:
http://gendev.spritesmind.net/forum/vie ... =15#p37344
One thing you need for best speed on the SH2 side - keep the MD off the bus. Make sure the Z80 is held reset, the 68000 is running in work ram, and the VDP is not doing any DMA. Running the 68000 main() in rom is enough to maybe halve the speed of the 32X side.
Running sh2 functions from sdram makes them cache quicker. The sdram does burst reads - 8 words in 12 cycles. Nothing else burst reads, meaning the cache is much slower to load. Oh, and make sure you have the cache turned on and set to 4 way set associative. If you're not using the cache (have the cache in scratchpad mode) or you're running with the PC set to uncached pointers to functions, you aren't going to cache the code, which means it won't be fast on real hardware.
This also applies to data in the rom - use cache enabled pointers to data in the rom to cache the data. I'm not sure if this really needs to be said, but using a pointer with the caching bits of the pointer set to uncached space means the data will not cache - it will read from the rom every time it is accessed.
There are times when you WANT to read something as non-cached. For example, a variable set by the other cpu. Then you either need to flush the address of the variable before reading it, or to read it as uncached.
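To make the cached vs. uncached pointer point concrete, here's a minimal sketch. As far as I understand the SH-2 (SH7604) memory map, setting address bit 29 (0x20000000) selects the cache-through mirror of the same location, so SDRAM at 0x06000000 is also visible uncached at 0x26000000. The helper names (`sh2_uncached`, `sh2_cached`, `read_shared_flag`) are made up for illustration and are not from any SDK.

```c
#include <stdint.h>

/* Address bit 29 selects the SH-2 cache-through mirror
   (assumption based on the SH7604 memory map). */
#define SH2_CACHE_THROUGH 0x20000000u

/* Hypothetical helpers. They work on plain 32-bit addresses so the
   bit logic can also be checked on a host machine. */
static inline uint32_t sh2_uncached(uint32_t addr)
{
    return addr | SH2_CACHE_THROUGH;   /* same location, bypassing the cache */
}

static inline uint32_t sh2_cached(uint32_t addr)
{
    return addr & ~SH2_CACHE_THROUGH;  /* same location, through the cache */
}

/* On real hardware only: read a flag that the other CPU writes.
   Going through the uncached mirror avoids seeing a stale cached copy. */
static inline uint16_t read_shared_flag(uint32_t flag_addr)
{
    volatile uint16_t *p =
        (volatile uint16_t *)(uintptr_t)sh2_uncached(flag_addr);
    return *p;
}
```

Code and read-mostly data would go through the cached address, while anything the other CPU can change behind your back goes through the uncached mirror (or gets flushed first, as the quote says).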
http://gendev.spritesmind.net/forum/vie ... =15#p37332
I have a work in progress demo that draws a tilemap and is able to do sprite scaling, flipping and clipping using both CPUs at 60fps, and here's what I've learned about doing 2D gfx on the 32X:
1) always keep your drawing code in SDRAM
2) unroll your drawing loops as much as you can, use Duff's device
3) try different optimization settings: generally -Os works better, but also try -O2 to see if that improves performance
4) keep as much of your tile data in SDRAM - drawing from ROM is extremely slow
5) use both CPUs for drawing
6) write longs to VRAM to maximize throughput if you can, otherwise - use word writes, don't ever write single bytes unless you absolutely have to
7) the overwrite area is your friend if you want to do transparency - use it
8) use the line table to scroll the screen area both vertically and horizontally
9) use the shift register if you plan to have smooth horizontal scrolling by an odd amount of pixels, but:
10) the shift register is bugged and you can't infinitely scroll the screen by manipulating the line table without running into glitches, in fact you can barely scroll at all
11) unfortunately it's impossible to have free infinite horizontal scrolling on the 32X - you have to periodically re-draw the whole screen to reset the line table due to pt10
12) the 32X is only equipped to do 2 layers of 2D gfx at 50FPS at 320x224 if you use one CPU: 320*224/2*5*50 = 8.96M cycles per second is the total amount of cycles it takes to redraw the whole screen on a PAL 32X at 50FPS, which is slightly less than a half of the budget of 22.8M cycles you've got. Bear in mind this doesn't even account for cycles spent on reading the tile/sprite data and game logic, so at best you can hope to do only 2 layers even if you use _both_ CPUs
13) save CPU cycles as much as you can: avoid re-drawing between frames and overdrawing within frames as much as possible, don't wait in a hot loop for framebuffers to have completed swapping
14) avoid clearing the whole screen each frame, use the "dirty rectangles" technique to only clear stuff which needs to be cleared
15) don't use the hardware filler, it's useless
16) accessing the MD VRAM from 68k is SLOOOOOW and basically halts the SH-2's for the duration of the access
17) always test on real hardware: typically performance on emulators is 1.5 to 2 times better than on real hw
18) always read your sprite and tile data in forward direction to benefit from burst access to SDRAM
19) modern compilers are pretty good at producing optimized assembly, don't waste your time on hand-optimized code
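For anyone who hasn't seen point 2 in practice, here's a rough sketch of a Duff's device copy loop, unrolled 8x and reading the source strictly forward (point 18). The function name `copy_words_duff` is made up, and this is not taken from the demo's source. It copies 16-bit words; on real hardware `dst` would point into the framebuffer, and the same pattern works with `uint32_t` for long writes when alignment allows (point 6).

```c
#include <stddef.h>
#include <stdint.h>

/* Copy `count` 16-bit words, unrolled 8x with Duff's device.
   Hypothetical helper for illustration only. */
static void copy_words_duff(uint16_t *dst, const uint16_t *src, size_t count)
{
    if (count == 0)
        return;

    size_t passes = (count + 7) / 8;  /* trips through the do-while body */

    switch (count % 8) {              /* jump into the body for the remainder */
    case 0: do { *dst++ = *src++;
    case 7:      *dst++ = *src++;
    case 6:      *dst++ = *src++;
    case 5:      *dst++ = *src++;
    case 4:      *dst++ = *src++;
    case 3:      *dst++ = *src++;
    case 2:      *dst++ = *src++;
    case 1:      *dst++ = *src++;
            } while (--passes > 0);
    }
}
```

The switch handles the leftover words on the first pass, so there is no separate tail loop and the loop counter is only checked once per 8 writes.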
The demo he refers to is here: https://github.com/viciious/yatssd
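For reference, the arithmetic behind point 12 can be reproduced with a few lines of C. The 5 cycles per write, the 50FPS PAL rate and the 22.8M cycle budget come from the quote. The 2 pixels per write is my own assumption (one 16-bit write covering two 8-bit pixels), chosen because it is what makes the product come out at the quoted 8.96M; without that halving it would be 17.92M. The function names are made up.

```c
#include <stdint.h>

enum {
    SCREEN_W         = 320,
    SCREEN_H         = 224,
    CYCLES_PER_WRITE = 5,   /* figure from point 12 */
    PIXELS_PER_WRITE = 2,   /* ASSUMPTION: one 16-bit write = two 8bpp pixels */
    FPS_PAL          = 50
};

/* Cycles per second spent purely on writing one full-screen layer. */
static uint32_t redraw_cycles_per_second(void)
{
    uint32_t writes_per_frame = (SCREEN_W * SCREEN_H) / PIXELS_PER_WRITE;
    return writes_per_frame * CYCLES_PER_WRITE * FPS_PAL;
}

/* How many full-screen layers fit in a given per-CPU cycle budget,
   ignoring reads of tile/sprite data and game logic. */
static uint32_t full_layers_within(uint32_t cpu_budget)
{
    return cpu_budget / redraw_cycles_per_second();
}
```

With the quoted 22.8M budget this gives two whole layers on one CPU before any reads or game logic are counted, which lines up with the "only 2 layers" conclusion.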
Someone mentioned the 32X being finicky and I totally agree. It's not uncommon to be using it and then the slightest touch on the Mega Drive can make your game freeze.
The fact that it sits on the cart slot and most of its weight is up top certainly doesn't help. Having it on MiSTer FPGA would be far more practical.
That would be awesome!
This was discussed earlier in the thread, and elsewhere, but the general assumption is that srg320 will move on to the 32X after finishing Saturn and doing some needed improvements to the Mega Drive core. The Saturn core and Mega Drive fixes pave the way for the 32X. There is a reasonable chance we could have it by the end of the year.
Thanks for the input.