8088 CPU
Posted: Sat Sep 24, 2022 2:25 pm
by pgimeno
Does MCL86 have cycle-accurate MUL/DIV emulation? ISTR that MicroCoreLabs said it doesn't.
I've found this thread discussing cycle-accurate versions of these instructions:
https://www.vogons.org/viewtopic.php?t=62817
In particular this post:
https://www.vogons.org/viewtopic.php?p=703576#p703576
The link in the post is obsolete, though, as it's not a permalink. Here's a permalink to the code the poster is talking about:
https://github.com/reenigne/reenigne/bl ... 4741-L4904
Re: 8088 CPU
Posted: Sat Sep 24, 2022 6:20 pm
by MicroCoreLabs
The MCL86 is cycle accurate, but not cycle exact. I'm not sure anyone has achieved this yet in hardware but I believe the MCL86 comes the closest.
For opcodes which complete in a range of clock cycles like MUL and DIV the MCL86 will use a number in the middle of the range.
The real 8086 uses a combination of microcode and hardware for multiply and divide and takes a variable amount of time to complete depending on the operands. It would be very hard to emulate this behavior exactly so I just picked a number in the middle of the range which yielded good results.
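As a rough illustration of the fixed-cost approach described above, here is a minimal sketch. The helper name and the 70 to 77 range in the example are illustrative assumptions, not taken from the MCL86 source, and the real design may pick its constant differently.

```python
def fixed_cycle_cost(min_cycles: int, max_cycles: int) -> int:
    """Return one constant cycle count from the middle of a documented range.

    Hypothetical helper: it ignores the operands entirely, which is the
    trade-off being discussed in this thread.
    """
    if min_cycles > max_cycles:
        raise ValueError("min_cycles must not exceed max_cycles")
    return (min_cycles + max_cycles) // 2


# Illustrative range only (assumed, not quoted from this thread).
print(fixed_cycle_cost(70, 77))
```

Every execution of the instruction is then charged the same count, whatever the operand values are.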
Re: 8088 CPU
Posted: Sat Sep 24, 2022 11:19 pm
by pgimeno
MicroCoreLabs wrote (Sat Sep 24, 2022 6:20 pm):
"It would be very hard to emulate this behavior exactly"
The software emulators I linked above do exactly that, and fine details on the exact algorithm and the cycles taken by each step are provided.
Re: 8088 CPU
Posted: Sun Sep 25, 2022 1:08 am
by MicroCoreLabs
"It would be very hard to emulate this behavior exactly so I just picked a number in the middle of the range which yielded good results."
This was my full statement.
Practically anything can be done in a software emulator, so it is no surprise that cycle-exact emulators exist. Doing so in microcode running in an FPGA is a different story...
Re: 8088 CPU
Posted: Sun Sep 25, 2022 8:11 am
by Malor
If anyone has ever decapped an 8088, traced the schematic, and posted it online, a cycle-exact replication might be possible. Without that, it seems unlikely to me that any designer would hit upon the exact split the 8088 used between its hardware circuits and its microcode.
Software solutions can observe actual behavior and duplicate it using brute force, in essence, using special casing if necessary, but an FPGA designer doesn't really have that option. They have to more or less duplicate the original circuitry, and that's unlikely to ever happen from scratch. Well, unless someone's getting paid to do it. It would be too much of a PITA otherwise.
Because the instructions were so variable in real life, it seems unlikely that anything out there is going to depend on a precise cycle count for multiply and divide instructions. A result that's always in the middle of the range might make benchmarks register the chip as being slightly faster or slower, but actual programs probably won't show any difference.
Re: 8088 CPU
Posted: Sun Sep 25, 2022 10:33 am
by pgimeno
The chip has been decapped and analysed.
https://forum.vcfed.org/index.php?threa ... bly.77933/
The exact breakdown of cycles used by the instruction is documented in the code and in the thread I posted:
https://www.vogons.org/viewtopic.php?p=703589#p703589
- Non-AAD byte takes 8 cycles.
- Negating takes 1 cycle (IMUL).
- Something with highest multiplier bit takes 1 cycle?
- Only negative source takes four cycles more?
- 10 cycles for GRP opcodes.
- 3 cycles for non-AAD opcodes.
Then the main multiplication loop, for each multiplier bit:
- 7 cycles for the whole shift/addition.
- 1 cycle for each set bit.
Finally, for negated inputs, 9 cycles more.
There are more details in the thread, including a reply to the above.
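The main-loop part of the list above can be sketched as follows. This is only one reading of the quoted figures: the function name is hypothetical, and the fixed setup costs and the IMUL negation costs are left out because the list itself marks several of them as uncertain.

```python
def mul_loop_cycles(multiplier: int, bits: int = 8) -> int:
    """Cycles spent in the main multiplication loop only.

    Per the list above: 7 cycles per multiplier bit, plus 1 extra cycle
    for each multiplier bit that is set. Setup and sign-handling costs
    are deliberately not modelled here.
    """
    if bits not in (8, 16):
        raise ValueError("bits must be 8 or 16")
    multiplier &= (1 << bits) - 1          # keep only the operand's bits
    set_bits = bin(multiplier).count("1")  # population count
    return 7 * bits + set_bits


for value in (0x00, 0x01, 0xFF):
    print(f"{value:#04x}: {mul_loop_cycles(value)} loop cycles")
```

Under this reading the loop alone varies by up to 8 cycles for a byte multiplier and up to 16 cycles for a word multiplier, which is the kind of operand dependence a single fixed count cannot reproduce.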
Without cycle-exact versions of these instructions, demos that work by counting cycles can easily fail. It's not difficult to foresee a demo using specific mul or div operands to wait for a certain number of clock cycles, and the demo failing when the emulated timings are incorrect.
This classic Kefrens bars effect was done by reenigne in 320x200x4 mode. It’s a cycle-counting effect, as there is simply no time to monitor for horizontal retrace. To ensure the cycle counting was consistent, several things were done, including changing the system DRAM refresh from its default interval of 18 to 19, to get the DRAM refresh periods to line up with CRTC accesses.
Re: 8088 CPU
Posted: Sun Sep 25, 2022 1:34 pm
by kitune-san
If anyone knows more about this, please let me know.
When a word access is executed to an I/O device (out, in), is the address incremented on the second access?
Re: 8088 CPU
Posted: Sun Sep 25, 2022 6:44 pm
by MicroCoreLabs
kitune-san wrote:
"When a word access is executed to an I/O device (out, in), is the address incremented on the second access?"
Yes, it does increment. Please see my BIU code on GitHub. Also remember that the address wraps around within the segment.
For example, a word read at 0xAFFFF will read from 0xAFFFF and then from 0xA0000. It will not read from 0xAFFFF and 0xB0000.
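The wrap in that example can be sketched like this. It is a minimal illustration, not the MCL86 BIU code: the function name is hypothetical, and it assumes the example corresponds to segment 0xA000 with offset 0xFFFF, with the increment applied to the 16-bit offset before the segment base is added.

```python
from typing import Tuple


def word_access_addresses(segment: int, offset: int) -> Tuple[int, int]:
    """Return the two 20-bit physical addresses touched by a word access."""
    def physical(off: int) -> int:
        return ((segment << 4) + off) & 0xFFFFF  # 20-bit address bus

    second_offset = (offset + 1) & 0xFFFF  # wraps inside the 64 KiB segment
    return physical(offset), physical(second_offset)


first, second = word_access_addresses(0xA000, 0xFFFF)
print(hex(first), hex(second))
```

Because the offset is masked to 16 bits before the segment base is added, the second byte comes from the start of the same segment rather than from the next physical address.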
Re: 8088 CPU
Posted: Mon Sep 26, 2022 1:31 pm
by kitune-san
Thank you!