IDE Controller Errors in Linux on AO486
Posted: Mon Jan 17, 2022 2:43 am
by sajattack
I created a Gentoo Linux image
https://nextcloud.paulsajna.com/index.p ... pCCH3cBFMo for a 486 without an FPU. It boots the kernel but fails to mount the rootfs due to some IDE errors. The same image boots fine in QEMU with this command:
Code: Select all
qemu-system-i386 -hda gentoo.vhd -cpu 486,-fpu
I'm starting to think there is a bug or omission in the IDE controller for ao486 on MiSTer. Can anyone help me narrow it down?
Attachments: 20220115_180232-screen.png (95.95 KiB), 20220115_182555-screen.png (113.67 KiB)
Re: IDE Controller Errors in Linux on AO486
Posted: Mon Jan 17, 2022 6:06 am
by Bas
This or something similar happens with Windows NT, OS/2 and the BSDs as well. The IDE controller appears to be quite minimal and just good enough for DOS.
Re: IDE Controller Errors in Linux on AO486
Posted: Tue Jan 25, 2022 12:41 am
by user7182
I can't offer any technical advice here, as I'm not familiar with IDE at all. I ran your image and looked at the logs, which might help if you're trying to narrow it down.
Gentoo is sending a READ MULTIPLE command (0xC4), and the IDE controller is returning an abort (seen in your screen shots). This MiSTer log confirms this:
Code: Select all
IDE regs:
io_done: 01
features: 00
sec_cnt: 01
sector: 00
cylinder: 0000
head: 00
drv: 00
lba: 01
command: C4
IDE command: C4 (on 0)
(!) Read multiple is disabled!
That command is supported, you can see it handled:
https://github.com/MiSTer-devel/Main_Mi ... e.cpp#L887
But it returns an error if .spb isn't set (not sure what that is; this is where knowing something about IDE controllers would help), and that's only set on a SET MULTIPLE command (0xC6). Should Gentoo have sent a SET MULTIPLE command? What does it do?
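To make the behaviour described above concrete, here is a minimal sketch in Python. It is only a model of what the log suggests, not code from ide.cpp: the class and method names are hypothetical, and the only things taken from the thread are the two command codes and the idea that READ MULTIPLE aborts while spb is unset.

```python
# Hypothetical model of the behaviour described in this post.
# Names are illustrative and are NOT taken from ide.cpp.
CMD_READ_MULTIPLE = 0xC4
CMD_SET_MULTIPLE = 0xC6


class Drive:
    def __init__(self, default_spb=0):
        # spb = sectors per block; 0 models "Read multiple is disabled"
        self.spb = default_spb

    def command(self, cmd, sec_cnt=0):
        """Return 'ok' or 'abort' for the two commands discussed here."""
        if cmd == CMD_SET_MULTIPLE:
            # SET MULTIPLE MODE passes the block size in the sector count register
            self.spb = sec_cnt
            return "ok"
        if cmd == CMD_READ_MULTIPLE:
            # Without a block size there is nothing to transfer per interrupt
            return "ok" if self.spb else "abort"
        return "abort"  # every other command is outside this sketch


drive = Drive()
print(drive.command(CMD_READ_MULTIPLE, sec_cnt=1))  # abort
drive.command(CMD_SET_MULTIPLE, sec_cnt=16)
print(drive.command(CMD_READ_MULTIPLE, sec_cnt=1))  # ok
```

In this model a guest that issues 0xC4 without first issuing 0xC6 always gets an abort, which matches the log above.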
Re: IDE Controller Errors in Linux on AO486
Posted: Wed Feb 09, 2022 8:04 pm
by alexoughton
@sajattack
I'm experimenting with a patch to the IDE code which I'm hoping may solve this issue. I'm trying to download your image though, and the page is not working (gives a "Nextcloud" error). Can you assist with getting access to the image please?
"SPB" is "Sectors Per Block". Some of the documentation I'm reading online implies there should be a default value for this which applies whenever SET MULTIPLE has not been used to override it. My experiment is to set this to the same value as word 59 of IDENTIFY ("multiple sectors"), which is 0x110.
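A note on that value: as far as I understand the ATA specification, IDENTIFY word 59 is a packed field, with bit 8 meaning "the multiple sector setting is valid" and bits 7:0 holding the current sectors-per-block count. The sketch below decodes 0x110 under that reading. The function name is made up for illustration.

```python
def decode_identify_word59(word):
    """Split IDENTIFY word 59 into (setting_valid, sectors_per_block).

    Assumes the ATA layout: bit 8 = setting valid, bits 7:0 = current count.
    """
    setting_valid = bool(word & 0x0100)
    sectors_per_block = word & 0x00FF
    return setting_valid, sectors_per_block


print(decode_identify_word59(0x110))  # (True, 16)
```

If that reading is right, using the raw 0x110 as a sector count would mean 272 sectors per block, while the low byte on its own gives 16.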
Re: IDE Controller Errors in Linux on AO486
Posted: Thu Feb 10, 2022 9:54 am
by macro
quick google ...
A block, on the other hand, is a group of sectors that the operating system can address (point to). A block might be one sector, or it might be several sectors (2,4,8, or even 16). The bigger the drive, the more sectors that a block will hold.
so if the core returns 1 sector at a time, then I guess it should be set to 1
Re: IDE Controller Errors in Linux on AO486
Posted: Thu Feb 10, 2022 9:58 am
by toastboy
Re: IDE Controller Errors in Linux on AO486
Posted: Thu Feb 10, 2022 1:38 pm
by alexoughton
Thanks both. The original download link started working again, so I downloaded that successfully.
My patch (set to 0x110) does allow the kernel to get further into the boot process, but it then fails on some other disk related errors. I'm going to keep playing with it and see if I can get any further than this.
Regarding setting it to "1", yes I've considered that too. Looking in the IDE code though, it does look like multiple-sector reads are supposed to be working, which is why I've started with a derived default rather than just short-circuiting to 1 sector. I will test with "1" next and see if it is a quick way to get further along.
Regarding the log output which @user7182 is showing, where can I find this? Looking around on SSH I'm not seeing anything obvious for a location of log files. Never mind. I got the log through the UART port.
Update: Setting spb to "1", or rewriting the code to just do a non-multi read if spb is not set, both actually make things worse (the failure messages from the original report return). The boot process gets furthest along with this set to 0x110. However, shortly afterwards the kernel gets stuck again with the following error:
Code: Select all
ata1: lost interrupt (Status 0x58)
Google suggests that if this were a "real" system it means the drive stopped responding to commands. There's nothing in the MiSTer log at the time this is happening, so I may have to add some more debug messages to the code to find out what's happening here. This will have to wait for later though, as I should go and do some "real work" now!
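For reference, the status byte in that message can be broken into the standard ATA status register bits. This is a small sketch with a hypothetical helper name. The bit names follow the classic ATA layout.

```python
# Classic ATA status register bits, most significant first.
STATUS_BITS = [
    (0x80, "BSY"),   # device busy
    (0x40, "DRDY"),  # device ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request: device wants to transfer data
    (0x04, "CORR"),  # corrected data (obsolete)
    (0x02, "IDX"),   # index (obsolete)
    (0x01, "ERR"),   # error
]


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


print(decode_status(0x58))  # ['DRDY', 'DSC', 'DRQ']
```

So 0x58 reads as "ready, not busy, no error, DRQ set". One plausible interpretation is that the drive side thinks it has data waiting to be transferred while the kernel is still waiting for an interrupt that never arrived, though that is a guess rather than something the log confirms.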
Update 2: I feel like this is actually going somewhere. Turning on the debug log options I saw some "READ MULTI" happening with a block size of 8. I tried that as the hard-coded value in the code, and Linux actually mounted the EXT4 file system! It's still throwing other errors relating to READ MULTI and fails to continue booting shortly after that, but there's definitely progress.
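To illustrate why the block size matters here: with READ MULTIPLE the device transfers data in blocks of spb sectors, raising one interrupt per block, with a shorter final block if the request is not a multiple of spb. The sketch below shows that split. It is a generic illustration, not the logic in ide.cpp, and the function name is hypothetical.

```python
def split_into_blocks(sector_count, spb):
    """Return the per-interrupt block sizes for a READ MULTIPLE request."""
    if spb <= 0:
        raise ValueError("READ MULTIPLE needs a non-zero sectors-per-block value")
    blocks = []
    remaining = sector_count
    while remaining > 0:
        n = min(spb, remaining)  # the last block may be a partial one
        blocks.append(n)
        remaining -= n
    return blocks


print(split_into_blocks(20, 8))   # [8, 8, 4]
print(split_into_blocks(20, 16))  # [16, 4]
```

If the host and the device disagree on spb, they disagree on how many sectors arrive before each interrupt, which could explain a transfer that gets part of the way and then stalls.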
Re: IDE Controller Errors in Linux on AO486
Posted: Thu Feb 10, 2022 11:07 pm
by alexoughton
Oh my goodness, it works. The answer was 16. Linux has booted and has even gone all the way through a successful fsck.
You can try my compiled version of the main MiSTer binary here: REMOVED. Please now use a build from nightlies or an official release after 2/11/2022.
Make sure you back up the copy on the root of your SD card and then place this copy there instead. All I've done is change line 426 of ide.cpp to set spb to 16 instead of 0. I'll raise this in a formal bug report shortly to see if we can get this properly fixed instead of my hack.
Update: Pull Request has been raised here:
https://github.com/MiSTer-devel/Main_MiSTer/pull/534.
Note that this does not solve the problem with Windows NT booting. I don't even think that's IDE-related, since I get the "inaccessible boot device" crash even when I'm trying to boot from setup floppies.
Re: IDE Controller Errors in Linux on AO486
Posted: Thu Feb 10, 2022 11:45 pm
by wark91
Thank you !
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 2:21 am
by user7182
alexoughton wrote: ↑Thu Feb 10, 2022 11:07 pm
Oh my goodness, it works. The answer was 16. Linux has booted and has even gone all the way through a successful fsck.
Make sure you back-up the copy on the root of your SD card and then place this copy there instead. All I've done is change line 426 of ide.cpp to set spb to 16 instead of 0. I'll raise this in a formal bug report shortly to see if we can get this properly fixed instead of my hack.
Update: Pull Request has been raised here: https://github.com/MiSTer-devel/Main_MiSTer/pull/534.
Note that this does not solve the problem with Windows NT booting. I don't even think that's IDE-related, since I get the "inaccessible boot device" crash even when I'm trying to boot from setup floppies.
This is awesome, thanks!
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 12:44 pm
by flynnsbit
You might have just fixed OS/2 Warp as well, I know last time I tried it was having difficulty with the IDE drives.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 1:03 pm
by Caldor
flynnsbit wrote: ↑Fri Feb 11, 2022 12:44 pm
You might have just fixed OS/2 Warp as well, I know last time I tried it was having difficulty with the IDE drives.
Some of the many errors installing and running Windows 95 and 98 might also be related to this, but hard to say. I think it is a mix of several things with this core taking a few shortcuts on how to get IDE and floppy drives to work and probably other things as well.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 1:18 pm
by alexoughton
You're welcome, everyone! Thanks to everyone else as well for providing the log output to pin down where the problem is. Also thanks to any developer who actually provides log output in their code! That's the only reason we had any hope of fixing Linux, and why fixing NT is much less likely (it just crashes, without many clues as to why).
My observations with this patch so far:
Linux (as provided here): Working
Windows NT 3.51: No change (but I'm not surprised)
Windows 95: No change to the yellow exclamation on the secondary IDE controller (also not very surprising)
OS/2 Warp: Not yet tried, but that sounds like fun. Working on it now...
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 1:34 pm
by alexoughton
Well damn. Trying OS/2 Warp, I've discovered that my patch completely breaks ISO mounting. Will have to make some changes.
Update: Actually it's unrelated. I reverted my change and rebuilt from source and I'm still unable to mount ISOs. Moving back to the release version solves the problem. Investigating...
Update 2: Must just be a problem with my build environment. Building from source pulled from the most recent release commit has the same problem, but using the binary downloaded from the repository works without issue. Very strange.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 1:57 pm
by flynnsbit
eComStation 2.2 might have more success over OS/2 Warped (once you fix the ISO booting), as it has some updated drivers for JFS and uses Daniela's Bus Master IDE driver, which I know will at least detect the drives and the controller and use it in ATAPI PIO0 mode. I won't be able to test the VHDs I had set up until this evening.
The last thing I remember is that chkdsk wouldn't run and reported a read error. That then bombed the JFSBoot.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 2:27 pm
by alexoughton
Looks like I'm not going to be able to take the testing any further right now. As per my most recent update above, my ISO issue appears to be related to my build environment and nothing to do with this patch. If someone else has a properly-working build environment and can compile a new build using my patch then I can continue looking at other OSes. I'm also interested to hear how well flynnsbit's eComStation VHDs work.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 2:46 pm
by Caldor
I think there was a recent update to MiSTer Main adding support for Guncon 2, maybe that causes the issue?
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 2:47 pm
by alexoughton
Caldor wrote: ↑Fri Feb 11, 2022 2:46 pm
I think there was a recent update to MiSTer Main adding support for Guncon 2, maybe that causes the issue?
It doesn't seem so. I pulled down the exact version of the code which built the most recent release. When I compile it myself I have the issue. When I use the official binary of that release the issue is gone. It definitely seems to be something wrong with my builds specifically.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 3:09 pm
by alexoughton
Sorgelig just confirmed that ISO mounting works for him with my patch. Good news!
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 3:38 pm
by flynnsbit
Yeah, ISO mounting is fine. Here is a build of it from the auto builder for nightlies:
https://github.com/MiSTer-unstable-nigh ... 211_134a39
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 4:25 pm
by alexoughton
Thanks! The nightly build is working just fine for me, including booting the Linux image.
I just tried installing and booting OS/2 Warp 4. It does not work. The failure is exactly the same as before this patch, where installation appears to be successful but then the installed system will not boot after the reboot.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 8:13 pm
by thorr
Thanks for working on this! I am wondering if a replacement IDE controller could be plugged in like this one:
https://www.latticesemi.com/products/de ... controller or this one:
https://techdocs.altium.com/display/FPG ... Controller
My guess is that the IDE controller as it is right now is in pretty good shape, but there are other issues in the core that need to be addressed to fix OS/2, DOS/4GW, etc.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 9:24 pm
by Bas
Windows NT 3.51 (tested that as a Win95 contemporary) installer also bombs out with an unusable boot device error. Up next is FreeBSD 2.x.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 10:06 pm
by thorr
An unusable boot device would lead me to believe the way the disk is getting written is wrong. Or it could be a compatibility problem with "IDE mode" in the BIOS. I know that with newer systems, changing between AHCI and Legacy IDE mode will make a system unbootable, and it might be related to that rather than what is on the disk itself. It would be interesting to try to boot the VHD after OS/2 or Windows NT is installed inside another system (emulator) outside of the MiSTer and see if it can get further during bootup.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 10:09 pm
by flynnsbit
Forgot about this whole bug report, I knew I had it but I guess I closed it. Maybe some insights for those smarter than me:
https://github.com/MiSTer-devel/ao486_MiSTer/issues/25
Sorg:
Code: Select all
IDE controller is PIO-only. So no DMA modes.
Also it has some problem, so even windows uses compatibility mode, i.e. BIOS calls to access the disks.
Welcome to improve it.
Actually BOCHS BIOS forces to use IDE in worst (single sector per time) mode.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 10:11 pm
by alexoughton
The strange thing about the NT boot device issue is that it happens regardless of what you try to boot from. I’ve booted from setup floppies and it fails (at which point the boot device is essentially “in memory”), and I’ve performed a floppy-less installation as well where the setup environment is copied to hard disk. Whichever way I try to boot and whichever version I try (3.51 and 4.0 so far), it fails in exactly the same way. I feel like the error text is a red herring. Rather than the problem actually being with disk access, I think it’s the DLLs which would provide access to it (regardless of actual backing) failing to start. You don't even get this error if you do a floppy boot on a VM with no HDDs.
Re: IDE Controller Errors in Linux on AO486
Posted: Fri Feb 11, 2022 10:38 pm
by thorr
Re: IDE Controller Errors in Linux on AO486
Posted: Mon Feb 14, 2022 1:28 pm
by Caldor
I have tried installing Windows 98 this weekend and it seemed to have less errors when doing so. Still had one crash though. Will test some more. But I do think some of the issues were IDE controller related and a software fix was found where you renamed a file with the IDE driver I think... but I might be mistaken. It was a "freeze fix". I am also pretty sure that there was something about removing floppy drives to avoid the system freezing when opening My Computer and it would begin checking all disks.
Re: IDE Controller Errors in Linux on AO486
Posted: Mon Feb 14, 2022 1:30 pm
by alexoughton
Caldor wrote: ↑Mon Feb 14, 2022 1:28 pm
I have tried installing Windows 98 this weekend and it seemed to have less errors when doing so. Still had one crash though. Will test some more. But I do think some of the issues were IDE controller related and a software fix was found where you renamed a file with the IDE driver I think... but I might be mistaken. It was a "freeze fix". I am also pretty sure that there was something about removing floppy drives to avoid the system freezing when opening My Computer and it would begin checking all disks.
I've been going through the BIOS and IDE code this weekend to try and solve some other issues and I see there have been some fixes applied in there in the past related to the floppy drives and Windows 98. That could be why you're seeing fewer issues.
Re: IDE Controller Errors in Linux on AO486
Posted: Mon Feb 14, 2022 3:10 pm
by Caldor
alexoughton wrote: ↑Mon Feb 14, 2022 1:30 pm
I've been going through the BIOS and IDE code this weekend to try and solve some other issues and I see there have been some fixes applied in there in the past related to the floppy drives and Windows 98. That could be why you're seeing fewer issues.
That sounds likely. Certainly a lot less issues for me. I did do the first half of the installation using 86Box on my PC and then did the next part on the MiSTer. Then I avoided having to use the option to run the setup without FPU and such.