IDE Controller Errors in Linux on AO486

sajattack

I created a Gentoo Linux image https://nextcloud.paulsajna.com/index.p ... pCCH3cBFMo for a 486 without an FPU. It boots the kernel but fails to mount the rootfs due to some IDE errors. The same image boots fine in QEMU with this command:

qemu-system-i386 -hda gentoo.vhd -cpu 486,-fpu
I'm starting to think there is a bug or omission in the IDE controller for ao486 on MiSTer. Can anyone help me narrow it down?
Attachments: 20220115_180232-screen.png (95.95 KiB), 20220115_182555-screen.png (113.67 KiB)
Bas

This or something similar happens with Windows NT, OS/2 and the BSDs as well. The IDE controller appears to be quite minimal and just good enough for DOS.
user7182

I can't offer any technical advice here, as I'm not familiar with IDE at all. I ran your image and looked at the logs, which might help if you're trying to narrow it down.

Gentoo is sending a READ MULTIPLE command (0xC4), and the IDE controller is returning an abort (seen in your screenshots). This MiSTer log confirms this:

IDE regs:
   io_done:  01
   features: 00
   sec_cnt:  01
   sector:   00
   cylinder: 0000
   head:     00
   drv:      00
   lba:      01
   command:  C4
IDE command: C4 (on 0)
(!) Read multiple is disabled!
That command is supported; you can see it handled here:
https://github.com/MiSTer-devel/Main_Mi ... e.cpp#L887

But it returns an error if .spb isn't set (I'm not sure what that is; this is where knowing something about IDE controllers would help), and that's only set by a SET MULTIPLE command (0xC6). Should Gentoo have sent a SET MULTIPLE command? What does it do?
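
To make the behaviour described above concrete, here is a minimal Python sketch of a drive that aborts READ MULTIPLE until SET MULTIPLE has stored a block size. It is a toy model and not the actual ide.cpp code: the `ToyDrive` class and its method names are made up for illustration, and only the command codes (0xC4, 0xC6) and the `spb` name come from the thread. The status and error bit values assume the usual ATA register layout.

```python
# Toy model of the behaviour described in the post; NOT the real ide.cpp logic.
CMD_READ_MULTIPLE = 0xC4
CMD_SET_MULTIPLE = 0xC6

STATUS_ERR = 0x01   # ERR bit of the ATA status register
ERROR_ABRT = 0x04   # ABRT bit of the ATA error register


class ToyDrive:
    """Hypothetical drive that tracks only the sectors-per-block setting."""

    def __init__(self, default_spb=0):
        # spb = sectors per block; 0 means multiple mode is disabled
        self.spb = default_spb

    def command(self, cmd, sec_cnt=0):
        """Return (status, error) for the two commands discussed here."""
        if cmd == CMD_SET_MULTIPLE:
            # The sector count register carries the requested block size.
            self.spb = sec_cnt
            return (0x50, 0x00)                        # DRDY | DSC, no error
        if cmd == CMD_READ_MULTIPLE:
            if self.spb == 0:
                # Corresponds to the "(!) Read multiple is disabled!" log line.
                return (0x50 | STATUS_ERR, ERROR_ABRT)
            return (0x58, 0x00)                        # DRDY | DSC | DRQ
        raise ValueError(f"command {cmd:#04x} is not modelled")


drive = ToyDrive()
print(drive.command(CMD_READ_MULTIPLE, 1))   # aborted: no SET MULTIPLE yet
print(drive.command(CMD_SET_MULTIPLE, 16))
print(drive.command(CMD_READ_MULTIPLE, 1))   # now accepted
```

In this model the only way out of the abort is either a SET MULTIPLE from the host or a non-zero `default_spb`, which is the question the post raises.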
alexoughton

@sajattack

I'm experimenting with a patch to the IDE code which I'm hoping may solve this issue. I'm trying to download your image though, and the page is not working (gives a "Nextcloud" error). Can you assist with getting access to the image please?

"SPB" is "Sectors Per Block". Some of the documentation I'm reading online implies there should be a default value for this which applies whenever SET MULTIPLE has not been used to override it. My experiment is to set this to the same value as word 59 of IDENTIFY ("multiple sectors"), which is 0x110.
macro

quick google ...

A block, on the other hand, is a group of sectors that the operating system can address (point to). A block might be one sector, or it might be several sectors (2,4,8, or even 16). The bigger the drive, the more sectors that a block will hold.

so if the core returns 1 sector at a time, then I guess it should be set to 1
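
One caveat on terminology: as far as the ATA documentation describes it, the "block" in READ MULTIPLE is the group of sectors the drive transfers per interrupt, which is not the same thing as the filesystem block in the quoted definition. A small Python sketch of how a single read would be chopped up under that reading (the function name is hypothetical):

```python
def split_into_blocks(sector_count, spb):
    """Return the size of each per-interrupt block for one multi-sector read."""
    if spb <= 0:
        raise ValueError("spb must be positive; 0 means multiple mode is disabled")
    blocks = []
    remaining = sector_count
    while remaining > 0:
        n = min(spb, remaining)   # the last block may be shorter
        blocks.append(n)
        remaining -= n
    return blocks


print(split_into_blocks(20, 16))  # [16, 4]
print(split_into_blocks(3, 1))    # [1, 1, 1]
```

With spb set to 1, every sector becomes its own block, so the host and the drive still have to agree that the block size is 1.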
toastboy

alexoughton

Thanks both. The original download link started working again, so I downloaded that successfully.

My patch (set to 0x110) does allow the kernel to get further into the boot process, but it then fails on some other disk related errors. I'm going to keep playing with it and see if I can get any further than this.
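
A note on the 0x110 value itself: assuming the usual ATA layout of IDENTIFY word 59 (bit 8 flags the setting as valid, bits 7:0 hold the current sectors per block), it is a bit field rather than a plain count. The Python sketch below decodes it under that assumption; the function name is made up for illustration.

```python
def decode_identify_word59(word):
    """Split IDENTIFY word 59 into (setting_valid, sectors_per_block)."""
    valid = bool(word & 0x0100)   # bit 8: multiple-sector setting is valid
    sectors = word & 0x00FF       # bits 7:0: current sectors per block
    return valid, sectors


valid, sectors = decode_identify_word59(0x110)
print(valid, sectors)  # True 16
```

Read this way, using 0x110 directly as spb would mean a block of 272 sectors, while the value the drive actually advertises is 16.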

Regarding setting it to "1", yes I've considered that too. Looking in the IDE code though, it does look like multiple-sector reads are supposed to be working, which is why I've started with a derived default rather than just short-circuiting to 1 sector. I will test with "1" next and see if it is a quick way to get further along.

Regarding the log output which @user7182 is showing, where can I find this? Looking around on SSH I'm not seeing anything obvious for a location of log files. Never mind. I got the log through the UART port.

Update: Setting spb to "1", or rewriting the code to do a non-multi read when spb is not set, both actually make things worse (the failure messages from the original report return). The boot process gets furthest along with this set to 0x110. However, shortly afterwards the kernel gets stuck again with the following error:

ata1: lost interrupt (Status 0x58)
Google suggests that if this were a "real" system it means the drive stopped responding to commands. There's nothing in the MiSTer log at the time this is happening, so I may have to add some more debug messages to the code to find out what's happening here. This will have to wait for later though, as I should go and do some "real work" now!
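
The status byte in that message can be decoded by hand. The Python sketch below assumes the classic ATA status register bit names; the table and function are illustrative and not taken from the MiSTer or Linux sources.

```python
# Classic ATA status register bits, highest bit first.
ATA_STATUS_BITS = [
    (0x80, "BSY"),   # drive busy
    (0x40, "DRDY"),  # drive ready
    (0x20, "DF"),    # device fault
    (0x10, "DSC"),   # seek complete
    (0x08, "DRQ"),   # data request: drive wants to transfer data
    (0x04, "CORR"),  # corrected data
    (0x02, "IDX"),   # index
    (0x01, "ERR"),   # error
]


def decode_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for mask, name in ATA_STATUS_BITS if status & mask]


print(decode_status(0x58))  # ['DRDY', 'DSC', 'DRQ']
```

So 0x58 is not an error state: the drive reports ready with DRQ raised. One possible reading, which nothing in the thread confirms, is that the drive believes data is waiting while the kernel is still waiting for an interrupt.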


Update 2: I feel like this is actually going somewhere. Turning on the debug log options I saw some "READ MULTI" happening with a block size of 8. I tried that as the hardset value in the code, and Linux actually mounted the EXT4 file system! It's still throwing other errors relating to READ MULTI and fails to continue booting shortly after that, but there's definitely progress.
alexoughton

Oh my goodness, it works. The answer was 16. Linux has booted and has even gone all the way through a successful fsck.

You can try my compiled version of the main MiSTer binary here: REMOVED. Please now use a build from nightlies or an official release after 2/11/2022.

Make sure you back up the copy on the root of your SD card and then place this copy there instead. All I've done is change line 426 of ide.cpp to set spb to 16 instead of 0. I'll raise this in a formal bug report shortly to see if we can get this properly fixed instead of my hack.
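
Why 16 in particular? The following is only a guess and is not confirmed anywhere in this thread. It assumes a host that trusts the block size advertised in IDENTIFY word 59 (0x110, so 16 if bit 8 is the valid flag and bits 7:0 are the count) and never sends SET MULTIPLE. Under that assumption the drive's internal spb has to match, otherwise the two sides disagree on where each block ends. The Python sketch below compares the block boundaries for the values tried earlier; all names in it are hypothetical.

```python
ADVERTISED_WORD59 = 0x110   # value quoted earlier in the thread


def blocks(sector_count, spb):
    """Sizes of the per-interrupt blocks for one READ MULTIPLE (spb must be > 0)."""
    return [min(spb, sector_count - i) for i in range(0, sector_count, spb)]


def host_spb(word59):
    # Assumption: the host uses bits 7:0 when bit 8 (valid) is set.
    return word59 & 0xFF if word59 & 0x100 else 0


def schedules_agree(drive_spb, sector_count):
    """True if drive and host expect the same block boundaries for this read."""
    expected = blocks(sector_count, host_spb(ADVERTISED_WORD59))
    return blocks(sector_count, drive_spb) == expected


for spb in (1, 8, 16, 0x110):
    ok = all(schedules_agree(spb, n) for n in range(1, 257))
    print(f"spb={spb:#x}: agrees for every read of 1..256 sectors: {ok}")
```

In this model only 16 agrees for every read length. A value of 8 still agrees for reads of up to 8 sectors, and 0x110 for reads of up to 16, which would fit those values getting part of the way through boot, but again this is a sketch under stated assumptions and not a diagnosis.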

Update: Pull Request has been raised here: https://github.com/MiSTer-devel/Main_MiSTer/pull/534.

Note that this does not solve the problem with Windows NT booting. I don't even think that's IDE-related, since I get the "inaccessible boot device" crash even when I'm trying to boot from setup floppies.
wark91

Thank you!
user7182

alexoughton wrote: Thu Feb 10, 2022 11:07 pm

Oh my goodness, it works. The answer was 16. [...]

This is awesome, thanks!

flynnsbit

You might have just fixed OS/2 Warp as well. Last time I tried, it was having difficulty with the IDE drives.
Caldor

flynnsbit wrote: Fri Feb 11, 2022 12:44 pm You might have just fixed OS/2 Warp as well, I know last time I tried it was having difficulty with the IDE drives.
Some of the many errors when installing and running Windows 95 and 98 might also be related to this, but it's hard to say. I think it is a mix of several things, with this core taking a few shortcuts in how it gets IDE and floppy drives to work, and probably other things as well.
alexoughton

You're welcome, everyone! Thanks to everyone else as well for providing the log output to pin down where the problem is. Also thanks to any developer who actually provides log output in their code! That's the only reason we had any hope of fixing Linux, and why fixing NT is much less likely (it just crashes, without many clues as to why).

My observations with this patch so far:

Linux (as provided here): Working
Windows NT 3.51: No change (but I'm not surprised)
Windows 95: No change to the yellow exclamation on the secondary IDE controller (also not very surprising)
OS/2 Warp: Not yet tried, but that sounds like fun. Working on it now...
alexoughton

Well damn. Trying OS/2 Warp, I've discovered that my patch completely breaks ISO mounting. Will have to make some changes.

Update: Actually it's unrelated. I reverted my change and rebuilt from source and I'm still unable to mount ISOs. Moving back to the release version solves the problem. Investigating...

Update 2: Must just be a problem with my build environment. Building from source pulled from the most recent release commit has the same problem, but using the binary downloaded from the repository works without issue. Very strange.
flynnsbit

eComStation 2.2 might have more success than OS/2 Warp (once you fix the ISO booting), as it has some updated drivers for JFS and uses Daniela's Bus Master IDE driver, which I know will at least detect the drives and the controller and use them in ATAPI PIO0 mode. I won't be able to test the VHDs I had set up until this evening.

Last I remember is chkdsk wouldn't run and had a Read error. That then bombed the JFSBoot.
alexoughton

Looks like I'm not going to be able to take the testing any further right now. As per my most recent update above, my ISO issue appears to be related to my build environment and nothing to do with this patch. If someone else has a properly-working build environment and can compile a new build using my patch then I can continue looking at other OSes. I'm also interested to hear how well flynnsbit's eComStation VHDs work.
Caldor

I think there was a recent update to MiSTer Main adding support for Guncon 2, maybe that causes the issue?
alexoughton

Caldor wrote: Fri Feb 11, 2022 2:46 pm I think there was a recent update to MiSTer Main adding support for Guncon 2, maybe that causes the issue?
It doesn't seem so. I pulled down the exact version of the code which built the most recent release. When I compile it myself I have the issue. When I use the official binary of that release the issue is gone. It definitely seems to be something wrong with my builds specifically.
alexoughton

Sorgelig just confirmed that ISO mounting works for him with my patch. Good news!
flynnsbit

Yeah, ISO mounting is fine. Here is a build of it from the auto builder for nightlies: https://github.com/MiSTer-unstable-nigh ... 211_134a39
alexoughton

Thanks! The nightly build is working just fine for me, including booting the Linux image.

I just tried installing and booting OS/2 Warp 4. It does not work. The failure is exactly the same as before this patch, where installation appears to be successful but then the installed system will not boot after the reboot.
thorr

Thanks for working on this! I am wondering if a replacement IDE controller could be plugged in like this one: https://www.latticesemi.com/products/de ... controller or this one: https://techdocs.altium.com/display/FPG ... Controller

My guess is the IDE controller as it is right now is in pretty good shape, but there are other issues in the core that need to be addressed to fix OS/2, DOS/4GW, etc.
Bas

Windows NT 3.51 (tested that as a Win95 contemporary) installer also bombs out with an unusable boot device error. Up next is FreeBSD 2.x.
thorr

An unusable boot device would lead me to believe the way the disk is getting written is wrong. Or it could be a compatibility problem with "IDE mode" in the BIOS. I know that with newer systems, changing between AHCI and Legacy IDE mode will make a system unbootable, and it might be related to that rather than what is on the disk itself. It would be interesting to take the VHD after OS/2 or Windows NT has been installed and try to boot it in another system (an emulator) outside of the MiSTer, to see if it gets further during bootup.
flynnsbit

Forgot about this whole bug report, I knew I had it but I guess I closed it. Maybe some insights for those smarter than me: https://github.com/MiSTer-devel/ao486_MiSTer/issues/25

Sorg:

IDE controller is PIO-only. So no DMA modes.
Also it has some problem, so even windows uses compatibility mode, i.e. BIOS calls to access the disks.
Welcome to improve it.
Actually BOCHS BIOS forces to use IDE in worst (single sector per time) mode.
alexoughton

The strange thing about the NT boot device issue is that it happens regardless of what you try to boot from. I’ve booted from setup floppies and it fails (at which point the boot device is essentially “in memory”), and I’ve performed a floppy-less installation as well, where the setup environment is copied to hard disk. Whichever way I try to boot and whichever version I try (3.51 and 4.0 so far), it fails in exactly the same way. I feel like the error text is a red herring. Rather than the problem actually being with disk access, I think it’s the DLLs which would provide access to it (regardless of actual backing) failing to start. You don't even get this error if you do a floppy boot on a VM with no HDDs.
thorr

I wonder if there are DOS hardware-compatibility testing utilities that could be run to give a list of things to fix. A quick Google search found these:
https://www.pc-doctor.com/solutions/pc-doctor-for-dos (not free) / https://www.pc-doctor.com/images/test_l ... t_List.pdf
https://www.bttr-software.de/freesoft/system.htm
https://www.hwinfo.com/download/
Caldor

I tried installing Windows 98 this weekend and it seemed to have fewer errors when doing so. It still had one crash, though. I will test some more. But I do think some of the issues were IDE-controller related, and a software fix was found where you renamed a file with the IDE driver, I think... but I might be mistaken. It was a "freeze fix". I am also pretty sure there was something about removing floppy drives to avoid the system freezing when opening My Computer, when it would begin checking all disks.
alexoughton

Caldor wrote: Mon Feb 14, 2022 1:28 pm I have tried installing Windows 98 this weekend [...]
I've been going through the BIOS and IDE code this weekend to try and solve some other issues and I see there have been some fixes applied in there in the past related to the floppy drives and Windows 98. That could be why you're seeing fewer issues.
Caldor

alexoughton wrote: Mon Feb 14, 2022 1:30 pm I've been going through the BIOS and IDE code this weekend [...] That could be why you're seeing fewer issues.
That sounds likely. Certainly a lot fewer issues for me. I did do the first half of the installation using 86Box on my PC and then did the next part on the MiSTer. That way I avoided having to use the option to run the setup without FPU and such.