SCALE new install on Supermicro 4028GR-TRT installs fine, doesn't boot

I’ve got a new (to me) refurb Supermicro X9DRH-ITF. I downloaded the latest SCALE installer and made a boot USB using Rufus.

Hardware:
-Supermicro X9DRH-ITF in a 4U rack case, 2@1400W PSU, 2@Xeon E5-2690v2 10-core @ 3GHz, 256GB ECC DDR3 RAM running @ 1600MHz
All components are built in:
–LSI 9210-8i JBOD SAS2 (it’s a PCIe card as it turns out, but it has the metal built-in cable going to the risers)
–BPN-SAS2-846EL1 24-port 4U expander backplane
–BPN-SAS2-826EL1 12-port 2U expander backplane
–Intel X540, 2@10GbE
–Video VGA
–Motherboard SATA ports empty, SATA Boot set “Disabled” in BIOS
–System BIOS 2.15.xx version
–System “Max Int13 Device” set for 2, boot device(s) appear in boot menu
–Boot drives set for Boot (and Alternate Boot) in LSI BIOS using ALT-B

I have tried 2 configurations for boot drives:
-2@Crucial SATA3 SSD 480GB (not SAS)
-1@MDD SAS3 12Gb/s 16TB
–I have also tried “override boot” to the HDD from the system BIOS

The system boots up fine, finds the USB, and boots the SCALE installer. I can pick the install drive and the alternate (if installed) from the list without trouble, and SCALE then installs with no errors. Whether or not I select “add 16GB swap drive on boot drive,” the results are the same, and I’ve tried both UEFI and BIOS boot with no difference. I watched the disk partition create/destroy lines in the on-screen messages, and they show no errors other than “cannot find partition SWAP”: the install script creates partitions p1-p5 on da0, gives me the error “cannot find SWAP,” destroys da0, then creates the partitions again with no visible errors. Setup also “set da0” as the active drive with no visible error message.

After installation, I’m asked to reboot, which I do (removing the installation USB), and then the system runs through its POST with no errors, beeps like it’s booted (sits on AMI “B2” POST code for a minute), clears the screen, and gives me a blinking cursor in the upper left of the screen. That’s the last thing it does. No errors, no messages, no logs, no blinking of the HDD carrier, no activity on the drive. Hitting CTRL-ALT-DEL will reboot the system, so it’s responsive. Has anyone else experienced this?

TIA
Fred

Make sure that your BIOS is set to UEFI-Boot-Only, and see if that makes the install work.

I’ll try that, thanks @Protopia

Tried this - didn’t work. The system gave me an “insert boot disk or select valid boot option” message. It seems the config I set up previously is the only combination that doesn’t give me a boot fail error. It still won’t boot, though.

Will be trying a standard Linux and a Windows install to see if one of those boots.

Specifically, try Debian 12; it’s closest to TrueNAS.

Will do, thank you

You will need to create your installation flash drive in UEFI mode too.
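One way to confirm which mode the installer stick (or a live Debian stick) actually booted in is to look for the EFI firmware directory that the Linux kernel exposes only after a UEFI boot. A minimal sketch, assuming a shell with Python 3 is available in that environment; `boot_mode` and its `sysfs_root` parameter are hypothetical names used for illustration:

```python
from pathlib import Path


def boot_mode(sysfs_root: str = "/sys") -> str:
    """Report how the running Linux system was booted.

    The kernel creates <sysfs_root>/firmware/efi only when it was started
    by UEFI firmware, so its absence implies a legacy BIOS (CSM) boot.
    """
    efi_dir = Path(sysfs_root) / "firmware" / "efi"
    return "UEFI" if efi_dir.is_dir() else "Legacy BIOS"


if __name__ == "__main__":
    print(f"This session was booted in {boot_mode()} mode")
```

If this reports Legacy BIOS while the BIOS is set to UEFI only, the stick was probably written in, or started through, the legacy path.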

I thought it was working - after installing yesterday, it booted to the SSD as the only drive in the system. Today, I put in a 2nd SSD as the first step toward populating the system, and…no boot! I removed the SSD, still no boot. It’s back to staring at me with the blinking cursor.

I even reconfigured the LSI BIOS with “boot” and “alt boot,” set the boot list in the system BIOS under “HDD boot priority” as both SSD, reinstalled SCALE onto both drives as a redundant pair, and still nothing. No matter what I do with the boot options (EFI/no EFI, legacy/no legacy, BIOS/OS, etc) nothing works. It’s back to “insert a boot device” or blinking cursor.

I am considering updating the BIOS, but the model number of the motherboard claimed by the company that sold it to me is wrong, so I have work to do with them first to resolve what exactly they sold me. Unfortunately…to be continued…

You can look on the motherboard itself for the model.

In certain cases the Supermicro BIOS polls the drives on the board and on other backplanes/devices at startup and populates a drive table with every drive it finds. The table is not always in the same order, especially if new drives turn up during the poll, and the first drive in the table is the one it tries to boot from. If you added a drive, it may now be first, pushing the boot drive out of the first slot, and boot will fail. This drove me crazy with an older Supermicro board until I figured out what the issue was. The fix is as simple as looking at the boot order in the BIOS and setting your boot drive as the first drive again.

When fooling around with the BIOS while troubleshooting, I have found it helps to initially pull all drives but the boot drive, so the BIOS doesn’t screw up the boot order, and to write down the serial number of the boot drive. It may come in handy later. A boot mirror can also be problematic to get running again after failure/replacement of one drive in the mirror, for various reasons. Since it is now so easy to reinstall SCALE from scratch, my general suggestion is not to use a mirror for the boot drive and instead keep a good copy of the configuration in a safe place to restore after a reinstall. Not using a mirror also helps when sorting out boot issues, and a boot mirror can be created later if desired.
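For recording the boot drive's serial number, something like the following could be run from a live Linux session. It is only a sketch: it assumes `lsblk` from util-linux is installed, and the helper names `parse_disks` and `list_disks` are made up for this example.

```python
import json
import subprocess


def parse_disks(lsblk_json: str) -> list[dict]:
    """Pull name, serial, transport and size of whole disks out of `lsblk --json` output."""
    devices = json.loads(lsblk_json).get("blockdevices", [])
    return [
        {key: dev.get(key) for key in ("name", "serial", "tran", "size")}
        for dev in devices
        if dev.get("type") == "disk"  # skip optical drives, loop devices, etc.
    ]


def list_disks() -> list[dict]:
    """Run lsblk on the local machine and return one record per whole disk."""
    result = subprocess.run(
        ["lsblk", "--json", "--nodeps", "-o", "NAME,SERIAL,TRAN,SIZE,TYPE"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_disks(result.stdout)
```

Calling `list_disks()` and noting the `serial` of the boot SSD makes it easier to spot that drive in the BIOS boot list after the drive table has been reshuffled.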

Fooling with option ROMs, setting PCIe cards as boot devices, and so on can really screw up the BIOS and cause a failure to boot, or confuse a person as to what the BIOS is trying to boot from.

All BIOS systems have some sort of safe settings option in the save screen that provides a basic default BIOS setup that almost all systems can boot with. I would reset and start there. Boot into a live Debian and see if Debian can operate successfully with the system. If it can, Scale should too.

Doesn’t UEFI boot look across all drives to find those with EFI system partitions, i.e. boot drives? Would this mitigate this issue?
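One way to see which drives carry an EFI System Partition is to list partition type GUIDs from a live Linux session. A minimal sketch, assuming `lsblk` from util-linux is available; `find_esps` and `scan_local_disks` are hypothetical helper names. Note that this shows what the operating system sees, which is not necessarily what the firmware sees: drives behind an HBA are only visible to UEFI if the firmware has loaded a driver for that card.

```python
import json
import subprocess

# GPT partition type GUID that marks an EFI System Partition.
ESP_GUID = "c12a7328-f81f-11d2-ba4b-00a0c93ec93b"


def find_esps(lsblk_json: str) -> list[str]:
    """Return the names of all partitions whose type GUID is the ESP GUID."""
    found: list[str] = []

    def walk(nodes: list[dict]) -> None:
        for node in nodes:
            # parttype is null for whole disks, hence the `or ""`.
            if (node.get("parttype") or "").lower() == ESP_GUID:
                found.append(node["name"])
            walk(node.get("children", []))

    walk(json.loads(lsblk_json).get("blockdevices", []))
    return found


def scan_local_disks() -> list[str]:
    """Run lsblk on the local machine and report its EFI System Partitions."""
    result = subprocess.run(
        ["lsblk", "--json", "-o", "NAME,PARTTYPE"],
        check=True,
        capture_output=True,
        text=True,
    )
    return find_esps(result.stdout)
```

If `scan_local_disks()` lists an ESP on the installed drive but the firmware still offers no boot entry for it, that points at the firmware not seeing the drive rather than at a bad install.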

I think I agree with that - it does seem that the BIOS scrambles itself on reboot. I’ve noticed that UEFI settings change when I didn’t alter them, and so forth. However, no matter what I set, the system will boot fine from a USB, but not from a SAS drive. This particular board doesn’t have an onboard LSI, which makes it worse: it has to hand off boot duties to an option ROM. (That’s the -ITF variant; the IF and 7F variants do have an onboard LSI.) It also has an outdated BIOS version (2.1.5 vs 3.3), which I’m sure does not help.

SOLVED? (workaround):

While studying the manual, I noticed there are 10 SATA ports, which I’d disabled before on the assumption that I could boot off HBA. Most of them, on this version of the board, are SATA2, but there are 2 SATA-3s. Going into the system BIOS, I set SATA to AHCI mode, which lit up some options, but I noted on the BIOS screen that SATA ports all still showed “DISABLED” and grayed out - didn’t look promising.
Connecting power for the SATA isn’t easy for me, as I have big hands, and the nearest MOLEX was connected on the bottom of the nearby SAS expander backplane, 4u down with no easy disassembly other than to remove individual fans. I had to install a splitter, and a MOLEX to SATA adapter (which I had to order, and that took a week, thus the delay in resolving this). The usual “MOLEX pin isn’t aligned” didn’t help either.
After threading this through the double row of fans, I put the SATA drive I’d previously loaded with TrueNAS SCALE (from the SAS bays) into ISATA0 (the white SATA port closest to the front bays and toward the board edge), plugged in the SCALE USB, and prepared to reformat and reinstall.
Well, how surprised was I that, on first power up, the drive just booted with the old load of SCALE. It even complained that my mirror drive was missing (I only installed one, I wasn’t expecting it to be this easy). It apparently never even noticed that it was on a SATA, not an LSI SAS.
The only issue so far is that there aren’t any drive bays in that case that aren’t SAS expanders, so the drive is currently floating on a piece of antistatic plastic. I’ll need to figure out a mount for it to keep from rattling around.
Next step, install SAS drives while hoping nothing gets confused about the boot.