A couple of days ago I migrated from CORE to SCALE, which uses GRUB as its bootloader. Since then, every (re)boot stalls in GRUB for a long time before continuing.
Behavior:
- The system POSTs with no issues, and UEFI finds GRUB promptly every time.
- “Welcome to GRUB!” appears, and nothing else, for 20-40 minutes. During this time my HDDs’ activity lights flash in a seemingly random but repeating pattern.
- Once it’s done sitting, it proceeds to boot and operate normally.
System:
- Supermicro H13SSL-NT with AMD EPYC 9124 (16-core 3GHz)
- 256GB DDR5 ECC JEDEC RAM
- Chelsio T6225-CR 2x100Gb NIC
- Boot: Seagate 512GB M.2 FireCuda 530
- 2 x Broadcom LSI 9500-16i HBAs (fw ver 37.00.00.00)
- AIC BBPHHD40013B backplane (LSI 35X28)
- 60 x Seagate Exos X20, 20TB HDDs (6 x 10-wide raidz2 vdevs)
- 2 x Micron 7450 U.3 3TB SSDs for metadata (special vdev, mirror) (latest fw)
- 1 x 16TB Solidigm D5-P5336 PCIe SSD (it’s a scratch disk for users)
- 4 x AcBel R1CA2801A PSU
I upgraded gradually: from CORE 13.3 to SCALE 24.04.2.5, then to 24.10, then 25.04, and finally 25.10.1. It took two days, and none of those updates changed the boot behavior.
Tried:
- Updating the BIOS to latest (3.8)
- Disabling the OPROM for the LSI HBAs
- Enabling PCI AER support
- Disabling CSM
No change.
TrueNAS GRUB Config + temp workaround:
It took me a while to determine it wasn’t just freezing or stuck in an infinite loop… clearly GRUB initialized, but this was happening before the menu ever appeared.
I found that GRUB’s search command causes the hang.
In the boot disk’s FAT ESP partition (hd62,gpt2 in my case), if I go to EFI/debian/grub.cfg and replace the search command with set root=(hd62,gpt3), the GRUB menu comes up instantly. But then the second half of the boot loading (after the default menu option is picked) hangs in the very same way.
If I go to /boot/grub/grub.cfg and patch that menu’s relevant search line, replacing it with the same set root= line, TrueNAS boots with no delays and no hangs.
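For illustration, the edit has roughly the following shape. This is a sketch, not a copy of the real files: the original search line is shown only as a commented placeholder because its actual arguments differ per install, and (hd62,gpt3) is just how the boot partition happens to enumerate on this system.

```
# EFI/debian/grub.cfg on the ESP (sketch; placeholder values)

# Before: GRUB probes the disks it can see until one matches.
# search ... <identifier of the boot filesystem> ...

# After: point GRUB directly at the boot partition, no probing.
# (hd62,gpt3) is specific to this machine's device enumeration.
set root=(hd62,gpt3)
```

The same one-line substitution is applied to the relevant search line inside the default menu entry in /boot/grub/grub.cfg.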
grub.cfg of course starts with “DO NOT MODIFY THIS FILE”, and sure enough, it gets rewritten (almost?) every boot. On top of that, this workaround would break if the device enumeration changed, or at the next TrueNAS update.
Notes:
- There are many reports from users having the exact same problem since switching to GRUB (SCALE). I’ll link to all the threads below, but none ever had a proper solution. Some reverted to legacy BIOS, someone had thousands of boot entries, and several never found any fix. Interestingly, all of them are from 2021-2023, most involve Supermicro boards, and despite GRUB being a common tool, these posts were all about TrueNAS.
- There was an iX bug opened, NAS-121676, saying “master is broken” in 2023... and then it was closed?
- ls -l, which uses an undocumented GRUB argument, lists the partitions inside each disk including the ZFS label, and it takes forever: ~70 seconds per disk. I’m curious whether it runs fast on non-affected systems.
- Is there a way to gain more visibility into what’s going on during the GRUB search?
Other posts:
https://ixsystems.atlassian.net/browse/NAS-121676
https://www.truenas.com/community/threads/stuck-on-welcome-to-grub.105795
https://www.truenas.com/community/threads/stuck-on-welcome-to-grub.105629/
https://www.truenas.com/community/threads/truenas-scale-grub-usb.95102/
https://www.reddit.com/r/truenas/comments/11a5x62/grub_impossibly_slow/