Proxmox 8.2.2 + TrueNAS SCALE 24.xx + SATA passthrough = fail

Host hardware
5800X
64 GB RAM
ASUS Strix X570-E motherboard
proper IOMMU groups, no ACS hackery
2 onboard SATA controllers bound to vfio and passed through to TrueNAS

I've been running TrueNAS CORE (v12/v13) as a VM on this system for several years now with no issues. I decided to give SCALE a look, so I installed it from the ISO and then restored my existing config file.

It runs … if it boots. The issue I'm running into is that it's losing the passed-through SATA controllers during a reboot (or shutdown). It's as though TN SCALE is leaving the controllers in an odd state. On the next boot it can't find the SATA controllers, so it hangs. When this happens, even the installer ISO does not recognize the controllers (or the disks attached to them). I've tried various combinations of the PCI options for the passthrough devices, with no change. This board has a total of 8 onboard SATA ports, all of which are passed to the TN VM.

I never experienced this issue with CORE, and that VM has been rebooted many times. To clear the condition above, I have to reboot the host itself. For now I've reverted to CORE, but I'm interested in figuring this out.

Thoughts/suggestions? Thanks


SATA1 = 07:00.0
SATA2 = 08:00.0
The other 2 devices are NICs.

The issue definitely has something to do with the state of the SATA controllers during or following the shutdown of the TN SCALE VM.
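One way to look at that state from the host side is to check, right after the VM stops or reboots, whether each controller is still enumerated and which driver it is bound to. A minimal sketch, assuming the two controller addresses from this post written in full sysfs form; `pci_driver` is a made-up helper name, and `SYSFS_PCI` is only overridable so the logic can be tried against a scratch directory:

```shell
#!/bin/sh
# Report the driver binding of the passed-through SATA controllers.
# SYSFS_PCI defaults to the real sysfs PCI tree (override is for dry runs only).
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

# Hypothetical helper: prints the bound driver name, "unbound", or "missing".
pci_driver() {
    dev="$1"
    if [ ! -d "$SYSFS_PCI/devices/$dev" ]; then
        # The device is not enumerated on the bus at all.
        echo "missing"
    elif [ -L "$SYSFS_PCI/devices/$dev/driver" ]; then
        # The "driver" symlink points at the driver currently bound to the device.
        basename "$(readlink "$SYSFS_PCI/devices/$dev/driver")"
    else
        echo "unbound"
    fi
}

for dev in 0000:07:00.0 0000:08:00.0; do
    echo "$dev: $(pci_driver "$dev")"
done
```

Comparing the output after a clean shutdown/start with the output after a failed reboot might show whether the controllers drop off the bus or just end up in a bad state while still bound.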

I was able to mitigate the issue by using a hookscript in Proxmox to effectively reset the controllers before a boot and right after a shutdown. The problem is that the script only runs for those 2 events. It does not get triggered if a "reboot" is selected, whether from within the VM or from Proxmox.

# Detach both SATA controllers from the PCI bus.
system("echo 1 > '/sys/bus/pci/devices/0000:07:00.0/remove'");
system("echo 1 > '/sys/bus/pci/devices/0000:08:00.0/remove'");
sleep(1);

# Rescan the bus so both controllers are detected again.
system("echo 1 > '/sys/bus/pci/rescan'");
sleep(4);

The above code goes in the pre-start and post-stop phases of the hookscript. Really only one is necessary, probably the pre-start one, since that creates a viable environment before the VM launches. The post-stop one is there to make those PCIe devices available to another VM.
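For anyone who wants the whole thing in one file, here is a rough sketch of the same remove/rescan logic as a standalone shell hookscript. It assumes Proxmox calls the script with the VM ID as the first argument and the phase name as the second, with the two triggers described above named `pre-start` and `post-stop` (as in Proxmox's example hookscript, as far as I know). The function names and the `SYSFS_PCI` override are my own additions, not anything Proxmox provides:

```shell
#!/bin/sh
# Sketch of a Proxmox hookscript; assumed call form: <script> <vmid> <phase>
# SYSFS_PCI is overridable only so the logic can be dry-run against a scratch directory.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci}"

# The two passed-through SATA controllers from this post.
SATA_DEVICES="0000:07:00.0 0000:08:00.0"

reset_sata_controllers() {
    for dev in $SATA_DEVICES; do
        # Detach the controller from the PCI bus if it is currently present.
        if [ -e "$SYSFS_PCI/devices/$dev/remove" ]; then
            echo 1 > "$SYSFS_PCI/devices/$dev/remove"
        fi
    done
    sleep 1
    # Rescan the bus so both controllers are detected again.
    echo 1 > "$SYSFS_PCI/rescan"
    sleep 4
}

handle_phase() {
    case "$1" in
        pre-start|post-stop) reset_sata_controllers ;;
        *) : ;;  # any other phase: nothing to do
    esac
}

# $1 = VM ID (unused here), $2 = phase
handle_phase "${2:-}"
```

The script has to be executable and live on a storage that allows snippets. It would then be attached with something like `qm set <vmid> --hookscript local:snippets/sata-reset.sh`, where `sata-reset.sh` is just a placeholder file name. This still does not cover a guest-initiated reboot, since no hook phase fires for that.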

Credit goes to this thread: "[SOLVED] - Passthrough of onboard SATA controller locks up system" on the Proxmox Support Forum.

I did test with just the reset function, but whatever TN SCALE is doing, that's not enough. The remove/rescan sequence does reset the controllers to the point where subsequent TN SCALE reboots are successful. It's also interesting to note that the problem seems to happen during "reboots", not shutdown/start cycles.

Any suggestions for some kind of hookscript within the TN SCALE VM, so that just before it reboots it resets the controllers differently?