TrueNAS install on Proxmox with NVMe drive passthrough

Hi there,
I plan to run a TrueNAS file server as a VM on a PVE host in an Active Directory environment. The machine is an HP DL380 Gen10 server with an NVMe riser card installed, already hosting 3x U.2 drives as storage for the VMs.

The remaining 5 I want to pass through via PCIe to the TrueNAS VM, which seems easier than expected: no nasty RAID controller interfering, and clean passthrough is theoretically possible as each drive sits in its own IOMMU group.

Before I go shopping for those drives (Micron 7400 PRO 3.84TB U.3): did I miss anything? I just hope that when passing them through, they get "automatically" hidden from the PVE host, because otherwise I don't see how to do it driver-wise. Edit: just found this in case of such issues:

If you do early binding to vfio-pci, you don’t have to blacklist drivers. Just make sure vfio-pci loads first using a softdep.
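If I understand that correctly, the early binding boils down to a modprobe.d snippet on the PVE host, roughly like this (the ID is just a placeholder; the real one comes from lspci -nn, and since binding is per vendor:device ID it can't tell apart drives of an identical model):

# /etc/modprobe.d/vfio.conf (sketch, placeholder ID)
options vfio-pci ids=1234:5678
# make sure vfio-pci is loaded before the nvme driver can grab the drives
softdep nvme pre: vfio-pci

plus update-initramfs -u and a reboot.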

Thanks for your input!

Hello,
maybe someone could just let me know whether this project is either very uncommon or very unproblematic (or both). :wink:

Or does anyone have a clue how I would check that the PCIe devices/drives are unavailable to the PVE host after adding them as a passthrough device? I guess they would just disappear from the list (lspci …).
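From what I've read, they wouldn't necessarily vanish from lspci, but the driver line should change; a quick check might look like this (the address is just an example):

lspci -nnk -s 0000:5e:00.0
# "Kernel driver in use:" should read vfio-pci instead of nvme once the drive is claimed for passthrough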

I thought I wouldn’t be the first one with a setup like that.

Thanks!

The recommendation, if you want to virtualize TrueNAS, is to pass through the controller, not the disks. There have been cases of data loss when only disk passthrough was used.

Hi Lars,
thanks for your input.

I thought the PCIe riser card does not include any kind of HBA, RAID, etc. controller but acts more or less like a breakout box. Therefore passing through the single drives would be the way to go.

This is the riser card here, which works for 8 U.2 drives.

Did I miss anything here?
Cheers

PCI passthrough of the NVMe drives as discrete PCI devices is somewhat safe, as long as Proxmox never ever sees the ZFS pools that are on them.
That being said:
There have been numerous reports here on the forum in the recent past of people losing their pools under Proxmox despite doing everything "right".

There is something fishy going on, and I don't recommend Proxmox at the moment as a hypervisor for TrueNAS.

2 Likes

You are correct that an NVMe drive is its own controller, so passing through an NVMe device is equivalent to "passing the HBA" for SAS/SATA.
With ESXi, that would do.
With Proxmox, it's not clear whether one should also blacklist the devices and/or stop the ZFS scanning service, and whether this makes the virtualised installation safe. If you do proceed, make sure to have reliable backups!
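If you do stop the scanning, the unit in question would presumably be the pool import scan service that ships with OpenZFS; a rough sketch (verify the unit name on your host):

# check whether the host scans all disks for importable pools at boot
systemctl status zfs-import-scan.service
# if it does, keep it disabled so passed-through disks are never imported by PVE
systemctl disable --now zfs-import-scan.service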

Weeeeeeeell … that's kinda tough - I mean, this could potentially spell havoc for a lot of users, correct? On TrueNAS's website you can still read that Proxmox is the second-ranked choice for virtualised installs.

So there currently is no proof of how this happens or how to prevent it? Also no fail-safe procedure for testing whether it works or not? I could set up a test pool but wouldn't want a 3-month test run …

How could I blacklist those devices? To make matters worse for me, I have set up a ZFS pool within PVE using 3 U.2 drives connected to the same riser as nice and fast VM storage. AND of course this thing is meant to have a backup but also to be used in production! :sweat_smile:

My opinion is that if you want reliable ZFS-based storage, you should install TrueNAS on bare metal.

You can still run performant VMs, at least on SCALE; never tried it on CORE.

If your main thing is VMs/Proxmox and you want some storage on the side → use OMV or any other non-ZFS-based NAS software.

1 Like

I will admit, I am not a U.2 or HP Gen10 server user; however, here are a few thoughts… (things I learned while building my M.2 server)

  1. Are there enough PCIe lanes to service all these drives? I'm only used to M.2 media, where mine require 4 PCIe lanes each (they will run at reduced speed with 2 lanes). I can pass through each NVMe drive because there is no main controller; rather, the NVMe drive itself is the controller. I also use ESXi, so not Proxmox.
  2. If each U.2 drive uses four PCIe lanes and you are using 8 U.2 drives, that means you need at least 32 PCIe lanes just for the U.2 drives. Maybe they can run with a reduced number of PCIe lanes? (There's a quick way to check the negotiated width; see the sketch after this list.)
  3. Is that riser card just a riser card? Can you provide the riser card make/model so I can check it out? I'm all about learning things.
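If you want to check point 2 on the actual hardware, lspci can show the link width each drive really negotiated; something like this (the address is just an example):

# LnkCap is what the drive supports, LnkSta is what it actually negotiated
lspci -vv -s 0000:5e:00.0 | grep -E 'LnkCap:|LnkSta:'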

Do all your research before spending any money.

Are you using the PAID version of Proxmox for a "production" server? As I said, I'm not a Proxmox user yet, although I did download it 2 days ago as I do want to give it a test drive since ESXi (free) is no longer getting new updates.

thanks for your thoughts!
Basically, the idea behind having a TrueNAS file server for our Active Directory environment on the same machine as my virtual machines (AD controller, etc.) follows economic and pragmatic reasoning.

Why buy another machine when I have those 5x damn fast U.2 slots ready at hand? PCIe lanes are plentiful, as the server sports dual 14-core Xeons. Why make choices again about a suitable controller for SSD storage, which would realistically just be a compromise then - "only" SATA drives? It would simply have been a nice, cheap trick.

I also thought about having the Proxmox host take care of the ZFS pool and letting something else manage that volume. But I'd lose all the nice reporting features for drive health and the nice GUI (which, to be honest, I don't have for my VM storage either :sweat_smile:).

I'm rather thinking about waiting a few days (as this seems to be a rather hot issue) to see if someone comes up with more reliable information and possibly some awareness from the Proxmox team. Using the free version, btw.

If it were risky to have a Proxmox/TrueNAS combo even with "normal" HBAs, this would be an issue that might concern a lot of people and therefore would presumably be addressed, wouldn't it?

Above I posted a link to the riser card; here it is again. Do you think those Micron drives aren't worth it? SSD storage tends to be pricey right now, and I did not find any negative comments about them.

Just to make sure - what is mentioned at the end of this post is not related to our issue here? I had put an excerpt in my original post above.

The guy explicitly mentions that devices have to be claimed by the vfio-pci driver first so they are supplied exclusively to VMs and not to the host. This could be done via modprobe by defining softdeps.
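If I got that right, the IDs that vfio-pci would have to claim can be read straight from lspci, e.g.:

# list NVMe controllers with their [vendor:device] IDs for the vfio-pci ids= option
lspci -nn | grep -i 'non-volatile memory'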

I've been running a Dell R520 for almost 2 years without issues in a somewhat similar setup in a home office: an H710 Mini card flashed to IT mode (H710 D1 Mini - Fohdeesha Docs) and passed through to a TrueNAS VM serving all networks, hosted in a Proxmox box I named Brave… lol.

I even upgraded it from Core to Scale - or rather, deployed Scale and then imported the ZFS pool from the previous Core installation, or something along those lines; can’t recall.

I haven’t encountered any issues with replacing pool disks or expanding the pool, as I did about a month ago.

Currently, I have a single 2.5" Crucial SSD (in a bay adapter as the Proxmox boot drive, replacing the DVD drive), along with a couple of NVMe drives in PCIe adapters (Amazon.com).

I do keep backups of the installation and its data. But it runs smoothly, and it is perfectly doable.

heyho,
@etorix @Farout
would you like to comment on the early binding to the vfio-pci driver that I mentioned? In the meantime (cough, cough) I also found it in the Proxmox documentation:

you might need to set a soft dependency to load the gpu modules before loading vfio-pci. This can be done with the softdep flag, see also the manpages on modprobe.d for more information.

For example, if you are using drivers named “some-module”:

# echo "softdep <some-module> pre: vfio-pci" >> /etc/modprobe.d/<some-module>.conf

Early binding hasn't been mentioned in the other posts as far as I can see, so it is unclear whether it had been implemented there. To me it seems like a mandatory step when passing drives through to a VM …

In the meantime I ditched the idea with the NVMe drives because I wouldn't be able to separate my VM storage from the TrueNAS storage. Selectively disabling the NVMe driver would not work, or would be too risky.

So I got an LSI 3008. To be safe, I blacklisted the mpt3sas driver and passed the device ID to vfio-pci as an option, as described in the Proxmox documentation and in this thread.

That's because when the VM with the PCIe passthrough is powered off, Proxmox reverts the device to its "original" driver and the connected drives become accessible to the Proxmox host, as described here.

My experience is that when the VM is switched on, everything works fine, but if it is switched off and you want to keep your controller/drives hidden, you gotta make sure the PVE host does not mess around with them …
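For the record, what I ended up with boils down to a modprobe.d snippet roughly like this (1000:0097 is the ID commonly reported for the SAS3008 - double-check yours with lspci -nn):

# /etc/modprobe.d/vfio.conf (sketch)
blacklist mpt3sas
options vfio-pci ids=1000:0097
softdep mpt3sas pre: vfio-pci

followed by update-initramfs -u and a reboot.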

1 Like