TrueNAS under ProxMox - all data lost in under a week

For many years I have used a Windows 10 ‘server’ to store my “wanted, but not mission critical” data. Alongside this I have used a small commercial NAS for “mission critical” data. This has worked flawlessly, with not so much as a single byte of a single file lost. However, I was uncomfortable with the fact that the Windows server had no redundancy. To solve this I wanted to switch to TrueNAS.

I also occasionally needed Windows or other VMs on that server, so after studying the options I decided that the easiest way to achieve this, including moving all the Windows data to the TrueNAS storage, was to set up ProxMox and run TrueNAS under it, along with a Windows VM to access the old data and copy it to the new TrueNAS storage.

The PC Hardware is:

  • Asus B150M-A/M.2 motherboard
  • Intel Core i5-6500 3.20GHz
  • 16GB DDR4 RAM
  • Samsung 250GB EVO SATA SSD as boot drive
  • 1 x ST3000DM001 3TB SATA HDD
  • 2 x ST3000DM007 3TB SATA HDD
  • 2 x TOSHIBA HDWD120 2TB SATA HDD

The ST3000DM001 drive was empty at the start, and all other HDDs were basically full. I used Disk2VHD to archive the main Windows OS, then installed ProxMox 9.1.1 on the SSD and set up a TrueNAS 25.10.2.1 VM with the empty ST3000DM001 added as ‘raw’ as its only storage. On this I set up a single pool “MainPool” with a single VDEV. I then added a new Windows 11 VM under ProxMox and added all of the other HDDs to it as ‘raw’.

This gave me a new TrueNAS system with 3TB free (with zero redundancy) and a Windows system with about 10TB of data. Under this Windows VM I then moved all the data from one of the NTFS drives to TrueNAS. This worked well and gave me another empty 3TB drive.

I then shut down both VMs and moved the newly empty 3TB drive from the Windows VM to TrueNAS. Under TrueNAS I added this as a separate VDEV so that I could have a total of 6TB of storage (currently still with no redundancy) on the new TrueNAS pool. So, 6TB on TrueNAS and 7TB on Windows.

I then booted up both VMs and continued to transfer data over. At some point during all of this, the ProxMox host had an issue with a full disk and pulled down everything! I managed to fix the full disk and brought ProxMox and TrueNAS back up, but TrueNAS now shows no VDEVs and 2 disks with “exported pools”. Import Pools shows no pools to import.

Running a shell on the TrueNAS server and typing “zpool import MainPool” gives:

“cannot import ‘MainPool’: I/O error
Destroy and re-create the pool from
a backup source.”

I don’t understand how everything could be lost here. If this was a Windows server with 2 drives and no redundancy I might have lost a few files - but never everything on every drive on the server. Surely there must be a way to recover this to some extent… Big companies supposedly use TrueNAS for important data… is it really a house of cards?

Can anyone help me?

Did you in some manner isolate the disk controller to the TrueNAS VM? Proxmox understands ZFS and will attempt to access any ZFS pool that it sees - this is a disaster for your data.

Basically the way to do this is to have a separate disk controller (LSI HBA) with all the TrueNAS disks on it, and to configure Proxmox so that it cannot even see the controller or the disks on it.

You have to set up TrueNAS on Proxmox very specifically: pass through the entire disk controller (HBA) and blacklist it so Proxmox will not attempt to mount it. You want the ‘Production’ setup.

If you set up your pool as two striped VDEVs in a single pool, losing one disk will cause pool and data loss.
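To put rough numbers on that, here is a minimal Python sketch. It assumes disk failures are independent and equally likely, which is a simplification, and the 5% failure probability is a made-up figure for illustration only. The function names are hypothetical helpers, not part of any ZFS tooling.

```python
def stripe_loss_probability(disk_failure_prob: float, disks: int) -> float:
    """Chance of losing a pool striped across `disks` single-disk vdevs.

    The pool is lost if ANY disk fails, so it only survives when every
    disk survives. Assumes independent, identical failure probabilities.
    """
    if not 0.0 <= disk_failure_prob <= 1.0:
        raise ValueError("disk_failure_prob must be between 0 and 1")
    if disks < 1:
        raise ValueError("disks must be at least 1")
    return 1.0 - (1.0 - disk_failure_prob) ** disks


def mirror_loss_probability(disk_failure_prob: float, disks: int) -> float:
    """Chance of losing a single mirror vdev made of `disks` disks.

    The mirror is lost only if ALL of its disks fail.
    """
    if not 0.0 <= disk_failure_prob <= 1.0:
        raise ValueError("disk_failure_prob must be between 0 and 1")
    if disks < 1:
        raise ValueError("disks must be at least 1")
    return disk_failure_prob ** disks


if __name__ == "__main__":
    p = 0.05  # hypothetical per-disk failure probability over some period
    print(f"single disk:   {p:.4f}")
    print(f"2-disk stripe: {stripe_loss_probability(p, 2):.4f}")
    print(f"2-disk mirror: {mirror_loss_probability(p, 2):.4f}")
```

The point is only the direction of the effect: every disk added to a stripe raises the chance of losing the whole pool, while every disk added to a mirror lowers it.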

TrueNAS Systems pool layout whitepaper: see “White Papers | TrueNAS - Open Enterprise Storage” and search for “ZFS Storage Pool Layout”.

My system has a six-port SATA controller and no other disk controllers (other than the ability to add an NVMe drive). Do I need to try something like finding a separate SATA controller and adding that, or perhaps get an NVMe drive, move ProxMox to that and give the whole SATA controller to TrueNAS… or is it too late and ProxMox will have taken the opportunity to trash my pool “for fun” as it were?

I am sorry to hear about your data loss. I truly am, but you set up your NAS in the worst possible way. It would be interesting to know where you got the info to do it like that.

I/O errors, especially those stemming from a wrong virtualisation setup, are really hard and often impossible to resolve.

Your best bet is a data recovery program such as Klennet or UFS-Explorer.

A disk failure in a stripe takes down everything

All of that.

  • Proxmox probably tried to access your pools while they were in use by the TrueNAS VM. This is very hard to recover from, and it is compounded by the lack of redundancy.
  • You absolutely must pass through a full SATA or SAS controller as a PCIe device to TrueNAS AND blacklist the device in Proxmox. A dedicated SAS HBA is the usual way, but if your motherboard has proper PCIe grouping (I would not bet on that with a cheap consumer mobo…) you may pass through the SATA controller and use an NVMe drive for the Proxmox host.
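
For anyone wanting to check the PCIe grouping mentioned above, here is a rough Python sketch to run on the Proxmox host. It relies on the standard Linux sysfs layout (`/sys/kernel/iommu_groups/<group>/devices/<pci-address>`); the helper names are hypothetical, and the PCI address in the example is a placeholder that you would replace with your own controller's address (for example from `lspci`).

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict[int, list[str]]:
    """Map each IOMMU group number to the PCI addresses it contains.

    Returns an empty dict if the directory is missing, which usually
    means the IOMMU is disabled in firmware or on the kernel command line.
    """
    groups: dict[int, list[str]] = {}
    base = Path(root)
    if not base.is_dir():
        return groups
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if not group_dir.name.isdigit() or not devices_dir.is_dir():
            continue
        groups[int(group_dir.name)] = sorted(d.name for d in devices_dir.iterdir())
    return groups


def is_alone_in_group(groups: dict[int, list[str]], pci_address: str) -> bool:
    """True if the device is the only member of its IOMMU group."""
    for members in groups.values():
        if pci_address in members:
            return len(members) == 1
    return False  # device not found in any group


if __name__ == "__main__":
    found = iommu_groups()
    for number in sorted(found):
        print(f"group {number}: {', '.join(found[number])}")
    # Placeholder address, an assumption for illustration only.
    print(is_alone_in_group(found, "0000:00:17.0"))
```

If the SATA controller shares its group with other devices, those devices would generally have to be passed through together with it, which is why onboard controllers on consumer boards are often a poor fit.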

I’m afraid this is a hard lesson, but bare metal is safer, and redundancy is a must.
Without redundancy, the sophistication of ZFS will actually harm you compared to other file systems.

I don’t work with Proxmox, but would it be worth trying zpool import from the Proxmox command line to see if Proxmox will import the pool, or at least attempt to? If attempting this, I would keep the TrueNAS VM powered off.
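
If someone does try that, a cautious order would be to scan first and only then attempt a read-only import. The Python sketch below just assembles and prints the commands so they can be reviewed before anything is run; it does not touch the disks. The helper names are hypothetical. The flags used (`-d`, `-o readonly=on`, `-N`, `-f`) are standard `zpool import` options, but whether a read-only import gets past the I/O error in this case is unknown.

```python
import shlex


def scan_command(device_dir: str = "/dev/disk/by-id") -> list[str]:
    """List importable pools found under device_dir without importing them."""
    return ["zpool", "import", "-d", device_dir]


def readonly_import_command(
    pool: str,
    device_dir: str = "/dev/disk/by-id",
    force: bool = False,
) -> list[str]:
    """Build a read-only import that does not mount any datasets.

    -o readonly=on  import the pool read-only
    -N              import without mounting file systems
    -f              only added when force=True; needed if the pool was
                    last in use by another system and not exported
    """
    command = ["zpool", "import", "-d", device_dir, "-o", "readonly=on", "-N"]
    if force:
        command.append("-f")
    command.append(pool)
    return command


if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    print(shlex.join(scan_command()))
    print(shlex.join(readonly_import_command("MainPool")))
    print(shlex.join(readonly_import_command("MainPool", force=True)))
```

Keeping the import read-only matters because a normal import writes to the pool, which could make later recovery with a dedicated tool harder.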