Moved TrueNAS From Bare Metal to Proxmox VM - input/output errors and can't get to data

vexter0944 · August 20, 2025, 10:32pm

Cross-posting my thread from r/truenas since no one has responded yet. I sure hope someone will have some input/help - I’m lost big time.

I had TrueNAS SCALE 24 on a standalone server (Intel i3-7100T, 64 GB ECC, and 8x 12 TB WD drives with an HBA).

I’ve been working towards a new setup and pulled the trigger: I moved all the drives from the old server to a new one that is a VM under Proxmox (i10 Intel CPU with 128 GB of RAM for Proxmox - I gave the VM 4 CPUs and 32 GB of RAM and passed the new HBA through to it).

I did an export/disconnect of the pool on the old TrueNAS server, then imported the pool and restored settings from backup on the new VM (also SCALE 24), and everything came back as expected - no issues that I could see.

I then upgraded from SCALE 24 to 25 and that went well - no issues.

I then added 4 new drives to the pool (2 vdevs of 2x 8 TB drives) so I could expand my pool.

Then I got some kind of warning inside TrueNAS about mismatched drive sizes in the pool, so I pulled those drives from the tank pool and left the original 8x 12 TB drives just like they were on the old server.

Then I started to notice some issues - I could not, and still cannot, copy files from the old pool to the new pool without some kind of error. So I ran a scrub, saw the issues noted in zpool status -v, and replaced any files that were listed as corrupted. I then ran another scrub, which just finished and showed no known errors when I looked at it.

So I tried to copy a movie file from the old movie pool to the new movie pool (the new pool with the 4x 8 TB drives), and cp in the shell fails with an “input/output error” and will not copy the file. It creates the folder on the new pool, but then fails with this input/output error. And now, if I do a zpool status -v, any file I tried to copy shows permanent corruption. I don’t understand why this is happening when a full scrub shows everything good to go right after it’s done, and then merely accessing and copying a file makes it corrupt. I’m including the last zpool status -v output. I sure hope someone can help out - I’m puzzled!

  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 18:17:41 with 0 errors on Wed Aug 20 09:41:41 2025
remove: Removal of vdev 4 copied 826M in 0h0m, completed on Sun Aug 17 15:03:40 2025
        7.62K memory used for removed device mappings
config:

        NAME                                      STATE     READ WRITE CKSUM
        tank                                      ONLINE       0     0     0
          mirror-0                                ONLINE       0     0     0
            1ea9f584-fe96-4828-a048-045596a16cb9  ONLINE       0     0     0
            e080a44c-e495-4cec-8d05-05b5802d4bf2  ONLINE       0     0     0
          mirror-1                                ONLINE       0     0     0
            b06d42f8-2eda-4678-b677-54ed1c683378  ONLINE       0     0     0
            e7574d9b-71a0-45d5-8b72-8d8fbf4b307b  ONLINE       0     0     0
          mirror-2                                ONLINE       0     0     0
            016eb050-41c1-4abf-9a8e-6b30209dedad  ONLINE       0     0     2
            58fa4281-e2bf-4cc2-9079-42194c03fd36  ONLINE       0     0     2
          mirror-3                                ONLINE       0     0     0
            4879981d-e447-40b0-89a6-e828a78d31ca  ONLINE       0     0     4
            05e86aaf-f752-4a02-a404-35d0317d7a98  ONLINE       0     0     4

errors: Permanent errors have been detected in the following files:

        /mnt/tank/media/movies/A Boy Named Charlie Brown (1969)/A Boy Named Charlie Brown (1969).mkv
        /mnt/tank/media/movies/1917 (2019)/1917 (2019).mkv
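
To see at a glance which disks are collecting checksum errors, the device rows of the status output above can be filtered on the CKSUM column. This is only a minimal sketch: `cksum_errors` is a hypothetical helper name (not a ZFS command), and it assumes the plain five-column NAME/STATE/READ/WRITE/CKSUM layout shown above.

```shell
# Hypothetical helper: read `zpool status` text on stdin and print
# "<device> <cksum-count>" for every row whose CKSUM column is non-zero.
# Assumes the five-column NAME STATE READ WRITE CKSUM layout; rows that carry
# trailing notes (extra columns) are skipped by this sketch.
cksum_errors() {
    awk 'NF == 5 &&
         $2 ~ /^(ONLINE|DEGRADED|FAULTED|OFFLINE|UNAVAIL|REMOVED)$/ &&
         $5 ~ /^[0-9]/ && $5 != "0" { print $1, $5 }'
}
```

It would be used as `zpool status -v tank | cksum_errors`. Against the output above it should list only the four disks in mirror-2 and mirror-3.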

Just as an FYI - I tried to copy one more item from the NAS to my local Windows desktop (taking dataset to dataset out of the equation) - it immediately errored and now shows as corrupt in zpool status -v.

Even just trying to play a media file will result in corruption. So I’m thinking no backups, don’t touch the files etc etc until hopefully someone on here can put me on a path. I may have just lost all my data sigh

Here’s a paste of the dmesg logs - I’m not sure how to read these. I tried to put them here but I guess Reddit has a limit on characters or something, so I dropped them on Pastebin. I did drop them into an AI to poke at for fun - its conclusion is pasted below.

DMESG Logs on Pastebin

Conclusion

The dmesg logs reveal that your SAS controller (mpt3sas_cm0) is experiencing faults and resets, leading to I/O errors and command timeouts on multiple disks within your tank pool. This is the root cause of the data corruption you’re seeing. It’s likely a hardware issue with the HBA itself, the SAS cables, the drives, or even potentially the power supply to the drives.
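
For anyone who wants to check their own kernel log for the same symptoms, a rough filter along these lines may help. It is a sketch only: `hba_faults` is a hypothetical helper name, and the keyword list is an assumption taken from the summary above (faults, resets, timeouts), not a complete list of mpt3sas messages. The linked logs are not reproduced here.

```shell
# Hypothetical helper: read kernel log text on stdin and keep only lines that
# mention the mpt3sas driver together with a fault, reset, or timeout keyword.
# The keyword list is an assumption based on the summary above, not an
# exhaustive list of mpt3sas messages.
hba_faults() {
    grep -iE 'mpt3sas.*(fault|reset|timeout)'
}
```

It would be run as `dmesg -T | hba_faults` inside the system that owns the controller. A match only shows that the controller misbehaved, not why.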

Fleshmauler · August 21, 2025, 12:58am

Did you blacklist your HBA on Proxmox so that the host doesn’t randomly try to take control of your pool? Proxmox can use ZFS itself, and folks in the past have had critical pool failures when the Proxmox host decided to grab the pool while the TrueNAS VM was also using it.

vexter0944 · August 21, 2025, 1:14am

I did not know I needed to do that - is there a way to see if that is occurring, or a link to point me in the right direction?

That seems kind of plausible. Looking forward to your response. Thanks for reaching out, appreciate it!

vexter0944 · August 21, 2025, 1:16am

Reading on it now…

Fleshmauler · August 21, 2025, 1:21am

I wish I had specific documentation to provide, but I don’t. I just ended up making a second system specifically for Proxmox when I realized I had more virtualization needs than TrueNAS could handle & never wanted to virtualize my NAS…

I simply know that this is a deadly pitfall & hope that this knowledge resolves your issues.

vexter0944 · August 21, 2025, 1:21am

it was not blacklisted - have done so and looks good now.

04:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:0097] (rev 02)
        Subsystem: Broadcom / LSI SAS 9300-16i [1000:3130]
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas

06:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS3008 PCI-Express Fusion-MPT SAS-3 [1000:0097] (rev 02)
        Subsystem: Broadcom / LSI SAS 9300-16i [1000:3130]
        Kernel driver in use: vfio-pci
        Kernel modules: mpt3sas
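
The same check can be reduced to one line per controller. A minimal sketch, assuming `lspci -nn -k` output in the format shown above; `sas_drivers` is a hypothetical helper name:

```shell
# Hypothetical helper: read `lspci -nn -k` text on stdin and print, for every
# SAS controller, its PCI address and the kernel driver currently bound to it.
# Controllers with no "Kernel driver in use" line are not printed by this sketch.
sas_drivers() {
    awk '/^[0-9a-f:]+\.[0-9a-f]/ {
             # Device header line: remember the address only for SAS controllers.
             dev = ($0 ~ /Serial Attached SCSI controller/) ? $1 : ""
         }
         /Kernel driver in use:/ && dev != "" { print dev, $NF }'
}
```

Run on the Proxmox host as `lspci -nn -k | sas_drivers`. For a passed-through HBA, each line should end in `vfio-pci` rather than `mpt3sas`.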

Digging some more to confirm IOMMU settings (I know it’s enabled in the BIOS) and making sure I have all the settings I need for the passthrough.

vexter0944 · August 21, 2025, 1:22am

No worries and thanks for putting me on a track as this is apparently needed.

Fleshmauler · August 21, 2025, 1:25am

Heads-up, it is possible that the data you’ve had in your pool since virtualization is suspect, since the HBA wasn’t blacklisted from the start. It may be worth investigating, though it is possible you’ve so far been lucky.

This is one of those awful things like port multipliers; it just works fine at first - and then eventually it doesn’t & everything is horrible.

vexter0944 · August 21, 2025, 1:50am

OK - so far looking good - here are the changes I made.

Confirmed IOMMU/VT-d/AMD-Vi is enabled in the BIOS (in my case an MPG Z590 Gaming Force motherboard)
Identified the HBA driver, which was mpt3sas
Blacklisted the driver
Configured binding of the HBA to the vfio-pci driver early in the boot process
Ran update-initramfs -u
Rebooted
Ran lspci -nn -k again and confirmed that Proxmox was no longer using the HBA
Then I made sure the passthrough settings were right - I ended up enabling all functions and PCI-Express on the SAS HBA
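
The blacklist and early-binding steps above can be sketched as a small generator for the modprobe configuration. This is an illustration under assumptions, not the exact files used in this thread: `vfio_modprobe_conf` is a hypothetical helper name, and the `1000:0097` ID and `mpt3sas` driver in the usage note are taken from the `lspci -nn -k` output posted earlier.

```shell
# Sketch: print a modprobe.d fragment that keeps the host from loading the HBA
# driver and hands the device to vfio-pci instead.
# Arguments: $1 = vendor:device ID, $2 = host driver to keep away from the HBA.
vfio_modprobe_conf() {
    id=$1
    driver=$2
    printf 'blacklist %s\n' "$driver"
    printf 'options vfio-pci ids=%s\n' "$id"
    # Ask modprobe to load vfio-pci before the HBA driver if it is ever requested.
    printf 'softdep %s pre: vfio-pci\n' "$driver"
}
```

On the Proxmox host, the output of `vfio_modprobe_conf 1000:0097 mpt3sas` could be redirected to a file such as `/etc/modprobe.d/vfio-hba.conf` (a hypothetical name) before running `update-initramfs -u` and rebooting. Blacklisting a driver affects every device on the host that uses it, so this only makes sense when the host itself does not boot from or otherwise depend on an mpt3sas controller.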

So far…I was able to copy a media file to my local desktop - the copy went through and then I was able to watch it - and so far the file I copied is not showing corrupt on the zpool status -v after I copied it.

I’ll keep going from here - but I ‘think’ we might have it. More testing to come. But 100% better than where I was.

Thanks a million @Fleshmauler for the idea/tip - I think you might have hit it on the head so far. crosses fingers

Whattteva · August 22, 2025, 3:07am

Glad you got on the right track, but let this be a lesson for you.

Unless you don’t care about the data, don’t try to do things you don’t fully understand without having a backup.

You got lucky, but many many many before you weren’t so lucky and lost their pools.

vexter0944 · August 22, 2025, 1:25pm

Agree and lesson learned. I certainly see your point. Thanks!
