Cross-posting my thread from r/truenas since no one has responded yet. I sure hope someone will have some input/help - I’m lost big time.
I had TrueNAS SCALE 24 on a standalone server (Intel i3-7100T, 64 GB ECC and 8x12 TB WD drives with an HBA).
I’ve been working towards a new setup, so I pulled the trigger and moved all the drives from the old server to a new one that is a VM under Proxmox (i10 Intel CPU with 128 GB of RAM for Proxmox - I gave the VM 4 CPUs and 32 GB of RAM and passed the new HBA through to it).
I did an export/disconnect of the pool on the old TrueNAS server, then imported the pool and restored settings from backup on the new VM (also SCALE 24), and everything came back as expected - no issues that I could see.
I then upgraded from SCALE 24 to 25 and that went well - no issues.
I then added 4 new drives to the pool (2 vdevs of 2x8 TB drives) so I could expand my pool.
Then I got some kind of issue/warning inside TrueNAS about mismatched drive sizes in the pool, so I pulled those drives from the tank pool and left the original 8x12 TB drives just like they were on the old server.
Then I started to notice some issues - I could not, and still cannot, copy files from the old pool to the new pool without some kind of error. So I ran a scrub, saw the issues noted in zpool status -v, and replaced any files that were listed as corrupted. I then ran another scrub, which just finished and showed no known errors when I looked at it.
So I tried to copy a movie file from the old movie pool to the new movie pool (the new pool with the 4x8 TB drives), and cp in the shell gives me an “Input/output error” and will not copy the file. It creates the folder on the new pool, but then fails with this input/output error. And now if I do a zpool status -v, any file I tried to copy shows permanent corruption. I don’t understand why this is happening: a full scrub shows good to go right after it’s done, and then trying to access or copy a file makes it corrupt. I’m including the last zpool status -v output below. I sure hope someone can help out - I’m puzzled!
  pool: tank
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 18:17:41 with 0 errors on Wed Aug 20 09:41:41 2025
remove: Removal of vdev 4 copied 826M in 0h0m, completed on Sun Aug 17 15:03:40 2025
        7.62K memory used for removed device mappings
config:
        NAME                                      STATE    READ WRITE CKSUM
        tank                                      ONLINE      0     0     0
          mirror-0                                ONLINE      0     0     0
            1ea9f584-fe96-4828-a048-045596a16cb9  ONLINE      0     0     0
            e080a44c-e495-4cec-8d05-05b5802d4bf2  ONLINE      0     0     0
          mirror-1                                ONLINE      0     0     0
            b06d42f8-2eda-4678-b677-54ed1c683378  ONLINE      0     0     0
            e7574d9b-71a0-45d5-8b72-8d8fbf4b307b  ONLINE      0     0     0
          mirror-2                                ONLINE      0     0     0
            016eb050-41c1-4abf-9a8e-6b30209dedad  ONLINE      0     0     2
            58fa4281-e2bf-4cc2-9079-42194c03fd36  ONLINE      0     0     2
          mirror-3                                ONLINE      0     0     0
            4879981d-e447-40b0-89a6-e828a78d31ca  ONLINE      0     0     4
            05e86aaf-f752-4a02-a404-35d0317d7a98  ONLINE      0     0     4
errors: Permanent errors have been detected in the following files:
        /mnt/tank/media/movies/A Boy Named Charlie Brown (1969)/A Boy Named Charlie Brown (1969).mkv
        /mnt/tank/media/movies/1917 (2019)/1917 (2019).mkv
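For anyone who wants to pull the non-zero counters out of that zpool status -v output without eyeballing it, here is a rough Python sketch. It is not a TrueNAS or ZFS tool: the function name devices_with_errors and the regex are my own assumptions, and it assumes the counters are plain integers (very large counts that zpool abbreviates, like 1.2K, would be skipped). The sample rows are copied from the config table above.

```python
import re

# Rows copied from the config table in the zpool status -v output above.
SAMPLE = """\
NAME STATE READ WRITE CKSUM
tank ONLINE 0 0 0
mirror-0 ONLINE 0 0 0
1ea9f584-fe96-4828-a048-045596a16cb9 ONLINE 0 0 0
e080a44c-e495-4cec-8d05-05b5802d4bf2 ONLINE 0 0 0
mirror-1 ONLINE 0 0 0
b06d42f8-2eda-4678-b677-54ed1c683378 ONLINE 0 0 0
e7574d9b-71a0-45d5-8b72-8d8fbf4b307b ONLINE 0 0 0
mirror-2 ONLINE 0 0 0
016eb050-41c1-4abf-9a8e-6b30209dedad ONLINE 0 0 2
58fa4281-e2bf-4cc2-9079-42194c03fd36 ONLINE 0 0 2
mirror-3 ONLINE 0 0 0
4879981d-e447-40b0-89a6-e828a78d31ca ONLINE 0 0 4
05e86aaf-f752-4a02-a404-35d0317d7a98 ONLINE 0 0 4
"""

# Assumed row shape: NAME STATE READ WRITE CKSUM, with integer counters.
ROW = re.compile(r"^\s*(\S+)\s+(\S+)\s+(\d+)\s+(\d+)\s+(\d+)\s*$")


def devices_with_errors(status_text):
    """Return (name, read, write, cksum) for every row with a non-zero counter."""
    flagged = []
    for line in status_text.splitlines():
        match = ROW.match(line)
        if match is None:
            continue  # header row and any non-table lines are ignored
        name, _state, read, write, cksum = match.groups()
        counts = (int(read), int(write), int(cksum))
        if any(counts):
            flagged.append((name, *counts))
    return flagged


if __name__ == "__main__":
    for name, read, write, cksum in devices_with_errors(SAMPLE):
        print(f"{name}: READ={read} WRITE={write} CKSUM={cksum}")
```

On the rows above this flags the four leaf devices under mirror-2 and mirror-3, and both halves of each affected mirror report the same CKSUM count.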
Just as an FYI - I tried to copy one more item from the NAS to my local Windows desktop (taking dataset to dataset out of the equation) - it immediately errored and now shows as corrupt in zpool status -v.
Even just trying to play a media file results in corruption. So I’m thinking no backups, no touching the files, etc. until hopefully someone on here can put me on a path. I may have just lost all my data, sigh.
Here’s a paste of the dmesg logs - I’m not sure how to read these. I tried to put them here, but I guess Reddit has a limit on characters or something, so I dropped them on Pastebin. I did drop them into an AI to poke at for fun - its conclusion is pasted below.
DMESG Logs on Pastebin
<ask me for logs on pastebin - can’t post it here apparently>
Conclusion
The dmesg logs reveal that your SAS controller (mpt3sas_cm0) is experiencing faults and resets, leading to I/O errors and command timeouts on multiple disks within your tank pool. This is the root cause of the data corruption you’re seeing. It’s likely a hardware issue with the HBA itself, the SAS cables, the drives, or even potentially the power supply to the drives.
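Since the full dmesg output is too long to post, a small filter can cut it down to the lines that conclusion is talking about. This is only a sketch: the keyword list is my guess based on the wording above (mpt3sas for the controller, plus "I/O error", "timeout" and "reset"), the real kernel messages may be worded differently, and the helper name suspicious_lines is made up.

```python
import sys

# Assumed substrings, taken from the wording of the conclusion above;
# the actual kernel messages may use different phrasing.
KEYWORDS = ("mpt3sas", "i/o error", "timeout", "reset")


def suspicious_lines(log_text, keywords=KEYWORDS):
    """Return the log lines that contain any keyword (case-insensitive)."""
    return [
        line
        for line in log_text.splitlines()
        if any(keyword in line.lower() for keyword in keywords)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # The argument is a path to a text file holding saved dmesg output.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for line in suspicious_lines(handle.read()):
            print(line)
```

To use it, save the dmesg output to a text file first and pass that file's path as the only argument; the filtered result should be short enough to paste into a post.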