Corrupted Python Package in boot-pool

Hi All,
I recently installed TrueNAS SCALE version 24.10.0 and have since applied a couple of upgrades: 24.10.0 > 24.10.0.2 > 24.10.1

Recently I logged in and a notification popped up with a data corruption error in the Boot pool:

Boot pool status is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected…

I ran zpool status on the boot-pool, and the issue seems to be with a Python __pycache__/test_SHA1.cpython-311.pyc file:

$ sudo zpool status boot-pool -v

  pool: boot-pool
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:00:02 with 1 errors on Sun Dec 22 03:45:03 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          vda3      ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        boot-pool/ROOT/24.10.1/usr@pristine:/lib/python3/dist-packages/Cryptodome/SelfTest/Hash/__pycache__/test_SHA1.cpython-311.pyc

I was thinking I might be able to copy the same file across from my other TrueNAS 24.10.1 install. However, when I try to remove the corrupt file first, I get a read-only file system error:

$ sudo rm /lib/python3/dist-packages/Cryptodome/SelfTest/Hash/__pycache__/test_SHA1.cpython-311.pyc

rm: cannot remove '/lib/python3/dist-packages/Cryptodome/SelfTest/Hash/__pycache__/test_SHA1.cpython-311.pyc': Read-only file system

Would anyone have any suggestions on how I can restore/replace this file?
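
One detail worth noting in the zpool status output above: the error entry has the form dataset@snapshot:/path, so the damaged block is recorded against the "pristine" snapshot of the usr dataset rather than only against the live file. As a rough sketch (the variable names here are just for illustration), the entry can be split into its parts with plain bash parameter expansion:

```shell
#!/usr/bin/env bash
# Split a "dataset@snapshot:/path" entry from `zpool status -v` into its parts.
# The entry is copied verbatim from the output above; variable names are illustrative.
entry='boot-pool/ROOT/24.10.1/usr@pristine:/lib/python3/dist-packages/Cryptodome/SelfTest/Hash/__pycache__/test_SHA1.cpython-311.pyc'

object="${entry%%:*}"     # text before the first ':'  -> dataset@snapshot
file_path="${entry#*:}"   # text after the first ':'   -> path inside that object

if [[ "$object" == *@* ]]; then
    dataset="${object%%@*}"   # text before '@'
    snapshot="${object#*@}"   # text after '@'
else
    dataset="$object"         # entry refers to a live dataset, not a snapshot
    snapshot=''
fi

printf 'dataset:  %s\nsnapshot: %s\npath:     %s\n' "$dataset" "$snapshot" "$file_path"
```

If the snapshot field is non-empty, as it is here, deleting or replacing the file in the live filesystem would not by itself clear the error, since the snapshot still references the damaged block.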

I think you have two options.

  1. Upgrade to the same version of TrueNAS - you might need to download and upload the upgrade file as the UI may not allow you to select it from a dropdown.

  2. Save your configuration file, re-install, import the configuration file.

Thanks for the info @Protopia

Looks like option 1 isn’t possible, as the updater won’t install the version that is already running:

[EFAULT] You already are using 24.10.1

I’ll try the re-install if I get some time, otherwise I may just hold out for the next update/release.

We don’t have the hardware specs, but doesn’t this error suggest the boot drive itself is failing?

Is it by any chance a USB thumb / pen drive? I understand those are more susceptible to failure because SCALE uses the boot pool more intensively.

Yeah, specs would be good.
The device name “vda3” makes me think this is also virtualised.

That’s correct, it’s hosted in a Proxmox Virtual Environment.

The boot-pool is just a 32GB VirtIO hard drive sitting on the Proxmox “local-lvm” storage which is on a Samsung 980 Pro 500GB.

The data pools (mirror VDEVs) are all sitting on disks attached to the VM via a PCI-Express passthrough of a Broadcom LSI 9500-16i HBA.

I have opted to go with a re-install and have now added a couple of 512GB SSDs to the Broadcom HBA, on which I have installed the TrueNAS boot-pool as a mirror. So far so good, amazing how easy it is to restore the config and get up and running quickly!

Be sure to also blacklist the HBA in Proxmox, or else your pool might get corrupted by being mounted by Proxmox and TrueNAS at the same time.
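
For reference, a minimal sketch of what that blacklist could look like on the Proxmox host. This assumes the 9500-16i is driven by the mpt3sas kernel module (check with `lspci -k` on the host) and that nothing else on the host needs that module; the file name is arbitrary.

```
# /etc/modprobe.d/blacklist-hba.conf  (hypothetical file name)
# Stop the Proxmox host from loading the HBA driver, so only the VM
# sees the disks behind the passed-through controller.
# Assumption: the HBA uses mpt3sas and no host storage depends on it.
blacklist mpt3sas

# After saving the file, rebuild the initramfs and reboot the host:
#   update-initramfs -u -k all
```

If the host does rely on the same driver for its own disks, binding just this device to vfio-pci by its PCI ID is the usual alternative, rather than blacklisting the whole module.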
