Pool is *and should be* degraded. Problem is that the correct degraded status is lost on system update

I have a raidz2 pool with one disk that has errors and is long overdue for replacement (I will get to it eventually…). This post is NOT about how to fix that. (The pool is extremely lightly used and can go days between writes.)

The issue I have is that when I do a system update, for instance upgrading to 25.04.2.5 today, the degraded status is reset: after the system restarts, TrueNAS says the pool is healthy, which is a lie, since I know one of the disks has severe problems.
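Both status listings below look like plain zpool status output; something like the following should reproduce them from a shell on the TrueNAS host (the exact command isn’t shown here, so treat it as an assumption):

    # inspect the pool's state, errors and per-device counters
    zpool status pool1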

Before upgrade:

  pool: pool1
 state: DEGRADED
status: One or more devices are faulted in response to persistent errors.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Replace the faulted device, or use 'zpool clear' to mark the device
        repaired.
  scan: scrub repaired 1M in 03:59:19 with 0 errors on Mon Oct 20 05:13:20 2025
expand: expanded raidz2-0 copied 6.25T in 23:56:13, on Sat Jan  4 23:48:46 2025
config:

        NAME                                      STATE     READ WRITE CKSUM
        pool1                                     DEGRADED     0     0     0
          raidz2-0                                DEGRADED     0     0     0
            4b16822c-5dc5-4f87-a579-b9546fdc2ebb  ONLINE       0     0     0
            61412df7-d2e5-4d13-81d0-d5401a92eda7  ONLINE       0     0     0
            fd9b7e20-3a17-4e4c-b429-ac35e291a326  ONLINE       0     0     0
            ac38e0fe-51bf-48f9-9289-4a55d6423c1d  ONLINE       0     0     0
            61d33047-5d7f-4656-8fdd-7988fc934429  FAULTED     33     0     0  too many errors
            c1139412-93fa-4135-8a99-e134191c924b  ONLINE       0     0     0

After upgrade:

  pool: pool1
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: resilvered 1.40G in 00:00:54 with 0 errors on Tue Oct 21 23:32:45 2025
expand: expanded raidz2-0 copied 6.25T in 23:56:13, on Sat Jan  4 23:48:46 2025
config:

        NAME                                      STATE     READ WRITE CKSUM
        pool1                                     ONLINE       0     0     0
          raidz2-0                                ONLINE       0     0     0
            4b16822c-5dc5-4f87-a579-b9546fdc2ebb  ONLINE       0     0     0
            61412df7-d2e5-4d13-81d0-d5401a92eda7  ONLINE       0     0     0
            fd9b7e20-3a17-4e4c-b429-ac35e291a326  ONLINE       0     0     0
            ac38e0fe-51bf-48f9-9289-4a55d6423c1d  ONLINE       0     0     0
            61d33047-5d7f-4656-8fdd-7988fc934429  ONLINE       0     0     0
            c1139412-93fa-4135-8a99-e134191c924b  ONLINE       0     0     0

To me this seems like a bug that ought to be fixed, right?

If I’m not mistaken these are ZFS errors, and until ZFS verifies checksums again it won’t see those errors and won’t put the pool back into a degraded status.

You can run a scrub yourself to get the system to flag things, or just wait. However, a reboot or a clear command (zpool clear) will put you back in a ‘working’ state.
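For example, something along these lines should make ZFS re-read the data and surface the bad disk again (a minimal sketch, assuming the pool is named pool1 as in the output above):

    # start a full scrub so ZFS re-reads and verifies every block
    zpool scrub pool1

    # check progress and results; the failing disk should accumulate
    # errors again and the pool should drop back to DEGRADED
    zpool status -v pool1

    # for completeness: this is the "clear" that resets the error
    # counters and puts the pool back into an apparently healthy state
    zpool clear pool1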

Does it happen only on upgrades, or does it happen on any reboot?

I will check when it has degraded again.

I’m sure it happens on any reboot, because that’s what ZFS does: it clears the errors on reboot.
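If you want to test the reboot theory without waiting for the next upgrade, a rough sketch (again assuming pool1): do a plain reboot with no update, check whether the fault is gone, then scrub to re-detect it.

    # right after a plain reboot: prints "all pools are healthy"
    # if the previous FAULTED state was dropped on import
    zpool status -x

    # force ZFS to touch the bad disk again and re-flag it
    zpool scrub pool1
    zpool status pool1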
