FreeNAS-11.3-U5 multi-drive RAIDZ2 failure recovery

Good day all, I’ve been lurking on these forums since everything was still FreeNAS. For the last 5 years my pool of 8x8TB HGST SAS drives worked great, averaging about 1-2 bad drives per year.

Late last year and into this year, however, things took a turn for the worse. I experienced multiple failures. Then, while finally resilvering with new drives, a bad neutral in my complex’s electrical circuit caused an overvoltage that shut off the attached UPS mid-resilver. That has since been addressed over multiple visits from the electric company, yet I still can’t seem to rebuild the pool to a healthy state.

There were about 1,500 small files that I had backups for and that needed to be removed, with a single folder that still can’t be deleted for reasons unknown. There are also a number of what appear to be FreeNAS system files on the list when running ‘zpool status -v’. Even after replacing the 2nd failed drive, this condition persists, and there appears to be an erroneous condition in the GUI pool status (sorry, I’d post a screenshot of this but the forum rules state I cannot until I’m no longer deemed to be spam):

Pool Status

RESILVER
Status: FINISHED
Errors: 235315
Date: 3/26/2025, 10:39:12 AM

Name                                              Read  Write  Checksum  Status
LCARS                                             0     0      0         DEGRADED
RAIDZ2                                            0     0      474900    DEGRADED
da6p2                                             0     0      0         DEGRADED
da8p2                                             0     0      0         DEGRADED
REPLACING                                         0     0      0         DEGRADED
/dev/gptid/78563cfc-b8e7-11ea-a9c4-000c293859a3   0     0      0         UNAVAIL
da4p2                                             0     0      0         ONLINE
da5p2                                             0     0      0         DEGRADED
da7p2                                             0     0      0         ONLINE
REPLACING                                         0     0      0         DEGRADED
/dev/gptid/228aaf0a-d2c6-11ef-8d83-000c293859a3   0     0      0         UNAVAIL
da3p2                                             0     0      0         ONLINE
da2p2                                             0     0      0         DEGRADED
da1p2                                             0     0      0         DEGRADED
log
/dev/ada1p1                                       0     0      0         ONLINE
cache
ada0p1                                            0     0      0         ONLINE

Here’s the zpool status verbiage:

Warning: settings changed through the CLI are not written to
the configuration database and will be reset on reboot.

root@freenas[~]# zpool status -v
pool: LCARS
state: DEGRADED
status: One or more devices has experienced an error resulting in data
corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
entire pool from backup.
see:
scan: resilvered 78.8G in 0 days 17:39:15 with 235315 errors on Thu Mar 27 04:18:27 2025
config:

    NAME                                              STATE     READ WRITE CKSUM
    LCARS                                             DEGRADED     0     0  231K
      raidz2-0                                        DEGRADED     0     0  463K
        gptid/77a1f134-b8e7-11ea-a9c4-000c293859a3    DEGRADED     0     0   0  too many errors
        gptid/f13b673b-cddb-11ef-8d83-000c293859a3    DEGRADED     0     0   0  too many errors
        replacing-2                                   DEGRADED     0     0   0
          828688325357713696                          UNAVAIL      0     0   0  was /dev/gptid/78563cfc-b8e7-11ea-a9c4-000c293859a3
          gptid/36583b63-0491-11f0-b70a-000c293859a3  ONLINE       0     0   0
        gptid/78c869f7-b8e7-11ea-a9c4-000c293859a3    DEGRADED     0     0   0  too many errors
        gptid/e93d987b-90e5-11ee-a86f-000c293859a3    ONLINE       0     0   0
        replacing-5                                   DEGRADED     0     0   0
          10017531497728148374                        UNAVAIL      0     0   0  was /dev/gptid/228aaf0a-d2c6-11ef-8d83-000c293859a3
          gptid/3ebee93e-fbf0-11ef-a4fc-000c293859a3  ONLINE       0     0   0
        gptid/78dc6496-b8e7-11ea-a9c4-000c293859a3    DEGRADED     0     0   0  too many errors
        gptid/1a847666-8f6c-11ee-af0f-000c293859a3    DEGRADED     0     0   0  too many errors
    logs
      ada1p1                                          ONLINE       0     0   0
    cache
      gptid/b1d3a3d6-b6c8-11eb-a7a6-000c293859a3      ONLINE       0     0   0

errors: Permanent errors have been detected in the following files:

    LCARS:<0x0>
    LCARS:<0x153101>
    LCARS:<0x153002>
    LCARS:<0x153104>
    LCARS:<0x153005>
    LCARS:<0x153107>
    LCARS:<0x153008>
    LCARS:<0x15310a>
    LCARS:<0x15300b>
    LCARS:<0x15310d>
    LCARS:<0x15300e>
    LCARS:<0x153110>
    LCARS:<0x153011>
    LCARS:<0x153113>
    LCARS:<0x153014>
    LCARS:<0x153116>
    LCARS:<0x152a17>
    LCARS:<0x153017>
    LCARS:<0x153119>
    LCARS:<0x15301a>
    LCARS:<0x15311c>
    LCARS:<0x15301d>
    LCARS:<0x15311f>
    LCARS:<0x153020>
    LCARS:<0x153122>
    LCARS:<0x153023>
    LCARS:<0x152124>
    LCARS:<0x152125>
    LCARS:<0x153125>
    LCARS:<0x152126>
    LCARS:<0x153026>
    LCARS:<0x153128>
    LCARS:<0x152129>
    LCARS:<0x153029>
    LCARS:<0x15212a>
    LCARS:<0x15212b>
    LCARS:<0x15312b>
    LCARS:<0x15302c>
    LCARS:<0x15312e>
    LCARS:<0x15212f>
    LCARS:<0x15302f>
    LCARS:<0x153131>
    LCARS:<0x152132>
    LCARS:<0x153032>
    LCARS:<0x152133>
    LCARS:<0x152134>
    LCARS:<0x153134>
    /mnt/LCARS/username/2025/Photos
    LCARS:<0x152135>
    LCARS:<0x153035>
    LCARS:<0x152136>
    LCARS:<0x152a37>
    LCARS:<0x153137>
    LCARS:<0x152138>
    LCARS:<0x153038>
    LCARS:<0x15213a>
    LCARS:<0x15313a>
    LCARS:<0x15213b>
    LCARS:<0x15303b>
    LCARS:<0x15213c>
    LCARS:<0x15213d>
    LCARS:<0x15313d>
    LCARS:<0x15213e>
    LCARS:<0x15303e>
    LCARS:<0x153140>
    LCARS:<0x152141>
    LCARS:<0x153041>
    LCARS:<0x152143>
    LCARS:<0x153143>
    LCARS:<0x153044>
    LCARS:<0x152145>
    LCARS:<0x153146>
    LCARS:<0x153047>
    LCARS:<0x153149>
    LCARS:<0x15214a>
    LCARS:<0x15304a>
    LCARS:<0x15314c>
    LCARS:<0x15214d>
    LCARS:<0x15304d>
    LCARS:<0x15214f>
    LCARS:<0x15314f>
    LCARS:<0x153050>
    LCARS:<0x153152>
    LCARS:<0x153053>
    LCARS:<0x153155>
    LCARS:<0x153056>
    LCARS:<0x153158>
    LCARS:<0x153059>
    LCARS:<0x15315b>
    LCARS:<0x15305c>
    LCARS:<0x15315e>
    LCARS:<0x15305f>
    LCARS:<0x153161>
    LCARS:<0x153062>
    LCARS:<0x153164>
    LCARS:<0x153065>
    LCARS:<0x153167>
    LCARS:<0x153068>
    LCARS:<0x15316a>
    LCARS:<0x15306b>
    LCARS:<0x15316d>
    LCARS:<0x15306e>
    LCARS:<0x152f6f>
    LCARS:<0x153170>
    LCARS:<0x153071>
    LCARS:<0x151f72>
    LCARS:<0x152f72>
    LCARS:<0x153173>
    LCARS:<0x153074>
    LCARS:<0x152f75>
    LCARS:<0x153176>
    LCARS:<0x153077>
    LCARS:<0x152f78>
    LCARS:<0x152079>
    LCARS:<0x153179>
    LCARS:<0x15307a>
    LCARS:<0x152f7b>
    LCARS:<0x15317c>
    LCARS:<0x15307d>
    LCARS:<0x152f7e>
    LCARS:<0x15317f>
    LCARS:<0x151f80>
    LCARS:<0x153080>
    LCARS:<0x152f81>
    LCARS:<0x153182>
    LCARS:<0x153083>
    LCARS:<0x152f84>
    LCARS:<0x153185>
    LCARS:<0x153086>
    LCARS:<0x152087>
    LCARS:<0x152f87>
    LCARS:<0x153188>
    LCARS:<0x153089>
    LCARS:<0x152f8a>
    LCARS:<0x15318b>
    LCARS:<0x151f8c>
    LCARS:<0x15308c>
    LCARS:<0x152f8d>
    LCARS:<0x15208e>
    LCARS:<0x15318e>
    LCARS:<0x15308f>
    LCARS:<0x152f90>
    LCARS:<0x153191>
    LCARS:<0x153092>
    LCARS:<0x152f93>
    LCARS:<0x153194>
    LCARS:<0x153095>
    LCARS:<0x152096>
    LCARS:<0x152f96>
    LCARS:<0x153197>
    LCARS:<0x153098>
    LCARS:<0x152099>
    LCARS:<0x152f99>
    LCARS:<0x15319a>
    LCARS:<0x15309b>
    LCARS:<0x152f9c>
    LCARS:<0x15319d>
    LCARS:<0x15209e>
    LCARS:<0x15309e>
    LCARS:<0x152f9f>
    LCARS:<0x1531a0>
    LCARS:<0x1520a1>
    LCARS:<0x1530a1>
    LCARS:<0x152fa2>
    LCARS:<0x1531a3>
    LCARS:<0x1520a4>
    LCARS:<0x1530a4>
    LCARS:<0x152fa5>
    LCARS:<0x1531a6>
    LCARS:<0x1530a7>
    LCARS:<0x152fa8>
    LCARS:<0x1531a9>
    LCARS:<0x1530aa>
    LCARS:<0x152fab>
    LCARS:<0x1531ac>
    LCARS:<0x1530ad>
    LCARS:<0x152fae>
    LCARS:<0x1520af>
    LCARS:<0x1531af>
    LCARS:<0x1520b0>
    LCARS:<0x1530b0>
    LCARS:<0x152fb1>
    LCARS:<0x1531b2>
    LCARS:<0x1530b3>
    LCARS:<0x1520b4>
    LCARS:<0x152fb4>
    LCARS:<0x1531b5>
    LCARS:<0x1530b6>
    LCARS:<0x152fb7>
    LCARS:<0x1531b8>
    LCARS:<0x1520b9>
    LCARS:<0x1530b9>
    LCARS:<0x152fba>
    LCARS:<0x1531bb>
    LCARS:<0x1530bc>
    LCARS:<0x152fbd>
    LCARS:<0x1531be>
    LCARS:<0x1530bf>
    LCARS:<0x152fc0>
    LCARS:<0x1531c1>
    LCARS:<0x1530c2>
    LCARS:<0x152fc3>
    LCARS:<0x1531c4>
    LCARS:<0x1530c5>
    LCARS:<0x152fc6>
    LCARS:<0x1531c7>
    LCARS:<0x1530c8>
    LCARS:<0x152fc9>
    LCARS:<0x1531ca>
    LCARS:<0x1520cb>
    LCARS:<0x1530cb>
    LCARS:<0x152fcc>
    LCARS:<0x1531cd>
    LCARS:<0x1520ce>
    LCARS:<0x1530ce>
    LCARS:<0x152fcf>
    LCARS:<0x1531d0>
    LCARS:<0x1530d1>
    LCARS:<0x152fd2>
    LCARS:<0x1520d3>
    LCARS:<0x1531d3>
    LCARS:<0x1530d4>
    LCARS:<0x1520d5>
    LCARS:<0x152fd5>
    LCARS:<0x1531d6>
    LCARS:<0x1530d7>
    LCARS:<0x1520d8>
    LCARS:<0x152fd8>
    LCARS:<0x1520d9>
    LCARS:<0x1531d9>
    LCARS:<0x1530da>
    LCARS:<0x1520db>
    LCARS:<0x152fdb>
    LCARS:<0x1520dc>
    LCARS:<0x1531dc>
    LCARS:<0x1530dd>
    LCARS:<0x152fde>
    LCARS:<0x1531df>
    LCARS:<0x1530e0>
    LCARS:<0x152fe1>
    LCARS:<0x1530e3>
    LCARS:<0x152fe4>
    LCARS:<0x1530e6>
    LCARS:<0x152fe7>
    LCARS:<0x1530e9>
    LCARS:<0x152fea>
    LCARS:<0x1530ec>
    LCARS:<0x152fed>
    LCARS:<0x1530ef>
    LCARS:<0x152ff0>
    LCARS:<0x1520f2>
    LCARS:<0x1530f2>
    LCARS:<0x152ff3>
    LCARS:<0x1520f5>
    LCARS:<0x1530f5>
    LCARS:<0x152ff6>
    LCARS:<0x1530f8>
    LCARS:<0x152ff9>
    LCARS:<0x1520fb>
    LCARS:<0x1530fb>
    LCARS:<0x152ffc>
    LCARS:<0x1520fd>
    LCARS:<0x1530fe>
    LCARS:<0x1520ff>
    LCARS:<0x152fff>
    /var/db/system/syslog-646f8dae97d646cc8946ddeb0ca79d97/log/samba4/log.smbd
    /var/db/system/syslog-646f8dae97d646cc8946ddeb0ca79d97/log/messages.3
    /var/db/system/syslog-646f8dae97d646cc8946ddeb0ca79d97/log/messages.7.bz2
    /var/db/system/syslog-646f8dae97d646cc8946ddeb0ca79d97/log/middlewared.log
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250124.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250125.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250126.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250127.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250128.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250129.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250130.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250131.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250201.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250202.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250203.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250204.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250205.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250206.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250207.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250208.db
    /var/db/system/configs-646f8dae97d646cc8946ddeb0ca79d97/FreeNAS-11.3-U5/20250209.db

pool: freenas-boot
state: ONLINE
scan: scrub repaired 0 in 0 days 00:00:13 with 0 errors on Tue Mar 25 03:45:13 2025
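
For anyone searching later: the <0x…> entries above have lost their path metadata, so only the lines that still show a real path can be checked against a backup. A quick sketch for pulling those out, assuming the status output is saved to a file first (the filename here is just an example):

```shell
# Extract only the still-named file paths from saved `zpool status -v` output;
# LCARS:<0x...> entries have lost their path metadata and are skipped.
# Assumes data lives under /mnt and the system dataset under /var.
grep -E '^[[:space:]]+(/mnt|/var)/' zpool-status.txt \
  | sed 's/^[[:space:]]*//' > restore-list.txt
cat restore-list.txt
```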

I can access the data just fine at present, but I’m worried that could change if another drive fails and things get worse, so I want to be proactive.

Any ideas how I can make this pool healthy again? Do I just need to throw in the towel and re-do everything?

You can’t: There are irrecoverable errors in ZFS metadata.
Destroy and restore to a new pool from backup.
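
A sketch of that sequence, assuming the backup has already been verified complete and readable (the restore source path below is a placeholder):

```shell
# Irreversible -- run only after confirming the backup is good.
zpool destroy LCARS   # destroys the imported pool outright
# Recreating through the FreeNAS GUI (Storage -> Pools) keeps the middleware
# database in sync; then restore the data, e.g.:
# rsync -aHAX /path/to/backup/ /mnt/LCARS/
```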

Thanks for the guidance on this. Just in case my existing backups were missing any new data, I have spent the last month backing up all 23 terabytes.

In between doing that, I replaced 3 of the 8 installed 8TB SAS drives that were showing bad blocks or SMART errors.

After enduring the resilvers auto-triggered each time I replaced those drives, and again each time the local utility company over-volted my home’s electrical supply, today I tried to disconnect/export the pool.

Even after disconnecting shared drives and stopping all sharing services, a “Device busy” error continues to appear when disconnecting:

Error: Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 349, in run
    await self.future
  File "/usr/local/lib/python3.7/site-packages/middlewared/job.py", line 385, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.7/site-packages/middlewared/schema.py", line 961, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/middlewared/plugins/pool.py", line 2231, in export
    raise CallError(sysds_job.error)
middlewared.service_exception.CallError: [EFAULT] [Errno 16] Device busy: '/tmp/system.new'

Any idea how I get past this to finally disconnect & destroy, then recreate the pool?

I presume your zpool is still trying to resilver, is that correct?

If so, and you are sure that you want to destroy the pool, then you could try exporting it from the shell with ‘zpool export LCARS’ or, failing that, ‘zpool export -f LCARS’.
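
If even the forced export reports busy, it may help to see what is holding files open first; on FreeBSD-based FreeNAS, fstat can show this (mountpoint taken from this thread):

```shell
# List processes with files open on the pool's filesystem (FreeBSD).
fstat -f /mnt/LCARS
# The system dataset often pins the pool; moving it to the boot pool first
# (System -> System Dataset in the GUI) may release it.
```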

Looks to be all set now, thanks again!!!
