RAIDZ1 pool won’t import after forced spindown; one member stuck OFFLINE, two members’ labels unreadable; need guidance to restore availability

Digest mostly generated by an LLM, but proofread and accurate

Environment

  • TrueNAS SCALE: 25.04.x (kernel 6.12.15-production+truenas)
  • Original HBA: LSI/Broadcom MegaRAID SAS-3 3108 (megaraid_sas, IR/JBOD mode)
  • Current HBA: IT-mode controller on a fresh host (clean passthrough)
  • Pool p6: RAIDZ1 of 4× 6TB SAS (Seagate ST6000NM0095)
    • by-partuuid → device map on current host:
      • ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3 → sdc1
      • 92a1436f-fa72-4f84-837e-56baf0c5c966 → sde1
      • fc7b8c75-3044-45e8-9cd3-64b4e13b456a → sdf1
      • 39b0f028-396a-4ca7-8ebe-c3652d53ccc0 → sda1

What happened (chronological, ~5-minute span)

  • I was reducing power draw and attempted to spin down all p6 drives.
    • Pool busy, not allowing spindown
  • Forced spindown
    • Ran: zpool offline p6 ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3 → SUCCESS (pool went DEGRADED).
    • The remaining 3 were denied offline with “no valid replicas” and stayed ONLINE, since RAIDZ1 needs at least 3 of the 4 members present.
  • I GUI-exported the pool which succeeded with no error, then rebooted.
  • After reboot, pool import failed with “insufficient replicas.”
  • On the old system (IR controller), I saw repeated DID_BAD_TARGET errors; I then migrated all four disks to an IT-mode host for clean passthrough.

Current symptoms (IT-mode host)

  • Import discovery:
zpool import -d /dev/disk/by-partuuid
  pool: p6
    id: 814428843280364485
 state: UNAVAIL
status: One or more devices are faulted.
action: The pool cannot be imported due to damaged devices or data.
config:
        p6                                    UNAVAIL  insufficient replicas
          raidz1-0                            UNAVAIL  insufficient replicas
            ad8810bb-... (sdc1)               OFFLINE
            92a1436f-... (sde1)               ONLINE
            fc7b8c75-... (sdf1)               ONLINE
            39b0f028-... (sda1)               FAULTED  corrupted data
  • Labels/uberblocks:
# Good members (examples)
zdb -l /dev/sde1 | head -n 40
# state: 1, txg: 7505316, shows children; ad8810bb child has offline: 1

zdb -l /dev/sdf1 | head -n 40
# Similar; ad8810bb shows offline: 1

# Previously OFFLINE member (ad8810bb → sdc1)
zdb -l /dev/sdc1
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3

# Previously FAULTED member (39b0f028 → sda1)
zdb -l /dev/sda1
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
  • SMART on the old system showed drives healthy; today the issue is metadata/labels, not mechanical failure.

What I tried (non-destructive)

  • All combinations of:
    • zpool import -d /dev[/disk/by-id|/disk/by-partuuid|/tmp/symlinks] -f -N -F -X [p6|<guid>]
    • zpool import -f -N -o readonly=on, with and without explicit -T <txg>
  • Targeted import with only the 3 “best” devices (a directory of symlinks excluding the FAULTED one; sketch below) → still “insufficient replicas.”
  • zpool labelclear on the FAULTED member (on IR system it was sdh1; on new host it’s sda1) → often refused or labels already unreadable.
  • Verified that the “offline: 1” is present in the child nvlist within labels on the good members, but the OFFLINE device’s own labels are unreadable.
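For clarity, the symlink-directory attempt above looked roughly like this (directory name arbitrary; partuuids from the map above):

mkdir -p /tmp/p6dev
ln -s /dev/disk/by-partuuid/ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3 /tmp/p6dev/
ln -s /dev/disk/by-partuuid/92a1436f-fa72-4f84-837e-56baf0c5c966 /tmp/p6dev/
ln -s /dev/disk/by-partuuid/fc7b8c75-3044-45e8-9cd3-64b4e13b456a /tmp/p6dev/
zpool import -d /tmp/p6dev -f -N -o readonly=on p6   # still “insufficient replicas”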

My understanding

  • The vdev OFFLINE flag is stored in the on-disk pool metadata (labels/MOS). It can only be cleared via zpool online after a successful import.
  • At present I only have 2/4 members with readable labels (sde1, sdf1). The OFFLINE member’s own labels (sdc1) are unreadable, and the fourth (sda1) also has bad labels. So import fails for RAIDZ1.
  • However, even with a single disk connected to a fresh system, the OFFLINE state is still readable from the surviving labels; there is just no clear way to write it back (quick check below).
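A quick check that the flag really lives in the survivors’ labels (output paraphrased; exact nvlist layout may differ):

for d in /dev/sde1 /dev/sdf1; do
  echo "== $d =="
  zdb -l "$d" | grep -B 2 'offline: 1'   # the matching child is ad8810bb on both
done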

What I’m asking help with

  • Best-practice recovery path to regain a 3-of-4 readable set so I can import degraded and immediately zpool online the previously OFFLINE device.
  • It is unlikely there was physical damage or bad data written during the few seconds between offlining the first disk and exporting the pool.
  • There was no task actively running on the pool at the time of the incident.
  • Specifically:
    • Is it advisable to ddrescue the previously OFFLINE disk (sdc) to a same-size spare to try to recover its labels, then retry import with the clone? (rough sketch after this list)
    • Is there any safe way to force ZFS to ignore the bad member and consider the OFFLINE device present (i.e., make the set 3-of-4) long enough to import read-only and flip it online?
    • Recommended order of operations and flags for import rewind when one child shows offline: 1 in parents’ labels but its own labels are unreadable.
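For the ddrescue question, the procedure I have in mind is roughly this (GNU ddrescue; /dev/sdX is a hypothetical same-size spare, cloning the whole disk so the partition table comes along):

ddrescue -f -r3 /dev/sdc /dev/sdX /root/sdc.map   # -f writes to a block device; 3 retry passes; keep a mapfile
partprobe /dev/sdX
zdb -l /dev/sdX1                                  # hope for four readable labels on the clone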

Constraints and goals

  • Goal: restore p6 availability with minimal or no data loss by importing degraded and resilvering.
  • I can attach the disks to either the new IT-mode host or temporarily to the original system if that helps recovery, but prefer IT-mode.
  • I will avoid any destructive actions on the two bad-label members until advised; I can clone to new drives if that’s the right first step (I have two spares).

Most relevant snippets (for quick reference)

# Shows OFFLINE in child nvlist on good members
zdb -l /dev/sde1 | sed -n '1,120p'
# ... children[0] ad8810bb ... offline: 1

# Unreadable labels on ad8810bb (sdc1) and 39b0f028 (sda1)
zdb -l /dev/sdc1   # all four labels: failed to unpack
zdb -l /dev/sda1   # all four labels: failed to unpack

Thank you for any expert guidance (order of ddrescue, label repair, import flags, or proven procedures) to get this pool imported and start a resilver.

You might’ve wanted to adjust that after copying from ChatGPT :sweat_smile: (not that the AI generated text wouldn’t have been obvious).


We might have to wait for a real ZFS expert to show up for this situation, but could you post some more information in the meantime (complete HW information, and SMART data of all disks: smartctl -x /dev/sdX)?

Thanks! Here’s the full setup (fresh install of TrueNAS on a Dell mini PC): HW info and smartctl -x output for all 6TB members of pool p6.

Issue recap: after offlining one vdev to spin down, export + reboot, the pool won’t import (UNAVAIL/insufficient replicas). Two members have readable labels (sde1, sdf1); two have unreadable labels (sdc1, sda1). The OFFLINE flag for ad8810bb… shows in the parent labels, but its own labels are unreadable, so I can’t import to run zpool online. Disks appear physically healthy by SMART; looking for guidance to regain 3/4 so I can import degraded and resilver.
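In case it matters for any rewind advice: assuming zdb -l -u works on the two readable members, the newest uberblock txgs should be discoverable along these lines (uberblock entries print “txg = N”):

for d in /dev/sde1 /dev/sdf1; do
  echo "== $d =="
  zdb -lu "$d" | awk '/txg = /{print $3}' | sort -n | tail -3
done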

Linux truenas 6.12.15-production+truenas #1 SMP PREEMPT_DYNAMIC Mon May 26 13:44:31 UTC 2025 x86_64

        TrueNAS (c) 2009-2025, iXsystems, Inc. dba TrueNAS
        All rights reserved.
        TrueNAS code is released under the LGPLv3 and GPLv3 licenses with some
        source files copyrighted by (c) iXsystems, Inc. All other components
        are released under their own respective licenses.

        For more information, documentation, help or support, go here:
        http://truenas.com

Warning: the supported mechanisms for making configuration changes
are the TrueNAS WebUI, CLI, and API exclusively. ALL OTHERS ARE
NOT SUPPORTED AND WILL RESULT IN UNDEFINED BEHAVIOR AND MAY
RESULT IN SYSTEM FAILURE.

Welcome to TrueNAS
Last login: Thu Sep 25 02:19:04 PDT 2025 on pts/2
root@truenas[~]# uname -a
midclt call system.info
lspci -nn
lsblk -e7 -o NAME,SIZE,ROTA,TYPE,SERIAL,MODEL,TRAN
ls -l /dev/disk/by-id
ls -l /dev/disk/by-partuuid
Linux truenas 6.12.15-production+truenas #1 SMP PREEMPT_DYNAMIC Mon May 26 13:44:31 UTC 2025 x86_64 GNU/Linux
{"version": "25.04.1", "buildtime": {"$date": 1748237893000}, "hostname": "truenas", "physmem": 8255094784, "model": "Intel(R) Core(TM) i7-6700 CPU @ 3.40GHz", "cores": 8, "physical_cores": 4, "loadavg": [0.0, 0.0146484375, 0.0], "uptime": "1:34:33.754188", "uptime_seconds": 5673.754188198, "system_serial": "9MSFJB2", "system_product": "Precision Tower 3420", "system_product_version": "Not Specified", "license": null, "boottime": {"$date": 1758787049000}, "datetime": {"$date": 1758792723000}, "timezone": "America/Los_Angeles", "system_manufacturer": "Dell Inc.", "ecc_memory": false}
00:00.0 Host bridge [0600]: Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers [8086:191f] (rev 07)
00:01.0 PCI bridge [0604]: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) [8086:1901] (rev 07)
00:14.0 USB controller [0c03]: Intel Corporation 100 Series/C230 Series Chipset Family USB 3.0 xHCI Controller [8086:a12f] (rev 31)
00:14.2 Signal processing controller [1180]: Intel Corporation 100 Series/C230 Series Chipset Family Thermal Subsystem [8086:a131] (rev 31)
00:16.0 Communication controller [0780]: Intel Corporation 100 Series/C230 Series Chipset Family MEI Controller #1 [8086:a13a] (rev 31)
00:17.0 SATA controller [0106]: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] [8086:a102] (rev 31)
00:1c.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #6 [8086:a115] (rev f1)
00:1d.0 PCI bridge [0604]: Intel Corporation 100 Series/C230 Series Chipset Family PCI Express Root Port #9 [8086:a118] (rev f1)
00:1f.0 ISA bridge [0601]: Intel Corporation C236 Chipset LPC/eSPI Controller [8086:a149] (rev 31)
00:1f.2 Memory controller [0580]: Intel Corporation 100 Series/C230 Series Chipset Family Power Management Controller [8086:a121] (rev 31)
00:1f.3 Audio device [0403]: Intel Corporation 100 Series/C230 Series Chipset Family HD Audio Controller [8086:a170] (rev 31)
00:1f.4 SMBus [0c05]: Intel Corporation 100 Series/C230 Series Chipset Family SMBus [8086:a123] (rev 31)
00:1f.6 Ethernet controller [0200]: Intel Corporation Ethernet Connection (2) I219-LM [8086:15b7] (rev 31)
01:00.0 Serial Attached SCSI controller [0107]: Broadcom / LSI SAS2308 PCI-Express Fusion-MPT SAS-2 [1000:0087] (rev 05)
02:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS525A PCI Express Card Reader [10ec:525a] (rev 01)
03:00.0 VGA compatible controller [0300]: NVIDIA Corporation GM107GL [Quadro K620] [10de:13bb] (rev a2)
03:00.1 Audio device [0403]: NVIDIA Corporation GM107 High Definition Audio Controller [GeForce 940MX] [10de:0fbc] (rev a1)
NAME     SIZE ROTA TYPE SERIAL               MODEL                       TRAN
sda      5.5T    1 disk ZAD3JXA30000C8262SJP ST6000NM0095                sas
└─sda1   5.5T    1 part                                                  
sdb    238.5G    0 disk S0TZNYAG106325       SAMSUNG SSD PM830 2.5 256GB sata
├─sdb1     1M    0 part                                                  
├─sdb2   512M    0 part                                                  
└─sdb3   238G    0 part                                                  
sdc      5.5T    1 disk ZAD3JY190000C8262QF1 ST6000NM0095                sas
└─sdc1   5.5T    1 part                                                  
sdd    238.5G    0 disk S0TZNYAG106451       SAMSUNG SSD PM830 2.5 256GB sata
├─sdd1     1M    0 part                                                  
├─sdd2   512M    0 part                                                  
└─sdd3   238G    0 part                                                  
sde      5.5T    1 disk ZAD3K8QF0000C8264Z11 ST6000NM0095                sas
└─sde1   5.5T    1 part                                                  
sdf      5.5T    1 disk ZAD3JXJQ0000C8262RAP ST6000NM0095                sas
└─sdf1   5.5T    1 part                                                  
total 0
lrwxrwxrwx 1 root root  9 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106325 -> ../../sdb
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106325-part1 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106325-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106325-part3 -> ../../sdb3
lrwxrwxrwx 1 root root  9 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106451 -> ../../sdd
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106451-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106451-part2 -> ../../sdd2
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ata-SAMSUNG_SSD_PM830_2.5_256GB_S0TZNYAG106451-part3 -> ../../sdd3
lrwxrwxrwx 1 root root  9 Sep 25 01:02 scsi-35000c5009520c4bf -> ../../sdc
lrwxrwxrwx 1 root root 10 Sep 25 01:02 scsi-35000c5009520c4bf-part1 -> ../../sdc1
lrwxrwxrwx 1 root root  9 Sep 25 01:03 scsi-35000c5009520d537 -> ../../sdf
lrwxrwxrwx 1 root root 10 Sep 25 01:03 scsi-35000c5009520d537-part1 -> ../../sdf1
lrwxrwxrwx 1 root root  9 Sep 24 21:00 scsi-35000c5009520e453 -> ../../sda
lrwxrwxrwx 1 root root 10 Sep 25 01:09 scsi-35000c5009520e453-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 Sep 25 01:02 scsi-35000c50095213f23 -> ../../sde
lrwxrwxrwx 1 root root 10 Sep 25 01:02 scsi-35000c50095213f23-part1 -> ../../sde1
lrwxrwxrwx 1 root root  9 Sep 25 01:02 wwn-0x5000c5009520c4bf -> ../../sdc
lrwxrwxrwx 1 root root 10 Sep 25 01:02 wwn-0x5000c5009520c4bf-part1 -> ../../sdc1
lrwxrwxrwx 1 root root  9 Sep 25 01:03 wwn-0x5000c5009520d537 -> ../../sdf
lrwxrwxrwx 1 root root 10 Sep 25 01:03 wwn-0x5000c5009520d537-part1 -> ../../sdf1
lrwxrwxrwx 1 root root  9 Sep 24 21:00 wwn-0x5000c5009520e453 -> ../../sda
lrwxrwxrwx 1 root root 10 Sep 25 01:09 wwn-0x5000c5009520e453-part1 -> ../../sda1
lrwxrwxrwx 1 root root  9 Sep 25 01:02 wwn-0x5000c50095213f23 -> ../../sde
lrwxrwxrwx 1 root root 10 Sep 25 01:02 wwn-0x5000c50095213f23-part1 -> ../../sde1
lrwxrwxrwx 1 root root  9 Sep 24 21:00 wwn-0x5002538043584d30 -> ../../sdb
lrwxrwxrwx 1 root root 10 Sep 24 21:00 wwn-0x5002538043584d30-part1 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Sep 24 21:00 wwn-0x5002538043584d30-part2 -> ../../sdb2
lrwxrwxrwx 1 root root 10 Sep 24 21:00 wwn-0x5002538043584d30-part3 -> ../../sdd3
total 0
lrwxrwxrwx 1 root root 10 Sep 24 21:00 286321b9-c393-4d16-a6ab-de97b0718820 -> ../../sdb3
lrwxrwxrwx 1 root root 10 Sep 25 01:09 39b0f028-396a-4ca7-8ebe-c3652d53ccc0 -> ../../sda1
lrwxrwxrwx 1 root root 10 Sep 24 21:00 3ae1232b-816d-41d4-984a-3b7a7ac2c803 -> ../../sdb1
lrwxrwxrwx 1 root root 10 Sep 24 21:00 5706e131-b31a-4853-8c34-b794176ffbd7 -> ../../sdd3
lrwxrwxrwx 1 root root 10 Sep 24 21:00 6ff5a060-5fd6-4fee-aab1-766315bd4fe3 -> ../../sdd1
lrwxrwxrwx 1 root root 10 Sep 25 01:02 92a1436f-fa72-4f84-837e-56baf0c5c966 -> ../../sde1
lrwxrwxrwx 1 root root 10 Sep 24 21:00 979870b6-6509-4bba-8062-ff1aa0d4b71d -> ../../sdb2
lrwxrwxrwx 1 root root 10 Sep 24 21:00 ad0505ea-9312-4728-96a5-45b64140e457 -> ../../sdd2
lrwxrwxrwx 1 root root 10 Sep 25 01:02 ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3 -> ../../sdc1
lrwxrwxrwx 1 root root 10 Sep 25 01:03 fc7b8c75-3044-45e8-9cd3-64b4e13b456a -> ../../sdf1
root@truenas[~]# zpool import -d /dev/disk/by-partuuid
  pool: p6
    id: 814428843280364485
 state: UNAVAIL
status: One or more devices are faulted.
action: The pool cannot be imported due to damaged devices or data.
config:

        p6                                        UNAVAIL  insufficient replicas
          raidz1-0                                UNAVAIL  insufficient replicas
            ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3  OFFLINE
            92a1436f-fa72-4f84-837e-56baf0c5c966  ONLINE
            fc7b8c75-3044-45e8-9cd3-64b4e13b456a  ONLINE
            39b0f028-396a-4ca7-8ebe-c3652d53ccc0  FAULTED  corrupted data



Short SMART run just now:

root@truenas[~]# for d in sda sdc sde sdf; do echo "=== /dev/$d ==="; smartctl -x /dev/$d; echo; done
=== /dev/sda ===
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST6000NM0095
Revision:             E004
Compliance:           SPC-4
User Capacity:        6,001,175,126,016 bytes [6.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c5009520e453
Serial number:        ZAD3JXA30000C8262SJP
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Thu Sep 25 02:32:24 2025 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     31 C
Drive Trip Temperature:        60 C

Manufactured in week 03 of year 2018
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  396
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  78243
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 2999503456
  Blocks received from initiator = 2555514752
  Blocks read from cache and sent to initiator = 1417216694
  Number of read and write commands whose size <= segment size = 146333548
  Number of read and write commands whose size > segment size = 2216571

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 50423.03
  number of minutes until next internal SMART test = 19

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   1087086526        0         0  1087086526          0     113659.417           0
write:         0        0         0         0          0      30021.245           0
verify: 18866084        0         0  18866084          0         26.514           0

Non-medium error count:      233


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   50422                 - [-   -    -]
# 2  Background short  Completed                   -   50422                 - [-   -    -]
# 3  Background short  Completed                   -   50418                 - [-   -    -]
# 4  Background short  Completed                   -   50403                 - [-   -    -]
# 5  Background short  Completed                   -   50389                 - [-   -    -]
# 6  Background short  Completed                   -   50365                 - [-   -    -]
# 7  Background short  Completed                   -   50341                 - [-   -    -]
# 8  Background short  Completed                   -   50321                 - [-   -    -]
# 9  Background short  Completed                   -   50307                 - [-   -    -]
#10  Background short  Completed                   -   50303                 - [-   -    -]
#11  Background short  Completed                   -   50302                 - [-   -    -]
#12  Background short  Completed                   -   50285                 - [-   -    -]
#13  Background short  Completed                   -   50261                 - [-   -    -]
#14  Background short  Completed                   -   50237                 - [-   -    -]
#15  Background long   Completed                   -   50224                 - [-   -    -]
#16  Background short  Completed                   -   50213                 - [-   -    -]
#17  Background short  Completed                   -   50189                 - [-   -    -]
#18  Background short  Completed                   -   50165                 - [-   -    -]
#19  Background short  Completed                   -   50141                 - [-   -    -]
#20  Background short  Completed                   -   50117                 - [-   -    -]

Long (extended) Self-test duration: 38632 seconds [10.7 hours]

Background scan results log
  Status: no scans active
    Accumulated power on time, hours:minutes 50423:02 [3025382 minutes]
    Number of background scans performed: 0,  scan progress: 0.00%
    Number of background medium scans performed: 0
Device does not support General statistics and performance logging

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 11
  number of phys = 1
  phy identifier = 0
    attached device type: SAS or SATA device
    attached reason: power on
    reason: loss of dword synchronization
    negotiated logical link rate: phy enabled; 6 Gbps
    attached initiator port: ssp=1 stp=1 smp=1
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c5009520e451
    attached SAS address = 0x56c92bf00004817e
    attached phy identifier = 2
    Invalid DWORD count = 16
    Running disparity error count = 16
    Loss of DWORD synchronization count = 25
    Phy reset problem count = 0
relative target port id = 2
  generation code = 11
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: unknown
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c5009520e452
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0


=== /dev/sdc ===
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST6000NM0095
Revision:             E004
Compliance:           SPC-4
User Capacity:        6,001,175,126,016 bytes [6.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c5009520c4bf
Serial number:        ZAD3JY190000C8262QF1
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Thu Sep 25 02:32:25 2025 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     31 C
Drive Trip Temperature:        60 C

Manufactured in week 03 of year 2018
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  480
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  78171
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 816031952
  Blocks received from initiator = 751352216
  Blocks read from cache and sent to initiator = 2718889444
  Number of read and write commands whose size <= segment size = 139988149
  Number of read and write commands whose size > segment size = 1932866

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 50418.53
  number of minutes until next internal SMART test = 41

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   362576046        0         0  362576046          0     114740.663           0
write:         0        0         0         0          0      29067.110           0
verify: 18735805        0         0  18735805          0         26.354           0

Non-medium error count:      374


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   50418                 - [-   -    -]
# 2  Background short  Completed                   -   50418                 - [-   -    -]
# 3  Background short  Completed                   -   50414                 - [-   -    -]
# 4  Background short  Completed                   -   50399                 - [-   -    -]
# 5  Background short  Completed                   -   50385                 - [-   -    -]
# 6  Background short  Completed                   -   50361                 - [-   -    -]
# 7  Background short  Completed                   -   50337                 - [-   -    -]
# 8  Background short  Completed                   -   50317                 - [-   -    -]
# 9  Background short  Completed                   -   50303                 - [-   -    -]
#10  Background short  Completed                   -   50299                 - [-   -    -]
#11  Background short  Completed                   -   50298                 - [-   -    -]
#12  Background short  Completed                   -   50281                 - [-   -    -]
#13  Background short  Completed                   -   50257                 - [-   -    -]
#14  Background short  Completed                   -   50233                 - [-   -    -]
#15  Background long   Completed                   -   50220                 - [-   -    -]
#16  Background short  Completed                   -   50209                 - [-   -    -]
#17  Background short  Completed                   -   50185                 - [-   -    -]
#18  Background short  Completed                   -   50161                 - [-   -    -]
#19  Background short  Completed                   -   50137                 - [-   -    -]
#20  Background short  Completed                   -   50113                 - [-   -    -]

Long (extended) Self-test duration: 38632 seconds [10.7 hours]

Background scan results log
  Status: no scans active
    Accumulated power on time, hours:minutes 50418:32 [3025112 minutes]
    Number of background scans performed: 0,  scan progress: 0.00%
    Number of background medium scans performed: 0
Device does not support General statistics and performance logging

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 0
  number of phys = 1
  phy identifier = 0
    attached device type: SAS or SATA device
    attached reason: loss of dword synchronization
    reason: power on
    negotiated logical link rate: phy enabled; 6 Gbps
    attached initiator port: ssp=1 stp=1 smp=1
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c5009520c4bd
    attached SAS address = 0x56c92bf00004817f
    attached phy identifier = 7
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
relative target port id = 2
  generation code = 0
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: unknown
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c5009520c4be
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0


=== /dev/sde ===
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST6000NM0095
Revision:             E004
Compliance:           SPC-4
User Capacity:        6,001,175,126,016 bytes [6.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c50095213f23
Serial number:        ZAD3K8QF0000C8264Z11
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Thu Sep 25 02:32:26 2025 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     31 C
Drive Trip Temperature:        60 C

Manufactured in week 03 of year 2018
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  702
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  78465
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 1707109672
  Blocks received from initiator = 3608721656
  Blocks read from cache and sent to initiator = 2386026923
  Number of read and write commands whose size <= segment size = 144643678
  Number of read and write commands whose size > segment size = 1972171

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 50086.80
  number of minutes until next internal SMART test = 41

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   147524181        0         0  147524181          0     117395.680           0
write:         0        0         0         0          0      28153.005           0
verify: 19756817        0         0  19756817          0         26.593           0

Non-medium error count:      729


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   50086                 - [-   -    -]
# 2  Background short  Completed                   -   50086                 - [-   -    -]
# 3  Background short  Completed                   -   50082                 - [-   -    -]
# 4  Background short  Completed                   -   50067                 - [-   -    -]
# 5  Background short  Completed                   -   50053                 - [-   -    -]
# 6  Background short  Completed                   -   50029                 - [-   -    -]
# 7  Background short  Completed                   -   50005                 - [-   -    -]
# 8  Background short  Completed                   -   49985                 - [-   -    -]
# 9  Background short  Completed                   -   49971                 - [-   -    -]
#10  Background short  Completed                   -   49967                 - [-   -    -]
#11  Background short  Completed                   -   49966                 - [-   -    -]
#12  Background short  Completed                   -   49949                 - [-   -    -]
#13  Background short  Completed                   -   49925                 - [-   -    -]
#14  Background short  Completed                   -   49901                 - [-   -    -]
#15  Background long   Completed                   -   49889                 - [-   -    -]
#16  Background short  Completed                   -   49877                 - [-   -    -]
#17  Background short  Completed                   -   49853                 - [-   -    -]
#18  Background short  Completed                   -   49829                 - [-   -    -]
#19  Background short  Completed                   -   49805                 - [-   -    -]
#20  Background short  Completed                   -   49781                 - [-   -    -]

Long (extended) Self-test duration: 38632 seconds [10.7 hours]

Background scan results log
  Status: no scans active
    Accumulated power on time, hours:minutes 50086:48 [3005208 minutes]
    Number of background scans performed: 0,  scan progress: 0.00%
    Number of background medium scans performed: 0
Device does not support General statistics and performance logging

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 0
  number of phys = 1
  phy identifier = 0
    attached device type: SAS or SATA device
    attached reason: loss of dword synchronization
    reason: power on
    negotiated logical link rate: phy enabled; 6 Gbps
    attached initiator port: ssp=1 stp=1 smp=1
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c50095213f21
    attached SAS address = 0x56c92bf000048180
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0
relative target port id = 2
  generation code = 0
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: unknown
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c50095213f22
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0


=== /dev/sdf ===
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               SEAGATE
Product:              ST6000NM0095
Revision:             E004
Compliance:           SPC-4
User Capacity:        6,001,175,126,016 bytes [6.00 TB]
Logical block size:   512 bytes
Physical block size:  4096 bytes
LU is fully provisioned
Rotation Rate:        7200 rpm
Form Factor:          3.5 inches
Logical Unit id:      0x5000c5009520d537
Serial number:        ZAD3JXJQ0000C8262RAP
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Thu Sep 25 02:32:27 2025 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Enabled
Read Cache is:        Enabled
Writeback Cache is:   Enabled

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Grown defects during certification <not available>
Total blocks reassigned during format <not available>
Total new blocks reassigned <not available>
Power on minutes since format <not available>
Current Drive Temperature:     30 C
Drive Trip Temperature:        60 C

Manufactured in week 03 of year 2018
Specified cycle count over device lifetime:  10000
Accumulated start-stop cycles:  529
Specified load-unload count over device lifetime:  300000
Accumulated load-unload cycles:  78159
Elements in grown defect list: 0

Vendor (Seagate Cache) information
  Blocks sent to initiator = 2309912688
  Blocks received from initiator = 1080820352
  Blocks read from cache and sent to initiator = 2705944289
  Number of read and write commands whose size <= segment size = 152082454
  Number of read and write commands whose size > segment size = 2060547

Vendor (Seagate/Hitachi) factory information
  number of hours powered up = 50415.40
  number of minutes until next internal SMART test = 41

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:   1998782114        0         0  1998782114          0     115505.372           0
write:         0        0         0         0          0      29202.479           0
verify: 19576823        0         0  19576823          0         26.512           0

Non-medium error count:      165


[GLTSD (Global Logging Target Save Disable) set. Enable Save with '-S on']
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   50415                 - [-   -    -]
# 2  Background short  Completed                   -   50415                 - [-   -    -]
# 3  Background short  Completed                   -   50411                 - [-   -    -]
# 4  Background short  Completed                   -   50396                 - [-   -    -]
# 5  Background short  Completed                   -   50382                 - [-   -    -]
# 6  Background short  Completed                   -   50358                 - [-   -    -]
# 7  Background short  Completed                   -   50334                 - [-   -    -]
# 8  Background short  Completed                   -   50314                 - [-   -    -]
# 9  Background short  Completed                   -   50300                 - [-   -    -]
#10  Background short  Completed                   -   50295                 - [-   -    -]
#11  Background short  Completed                   -   50294                 - [-   -    -]
#12  Background short  Completed                   -   50278                 - [-   -    -]
#13  Background short  Completed                   -   50254                 - [-   -    -]
#14  Background short  Completed                   -   50230                 - [-   -    -]
#15  Background long   Completed                   -   50217                 - [-   -    -]
#16  Background short  Completed                   -   50206                 - [-   -    -]
#17  Background short  Completed                   -   50182                 - [-   -    -]
#18  Background short  Completed                   -   50158                 - [-   -    -]
#19  Background short  Completed                   -   50134                 - [-   -    -]
#20  Background short  Completed                   -   50110                 - [-   -    -]

Long (extended) Self-test duration: 38632 seconds [10.7 hours]

Background scan results log
  Status: no scans active
    Accumulated power on time, hours:minutes 50415:24 [3024924 minutes]
    Number of background scans performed: 0,  scan progress: 0.00%
    Number of background medium scans performed: 0
Device does not support General statistics and performance logging

Protocol Specific port log page for SAS SSP
relative target port id = 1
  generation code = 0
  number of phys = 1
  phy identifier = 0
    attached device type: SAS or SATA device
    attached reason: unknown
    reason: unknown
    negotiated logical link rate: phy enabled; 6 Gbps
    attached initiator port: ssp=1 stp=1 smp=1
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c5009520d535
    attached SAS address = 0x56c92bf000048181
    attached phy identifier = 3
    Invalid DWORD count = 4
    Running disparity error count = 4
    Loss of DWORD synchronization count = 1
    Phy reset problem count = 0
relative target port id = 2
  generation code = 0
  number of phys = 1
  phy identifier = 1
    attached device type: no device attached
    attached reason: unknown
    reason: unknown
    negotiated logical link rate: phy enabled; unknown
    attached initiator port: ssp=0 stp=0 smp=0
    attached target port: ssp=0 stp=0 smp=0
    SAS address = 0x5000c5009520d536
    attached SAS address = 0x0
    attached phy identifier = 0
    Invalid DWORD count = 0
    Running disparity error count = 0
    Loss of DWORD synchronization count = 0
    Phy reset problem count = 0


root@truenas[~]# 

Thank you for your guidance!
(the sun is rising, time for bed)

You should definitely stop copy-pasting stuff from ChatGPT into this forum (the first post was way longer than needed, and the second one as well; why paste the shell’s motd?), but especially into the CLI.

I am pretty sure that offlining more than one device from a RAIDZ1 wasn’t your idea but some AI hallucination thing.

All your HDDs have lots of non-medium errors logged, which might be due to the MegaRAID, but also insanely high ECC corrected-error counts.

Wait for an expert to show up (if they read that long AI generated post) and don’t touch that system in the meantime.

Yes, and plz dont be that guy.

Sort of. I suspect you will need to roll back your transactions, but I am not the correct person to tell you how, or even if this is the smart thing to do. We have @HoneyBadger, @etorix, @Arwen and many others more familiar with this type of problem. As @TheColin21 has said, don’t do anything until you hear from someone who knows what they are doing.

While I’m sure you realize that you should not be forcing your system to do things it doesn’t want to do, this is a great lesson for others wanting to do something similar: Don’t Do It.

Where did you get this command line from? I ask because you absolutely do not know what it does. It did succeed at offlining one of the drives in your pool, and trying to offline the others was a bad move as well. Those further attempts may have sealed your data’s fate by trying to remove more drives from your active RAIDZ1 pool.


Reading through the information, the pool is not recoverable in my limited opinion. This is based on:

  • You were using a hardware RAID controller. MegaRAID even in JBOD mode is not suitable. It may work and work for years, but if / when problems happen, they can be catastrophic. See elevator seeks & writes.
  • Offlining a disk on a RAID-Z1 to reduce power usage, caused the pool to be at risk, no redundancy.
  • Exporting the pool / rebooting simply added more risk, because a single fault in the remaining disks could ruin the pool.
  • The output from zdb -l indicates you have lost more than 1 disk in a vDev that can only lose 1 disk.

Offlining a disk basically takes it out of pool writes. If too much time passes or too many writes happen, it would need a full re-sync when onlined, not just an update of the changes since it was offlined.
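If a degraded import ever succeeds, bringing the offlined member back and starting that re-sync is simply:

zpool online p6 ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3
zpool status p6   # should then show a resilver in progress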

To remind others reading this thread in the future: TrueNAS and/or ZFS are not the only or best NAS for all uses. ZFS was not originally designed to save power. In fact, it is known to use more disk space and more power than other file systems. (Redundant metadata, Merkle-tree copy-on-write updates, and variable-width stripe writes all take more space, aka more write time & power. Plus scrubs, which should be done regularly, consume power.)

Now where you go from here are 3 choices:

  1. Recovery service
  2. Klennet ZFS Recovery software
  3. Manually attempt to figure out how to get one of your failed disks back.

The last would be easier if you had a same-sized or larger spare disk you could copy the original to, and then work on the spare, re-doing the copy as needed after failed fix attempts.
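A copy-on-write overlay can make that re-do loop cheap: a non-persistent device-mapper snapshot over the spare means failed experiments get discarded instead of re-copied. Rough sketch only, device names hypothetical:

SZ=$(blockdev --getsz /dev/sdX)        # spare's size in 512-byte sectors
truncate -s 16G /root/cow.img          # scratch file to absorb diverted writes
LOOP=$(losetup -f --show /root/cow.img)
dmsetup create p6try --table "0 $SZ snapshot /dev/sdX $LOOP N 8"
# experiment against /dev/mapper/p6try; throw everything away with:
#   dmsetup remove p6try && losetup -d "$LOOP"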


Thanks all for the guidance.

I’m starting full ddrescue imaging of the OFFLINE member (ad8810bb…/sdc) to a same-size spare, with retries and a log, and will verify labels on the clone. If that doesn’t yield readable labels, I’ll image the FAULTED member (39b0f028…/sda) next. I won’t write to the original disks. I’ll report back with zdb label/uberblock output from the clone and results of read-only import attempts (-F/-X).
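Concretely, once the clone exists, the retry would look something like this (clone device hypothetical; flags per zpool-import(8)):

mkdir -p /tmp/p6clone
ln -s /dev/sdX1 /tmp/p6clone/        # clone standing in for ad8810bb
ln -s /dev/disk/by-partuuid/92a1436f-fa72-4f84-837e-56baf0c5c966 /tmp/p6clone/
ln -s /dev/disk/by-partuuid/fc7b8c75-3044-45e8-9cd3-64b4e13b456a /tmp/p6clone/
zpool import -d /tmp/p6clone                              # discovery first
zpool import -d /tmp/p6clone -f -N -o readonly=on -F p6   # then read-only rewind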

For a bit more info on the exact timeline of what happened:
Here are the exact commands you pasted during the offlining phase, in order, with their outcomes.

  • Tried to offline the whole pool (invalid; needs device names)
zpool offline p6

Error excerpts:

usage:
        offline [--power]|[[-f][-t]] <pool> <device> ...
zsh: number expected
missing device name
  • Offlined one device (this succeeded; pool went DEGRADED)
zpool offline p6 ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3
  • Attempted to offline the remaining three (all failed: would leave no valid replicas in RAIDZ1)
zpool offline p6 92a1436f-fa72-4f84-837e-56baf0c5c966
zpool offline p6 fc7b8c75-3044-45e8-9cd3-64b4e13b456a
zpool offline p6 39b0f028-396a-4ca7-8ebe-c3652d53ccc0

Error (for each):

cannot offline <uuid>: no valid replicas
  • Tried to offline all three in one line (same failures)
zpool offline p6 92a1436f-fa72-4f84-837e-56baf0c5c966 fc7b8c75-3044-45e8-9cd3-64b4e13b456a 39b0f028-396a-4ca7-8ebe-c3652d53ccc0

Errors:

cannot offline 92a1436f-fa72-4f84-837e-56baf0c5c966: no valid replicas
cannot offline fc7b8c75-3044-45e8-9cd3-64b4e13b456a: no valid replicas
cannot offline 39b0f028-396a-4ca7-8ebe-c3652d53ccc0: no valid replicas
  • Attempted export (pool was busy)
zpool export p6

Error:

cannot export 'p6': pool is busy
  • Status showing one OFFLINE and three ONLINE (later)
zpool status p6 -v

Key lines:

p6  DEGRADED
  raidz1-0  DEGRADED
    ad8810bb-...  OFFLINE
    92a1436f-...  ONLINE
    fc7b8c75-...  ONLINE
    39b0f028-...  ONLINE (later became FAULTED)

Re-running the same offline requests did nothing, so I exported the pool (without deleting metadata) in the GUI.

That was the end of it…

Thank you all very much for your guidance! While I agree TrueNAS isn’t the most appropriate system for my needs, the community apps setup kept me using it. I will switch to a more open Linux or Windows GUI + Docker system if you recommend it.

Please pull the command you are attempting to use from your .zsh-histfile that you used when you “attempted to” and “forced” the spin down of all drives both in the first and second steps here:

  • I was reducing power draw and attempted to spin down all p6 drives.
    • Pool busy, not allowing spindown
  • Forced spindown

The full command history is very long, with lots of printouts just for discovering the system. Here is the digest from the full chat log:

Here’s the exact command timeline you ran around “attempted” and “forced” spindown, in order, with the terminal responses.

Phase A — initial spindown attempts (before reboot)

  • Attempted to offline the entire pool (invalid; needs devices):
zpool offline p6

Output:

usage:
        offline [--power]|[[-f][-t]] <pool> <device> ...
zsh: number expected
missing device name
  • Offlined one member (this succeeded; pool went DEGRADED):
zpool offline p6 ad8810bb-bb67-49c3-bfc8-e2bf1749d1c3
  • Tried to offline the remaining three (all failed: would leave no valid replicas for RAIDZ1):
zpool offline p6 92a1436f-fa72-4f84-837e-56baf0c5c966
zpool offline p6 fc7b8c75-3044-45e8-9cd3-64b4e13b456a
zpool offline p6 39b0f028-396a-4ca7-8ebe-c3652d53ccc0

Output (for each):

cannot offline <uuid>: no valid replicas
  • Tried to offline the three in one line (same failure for each):
zpool offline p6 92a1436f-fa72-4f84-837e-56baf0c5c966 fc7b8c75-3044-45e8-9cd3-64b4e13b456a 39b0f028-396a-4ca7-8ebe-c3652d53ccc0

Output:

cannot offline 92a1436f-fa72-4f84-837e-56baf0c5c966: no valid replicas
cannot offline fc7b8c75-3044-45e8-9cd3-64b4e13b456a: no valid replicas
cannot offline 39b0f028-396a-4ca7-8ebe-c3652d53ccc0: no valid replicas
  • Attempted export (pool was busy):
zpool export p6

Output:

cannot export 'p6': pool is busy
  • GUI export of the pool (no CLI output; user action in WebUI)

  • Reboot

Phase B — after reboot (import failures, discovery)

  • Import attempt:
zpool import p6

Output:

cannot import 'p6': one or more devices is currently unavailable
  • Force/rewind import attempts (all failed similarly):
zpool import -f -F p6
zpool import -f -F -X p6
zpool import -d /dev/disk/by-partuuid -f -N -F -X p6
zpool import -d /dev/disk/by-partuuid -f -N -F -T 7450600 p6
zpool import -d /tmp/p6 -f -N -F -X 814428843280364485
zpool import -d /dev -f -N -F -X p6

Representative output:

cannot import 'p6': one or more devices is currently unavailable
  • Status discovery showing OFFLINE/FAULTED layout (representative):
zpool import -d /dev/disk/by-partuuid

Output:

p6  UNAVAIL  insufficient replicas
  raidz1-0  UNAVAIL  insufficient replicas
    ad8810bb-...  OFFLINE
    92a1436f-...  ONLINE
    fc7b8c75-...  ONLINE
    39b0f028-...  FAULTED  corrupted data
  • Label checks (good members show child “offline: 1”; two members’ labels unreadable):
zdb -l /dev/sde1    # readable; shows children and offline: 1 for ad8810bb...
zdb -l /dev/sdf1    # readable; shows children and offline: 1 for ad8810bb...
zdb -l /dev/sdc1    # ad8810bb... → failed to unpack label 0/1/2/3
zdb -l /dev/sda1    # 39b0f028... → failed to unpack label 0/1/2/3

Phase C — other spindown attempts that did NOT execute (or errored immediately)

  • FreeBSD-style device names on SCALE (no matches):
ls -la /dev/da*
ls -la /dev/ada*

Output:

zsh: no matches found: /dev/da*
zsh: no matches found: /dev/ada*
  • hdparm to nonexistent device names (on SCALE, drives were sdX):
hdparm -Y /dev/da0

Output:

/dev/da0: No such file or directory
  • Comment lines accidentally executed (zsh treated ‘# …’ as command):
# This will completely shut down the pool and spin down all drives

Output:

zsh: command not found: #

That’s the full, chronological command trail around offlining/spindown, with errors.
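(For completeness: had the device names been right, the SAS-appropriate spindown would presumably have been a SCSI STOP UNIT rather than hdparm, e.g. one of the following, assuming sg3_utils/sdparm are installed. I never got this far.)

sg_start --stop /dev/sdc         # SCSI STOP UNIT
sdparm --command=stop /dev/sdc   # same thing via sdparm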

This reads like ChatGPT or another LLM filtering the responses (“the exact command timeline ‘you’ ran”). Raw commands pasted from your histfile are preferred, in case there’s an hdparm or similar it’s skipping over.
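Something like this would pull the relevant entries (path per the .zsh-histfile mentioned earlier):

grep -aE 'zpool|zdb|hdparm|sdparm|dd ' ~/.zsh-histfile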

But if this is accurate:

 p6  UNAVAIL  insufficient replicas
  raidz1-0  UNAVAIL  insufficient replicas
    ad8810bb-...  OFFLINE
    92a1436f-...  ONLINE
    fc7b8c75-...  ONLINE
    39b0f028-...  FAULTED  corrupted data

You offlined ad8810bb, which degraded the RAIDZ1. If something faulted on 39b0f028 during the period when you had no redundancy, that leaves two of four drives in a Z1 unavailable and therefore makes your pool UNAVAIL.

The make and model of your drives and details about your motherboard/storage controller may also be relevant here, as you have unreadable ZFS labels on two of your disks.


I no longer have the commands originally run, because that was on a different machine. TrueNAS crashes so frequently I have a whole dedicated setup just to recover things.

After dd’ing the OFFLINE disk to an unrelated disk:

truenas_admin@truenas[~]$ dd if=/dev/sde of=/dev/sdb bs=4M conv=noerror,sync iflag=direct oflag=direct status=progress
dd: failed to open ‘/dev/sde’: Permission denied
truenas_admin@truenas[~]$ sudo -i
[sudo] password for truenas_admin:
root@truenas[~]# dd if=/dev/sde of=/dev/sdb bs=4M conv=noerror,sync iflag=direct oflag=direct status=progress
301989888 bytes (302 MB, 288 MiB) copied, 3 s, 99.4 MB/s^C
89+0 records in
89+0 records out
373293056 bytes (373 MB, 356 MiB) copied, 3.76001 s, 99.3 MB/s

root@truenas[~]# dd if=/dev/sde of=/dev/sdb bs=32M conv=noerror,sync iflag=direct oflag=direct status=progress
6001176608768 bytes (6.0 TB, 5.5 TiB) copied, 72280 s, 83.0 MB/s
178848+1 records in
178849+0 records out
6001176608768 bytes (6.0 TB, 5.5 TiB) copied, 72279.5 s, 83.0 MB/s
root@truenas[~]# sudo partprobe /dev/sdb || true
sudo /usr/sbin/zdb -l /dev/sdb1 | head -n 120
Warning: Not all of the space available to /dev/sdb appears to be used, you can fix the GPT to use all of the space (an extra 3641331096 blocks) or continue with the current setting?
failed to unpack label 0
failed to unpack label 1
failed to unpack label 2
failed to unpack label 3
root@truenas[~]#
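Before giving up entirely, one raw check left on the clone: ZFS keeps four 256 KiB labels, two at the head and two at the tail of the partition, so something like this should show whether anything nvlist-like survives at those offsets:

SZ=$(blockdev --getsize64 /dev/sdb1)
for off in 0 262144 $((SZ - 524288)) $((SZ - 262144)); do
  echo "== label region @ $off =="
  dd if=/dev/sdb1 bs=262144 count=1 skip="$off" iflag=skip_bytes 2>/dev/null \
    | strings | grep -m3 -E 'pool_guid|vdev_tree' || echo "(nothing label-like)"
done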

I guess it’s time to say goodbye forever?