Unable to import ZFS pool due to I/O errors

Hello,

A week ago, I started ripping DVDs with my TrueNAS server. That night, I saw that 2 of my 3 drives had way too many checksum errors, about 31K each. I assumed that all of these errors were due to all three drives and the DVD ripper being installed on the same PCIe to SATA adapter. I tried to export the pool and reimport it to see if the drives still had errors (this has worked for me in the past, probably because I got lucky) but I was unable to reimport the pool. So I left it there powered off and ordered a real HBA.

Anyway, here we are now. I have a new LSI 9221-8I HBA, flashed to IT mode, installed, and ZFS still will not let me import the pool. I have tried multiple flag combinations with zpool import, such as -f, -fFXn, -a, and a few more. Every time I ran one of these commands, the terminal stopped responding. In the log, I found this warning/error message:

Pool 'bigdady' has encountered an uncorrectable I/O failure and has been suspended
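
A more cautious import attempt would be read-only, so that nothing new is written to the pool. The following is only a minimal sketch under assumptions: the pool name is taken from the log message above, the helper names `run` and `import_cmds` and the `DRY_RUN` variable are made up for illustration, and by default the script only prints the commands instead of running them. There is no guarantee that any of these steps gets past the suspended-pool state.

```shell
#!/bin/sh
# Sketch: print (or run) a cautious import sequence for a damaged pool.
# POOL is taken from the log message; DRY_RUN, run and import_cmds are
# illustrative names, not part of ZFS or TrueNAS.
POOL="${POOL:-bigdady}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, anything else = run them

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

import_cmds() {
    # List the pools ZFS can see, without importing anything.
    run zpool import
    # Check whether a rewind (-F) could make the pool importable; -n means
    # the rewind is not actually performed.
    run zpool import -f -F -n "$POOL"
    # Import read-only so that no new writes reach the pool.
    run zpool import -f -o readonly=on "$POOL"
}

import_cmds
```

Leaving `DRY_RUN` at 1 makes it possible to review the exact commands first; setting `DRY_RUN=0` would execute them as root.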

I attempted to import the pool on a fresh TrueNAS installation, but encountered the same error as before. Then I booted using a backup configuration file where the pool was already in an imported state. While the system did boot, ZFS failed to start and remained unresponsive.
To troubleshoot, I performed several test boots. I discovered the system could boot successfully with just one drive connected, likely the one without checksum errors. However, once TrueNAS loaded, that drive wasn’t recognized as part of the pool. I reconnected the other two drives, but nothing changed and the pool still wasn’t detected.

I am now running Long SMART tests on all drives, plus my spare. I will share the results as soon as they are available.
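
For reference, the long tests are started and checked per drive with smartctl. The sketch below only prints the commands; `smart_cmds` is a made-up helper name, and the device names are the ones that appear in the results further down (they can change between boots). According to those results, the drives estimate between 201 and 462 minutes for the extended test.

```shell
#!/bin/sh
# Sketch: print the smartctl commands for an extended (long) self-test.
# smart_cmds is an illustrative helper; nothing is executed on the drives.
smart_cmds() {
    action="$1"
    shift
    for dev in "$@"; do
        case "$action" in
            # Queue the extended self-test; it runs inside the drive itself.
            start) echo "smartctl -t long $dev" ;;
            # Read the self-test log once the test has had time to finish.
            log)   echo "smartctl -l selftest $dev" ;;
        esac
    done
}

# Device names as they appear in the results below; verify them before use.
smart_cmds start /dev/sdb /dev/sdc /dev/sdd /dev/sde
smart_cmds log   /dev/sdb /dev/sdc /dev/sdd /dev/sde
```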

In the meantime, I would like to know what the best next step is, or whether my data is completely corrupt. I expect that most of those checksum errors are in the DVD rips, which I can re-rip, but there is other data on this pool that would be worth recovering.

What are the possible outcomes? Is my data all screwed? Or is there possibly a way to save it at home? I could try and import it to another Linux machine, but I would like some help from people with greater knowledge.

Thanks in advance!

My Server

-ElectricEel-24.10.2.2
-Mobo Asrock Q1900m
-CPU Intel Celeron J1900
-16 GB RAM
-250 GB SanDisk Ultra boot drive
-Generic PCIe to 4x SATA adapter (now replaced by LSI 9221-8I HBA)
-Asus DVD drive
-Generic 500w PSU

  • HDD
    -1x 2 TB IronWolf
    -1x 2 TB BarraCuda
    -2x 4 TB IronWolf (1 of which is a spare)

Here are my SMART test results:

Results for 2TB IronWolf /dev/sdb
root@truenas[~]# smartctl -a /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate IronWolf
Device Model:     ST2000VN004-2E4164
Serial Number:    XXXXXXXX
LU WWN Device Id: 5 000c50 0b5b878d7
Firmware Version: SC60
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5900 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Jul  9 00:55:35 2025 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (  107) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 263) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x10bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   117   099   006    Pre-fail  Always       -       125509152
  3 Spin_Up_Time            0x0003   096   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   098   098   020    Old_age   Always       -       3042
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       4816608319
  9 Power_On_Hours          0x0032   054   054   000    Old_age   Always       -       40907
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   098   098   020    Old_age   Always       -       3065
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   001   001   000    Old_age   Always       -       1880
190 Airflow_Temperature_Cel 0x0022   077   046   045    Old_age   Always       -       23 (Min/Max 23/25)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   099   000    Old_age   Always       -       1945
193 Load_Cycle_Count        0x0032   059   059   000    Old_age   Always       -       83678
194 Temperature_Celsius     0x0022   023   054   000    Old_age   Always       -       23 (0 15 0 0 0)
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     40903         -
# 2  Short offline       Completed without error       00%     40896         -
# 3  Extended offline    Aborted by host               90%     40896         -
# 4  Short offline       Completed without error       00%     40896         -
# 5  Extended offline    Interrupted (host reset)      00%     40896         -
# 6  Short offline       Completed without error       00%     28340         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Results for 2TB BarraCuda /dev/sdc
root@truenas[~]# smartctl -a /dev/sdc
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Seagate BarraCuda 3.5 (SMR)
Device Model:     ST2000DM008-2FR102
Serial Number:    XXXXXXXX
LU WWN Device Id: 5 000c50 0c8e69d03
Firmware Version: 0001
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    7200 rpm
Form Factor:      3.5 inches
TRIM Command:     Available
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Jul  9 01:01:21 2025 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 201) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x30a5) SCT Status supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   080   064   006    Pre-fail  Always       -       90001377
  3 Spin_Up_Time            0x0003   098   097   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       895
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   092   060   045    Pre-fail  Always       -       1462908512
  9 Power_On_Hours          0x0032   087   087   000    Old_age   Always       -       11685h+20m+38.504s
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       315
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0 0 0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   073   055   040    Old_age   Always       -       27 (Min/Max 26/29)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       501
193 Load_Cycle_Count        0x0032   097   097   000    Old_age   Always       -       6816
194 Temperature_Celsius     0x0022   027   045   000    Old_age   Always       -       27 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   080   064   000    Old_age   Always       -       90001377
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       11052h+27m+52.930s
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       16686157192
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       35231888580

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     11680         -
# 2  Short offline       Completed without error       00%     11673         -
# 3  Extended offline    Aborted by host               90%     11673         -
# 4  Extended offline    Interrupted (host reset)      00%     11673         -
# 5  Short offline       Aborted by host               70%     11668         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Results for 4TB IronWolf /dev/sdd
root@truenas[~]# smartctl -a /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ST4000VN006-3CW104
Serial Number:    XXXXXXXX
LU WWN Device Id: 5 000c50 0e8d884ce
Firmware Version: SC60
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database 7.3/5528
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Jul  9 01:04:14 2025 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 459) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x70bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   079   064   006    Pre-fail  Always       -       84149798
  3 Spin_Up_Time            0x0003   096   096   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       531
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   074   060   045    Pre-fail  Always       -       25577349
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       3014
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       530
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   074   059   040    Old_age   Always       -       26 (Min/Max 25/27)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       621
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       652
194 Temperature_Celsius     0x0022   026   041   000    Old_age   Always       -       26 (0 22 0 0 0)
195 Hardware_ECC_Recovered  0x001a   079   064   000    Old_age   Always       -       84149798
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       2968 (240 132 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       3920999960
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       64340198574

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%      3013         -
# 2  Short offline       Completed without error       00%      3002         -
# 3  Extended offline    Aborted by host               90%      3002         -
# 4  Short offline       Completed without error       00%      3002         -
# 5  Extended offline    Interrupted (host reset)      00%      3002         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Results for boot pool
root@truenas[~]# smartctl -a /dev/sda
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Marvell based SanDisk SSDs
Device Model:     SanDisk SDSSDHII240G
Serial Number:    XXXXXXXXXXXX
LU WWN Device Id: 5 001b44 e5f98ef7f
Firmware Version: X31200RL
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
TRIM Command:     Available, deterministic, zeroed
Device is:        In smartctl database 7.3/5528
ATA Version is:   ACS-2 T13/2015-D revision 3
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Wed Jul  9 01:05:50 2025 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 4
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0032   100   100   ---    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   253   100   ---    Old_age   Always       -       12706
 12 Power_Cycle_Count       0x0032   100   100   ---    Old_age   Always       -       3965
165 Total_Write/Erase_Count 0x0032   100   100   ---    Old_age   Always       -       928004181836
166 Min_W/E_Cycle           0x0032   100   100   ---    Old_age   Always       -       30
167 Min_Bad_Block/Die       0x0032   100   100   ---    Old_age   Always       -       31
168 Maximum_Erase_Cycle     0x0032   100   100   ---    Old_age   Always       -       169
169 Total_Bad_Block         0x0032   100   100   ---    Old_age   Always       -       0
171 Program_Fail_Count      0x0032   100   100   ---    Old_age   Always       -       0
172 Erase_Fail_Count        0x0032   100   100   ---    Old_age   Always       -       0
173 Avg_Write/Erase_Count   0x0032   100   100   ---    Old_age   Always       -       105
174 Unexpect_Power_Loss_Ct  0x0032   100   100   ---    Old_age   Always       -       246
187 Reported_Uncorrect      0x0032   100   100   ---    Old_age   Always       -       0
194 Temperature_Celsius     0x0022   074   046   ---    Old_age   Always       -       26 (Min/Max 13/46)
199 SATA_CRC_Error          0x0032   100   100   ---    Old_age   Always       -       0
230 Perc_Write/Erase_Count  0x0032   100   100   ---    Old_age   Always       -       1541 5376 5376
232 Perc_Avail_Resrvd_Space 0x0033   100   100   004    Pre-fail  Always       -       100
233 Total_NAND_Writes_GiB   0x0032   100   100   ---    Old_age   Always       -       24519
234 Perc_Write/Erase_Ct_BC  0x0032   100   100   ---    Old_age   Always       -       41212
241 Total_Writes_GiB        0x0030   253   253   ---    Old_age   Offline      -       17214
242 Total_Reads_GiB         0x0030   253   253   ---    Old_age   Offline      -       7576
244 Thermal_Throttle        0x0032   000   100   ---    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     12705         -

Selective Self-tests/Logging not supported

Results for spare
root@truenas[~]# smartctl -a /dev/sde
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.6.44-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     ST4000VN006-3CW104
Serial Number:    XXXXXXXX
LU WWN Device Id: 5 000c50 0fa4d83ea
Firmware Version: SC60
User Capacity:    4,000,787,030,016 bytes [4.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        Not in smartctl database 7.3/5528
ATA Version is:   ACS-3 T13/2161-D revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Jul  9 01:07:34 2025 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever 
                                        been run.
Total time to complete Offline 
data collection:                (    0) seconds.
Offline data collection
capabilities:                    (0x73) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   1) minutes.
Extended self-test routine
recommended polling time:        ( 462) minutes.
Conveyance self-test routine
recommended polling time:        (   2) minutes.
SCT capabilities:              (0x70bd) SCT Status supported.
                                        SCT Error Recovery Control supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   006    Pre-fail  Always       -       114464
  3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       8
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   063   060   045    Pre-fail  Always       -       2139589
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       -       8
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       8
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   075   073   040    Old_age   Always       -       25 (Min/Max 22/27)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       7
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       9
194 Temperature_Celsius     0x0022   025   040   000    Old_age   Always       -       25 (0 22 0 0 0)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       114464
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       8 (20 204 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       0
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       114464

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         7         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

Also, after the SMART tests finished, I ran zpool import -f -F -m bigdady, and it returned the same error as before:

WARNING: Pool 'bigdady' has encountered an uncorrectable I/O failure and has been suspended.

Followed by:

INFO: task middlewared (wo:17395 blocked for more than 120 seconds.
[27068.047550]       Tainted: P           OE      6.6.44-production+truenas #1
[27068.048757] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27068.050023] task:middlewared (wo state:D stack:0     pid:17395 ppid:935    flags:0x00000002
[27068.050039] Call Trace:
[27068.050044]  <TASK>
[27068.050052]  __schedule+0x349/0x950
[27068.050071]  schedule+0x5b/0xa0
[27068.050082]  cv_wait_common+0xf0/0x130 [spl]
[27068.050144]  ? __pfx_autoremove_wake_function+0x10/0x10
[27068.050158]  spa_lookup+0x51/0x140 [zfs]
[27068.051282]  spa_open_common+0x79/0x440 [zfs]
[27068.052404]  spa_get_stats+0x4e/0x210 [zfs]
[27068.053523]  ? spl_kmem_alloc_impl+0xb4/0xf0 [spl]
[27068.053591]  zfs_ioc_pool_stats+0x40/0x90 [zfs]
[27068.054718]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27068.055806]  ? __kmalloc_node+0xc6/0x150
[27068.055825]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27068.056862]  __x64_sys_ioctl+0x97/0xd0
[27068.056876]  do_syscall_64+0x59/0xb0
[27068.056886]  ? do_syscall_64+0x65/0xb0
[27068.056894]  ? syscall_exit_to_user_mode+0x22/0x40
[27068.056903]  ? do_syscall_64+0x65/0xb0
[27068.056911]  ? iterate_dir+0x118/0x170
[27068.056919]  ? __x64_sys_getdents64+0x10a/0x130
[27068.056927]  ? __pfx_filldir64+0x10/0x10
[27068.056936]  ? syscall_exit_to_user_mode+0x22/0x40
[27068.056944]  ? do_syscall_64+0x65/0xb0
[27068.056952]  ? syscall_exit_to_user_mode+0x22/0x40
[27068.056961]  ? do_syscall_64+0x65/0xb0
[27068.056968]  ? syscall_exit_to_user_mode+0x22/0x40
[27068.056976]  ? do_syscall_64+0x65/0xb0
[27068.056983]  ? __irq_exit_rcu+0x3b/0xc0
[27068.056995]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27068.057005] RIP: 0033:0x7f2af91bfc5b
[27068.057015] RSP: 002b:00007ffc63e02fa0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27068.057026] RAX: ffffffffffffffda RBX: 0000000005459810 RCX: 00007f2af91bfc5b
[27068.057031] RDX: 00007ffc63e03020 RSI: 0000000000005a05 RDI: 000000000000001d
[27068.057036] RBP: 00007ffc63e06610 R08: 00007f2af92953d0 R09: 00007f2af92953d0
[27068.057041] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc63e03020
[27068.057046] R13: 0000000005459810 R14: 00000000058da640 R15: 00007ffc63e06624
[27068.057056]  </TASK>
[27068.057070] INFO: task zpool:18307 blocked for more than 120 seconds.
[27068.058401]       Tainted: P           OE      6.6.44-production+truenas #1
[27068.059698] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27068.061022] task:zpool           state:D stack:0     pid:18307 ppid:1      flags:0x00004006
[27068.061035] Call Trace:
[27068.061039]  <TASK>
[27068.061046]  __schedule+0x349/0x950
[27068.061059]  schedule+0x5b/0xa0
[27068.061067]  io_schedule+0x46/0x70
[27068.061077]  cv_wait_common+0xaa/0x130 [spl]
[27068.061138]  ? __pfx_autoremove_wake_function+0x10/0x10
[27068.061150]  txg_wait_synced_impl+0xc0/0x110 [zfs]
[27068.062297]  txg_wait_synced+0x10/0x40 [zfs]
[27068.063414]  spa_load_impl.constprop.0+0x3ff/0x5b0 [zfs]
[27068.064535]  spa_load+0x73/0x120 [zfs]
[27068.065653]  spa_load_best+0x54/0x250 [zfs]
[27068.066819]  spa_import+0x231/0x690 [zfs]
[27068.067956]  zfs_ioc_pool_import+0x15b/0x180 [zfs]
[27068.069036]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27068.070159]  ? __kmalloc_node+0xc6/0x150
[27068.070180]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27068.071155]  __x64_sys_ioctl+0x97/0xd0
[27068.071170]  do_syscall_64+0x59/0xb0
[27068.071180]  ? syscall_exit_to_user_mode+0x22/0x40
[27068.071188]  ? do_syscall_64+0x65/0xb0
[27068.071196]  ? __irq_exit_rcu+0x3b/0xc0
[27068.071207]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27068.071217] RIP: 0033:0x7fe57247cc5b
[27068.071226] RSP: 002b:00007ffcd7eb4cf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27068.071234] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe57247cc5b
[27068.071239] RDX: 00007ffcd7eb4db0 RSI: 0000000000005a02 RDI: 0000000000000003
[27068.071244] RBP: 00007ffcd7eb94a0 R08: 00007fe572552440 R09: 00007fe572552440
[27068.071248] R10: 0000000000000000 R11: 0000000000000246 R12: 0000558b46d82dd0
[27068.071253] R13: 00007ffcd7eb4db0 R14: 00007fe558003df0 R15: 0000558b46e266c0
[27068.071262]  </TASK>
[27068.071281] INFO: task txg_sync:18443 blocked for more than 120 seconds.
[27068.072527]       Tainted: P           OE      6.6.44-production+truenas #1
[27068.073761] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27068.075037] task:txg_sync        state:D stack:0     pid:18443 ppid:2      flags:0x00004000
[27068.075050] Call Trace:
[27068.075054]  <TASK>
[27068.075060]  __schedule+0x349/0x950
[27068.075072]  schedule+0x5b/0xa0
[27068.075079]  schedule_timeout+0x98/0x160
[27068.075089]  ? __pfx_process_timeout+0x10/0x10
[27068.075098]  io_schedule_timeout+0x50/0x80
[27068.075108]  __cv_timedwait_common+0x12a/0x160 [spl]
[27068.075163]  ? __pfx_autoremove_wake_function+0x10/0x10
[27068.075173]  __cv_timedwait_io+0x19/0x20 [spl]
[27068.075225]  zio_wait+0x124/0x240 [zfs]
[27068.076083]  dbuf_read+0x462/0x510 [zfs]
[27068.076934]  dmu_buf_will_dirty_impl+0x7c/0x1b0 [zfs]
[27068.077784]  dmu_write_impl+0x47/0xe0 [zfs]
[27068.078680]  dmu_write+0xb6/0x110 [zfs]
[27068.079514]  space_map_write_intro_debug+0xaf/0xe0 [zfs]
[27068.080319]  space_map_write_impl+0x54/0x250 [zfs]
[27068.081121]  ? dbuf_find_dirty_eq+0x9/0x20 [zfs]
[27068.081944]  space_map_write+0x9e/0x190 [zfs]
[27068.082702]  metaslab_flush+0xf1/0x330 [zfs]
[27068.083442]  ? spa_estimate_metaslabs_to_flush+0x108/0x130 [zfs]
[27068.084171]  spa_flush_metaslabs+0x152/0x210 [zfs]
[27068.084898]  spa_sync_iterate_to_convergence+0x153/0x200 [zfs]
[27068.085676]  spa_sync+0x30a/0x600 [zfs]
[27068.086467]  txg_sync_thread+0x1ec/0x270 [zfs]
[27068.087175]  ? __pfx_txg_sync_thread+0x10/0x10 [zfs]
[27068.087839]  ? __pfx_thread_generic_wrapper+0x10/0x10 [spl]
[27068.087878]  thread_generic_wrapper+0x5e/0x70 [spl]
[27068.087914]  kthread+0xe8/0x120
[27068.087923]  ? __pfx_kthread+0x10/0x10
[27068.087927]  ret_from_fork+0x34/0x50
[27068.087934]  ? __pfx_kthread+0x10/0x10
[27068.087938]  ret_from_fork_asm+0x1b/0x30
[27068.087945]  </TASK>
[27188.878748] INFO: task middlewared (wo:17395 blocked for more than 241 seconds.
[27188.879946]       Tainted: P           OE      6.6.44-production+truenas #1
[27188.881149] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27188.882417] task:middlewared (wo state:D stack:0     pid:17395 ppid:935    flags:0x00000002
[27188.882432] Call Trace:
[27188.882437]  <TASK>
[27188.882446]  __schedule+0x349/0x950
[27188.882465]  schedule+0x5b/0xa0
[27188.882477]  cv_wait_common+0xf0/0x130 [spl]
[27188.882539]  ? __pfx_autoremove_wake_function+0x10/0x10
[27188.882552]  spa_lookup+0x51/0x140 [zfs]
[27188.883680]  spa_open_common+0x79/0x440 [zfs]
[27188.884803]  spa_get_stats+0x4e/0x210 [zfs]
[27188.885922]  ? spl_kmem_alloc_impl+0xb4/0xf0 [spl]
[27188.885989]  zfs_ioc_pool_stats+0x40/0x90 [zfs]
[27188.887116]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27188.888209]  ? __kmalloc_node+0xc6/0x150
[27188.888226]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27188.889265]  __x64_sys_ioctl+0x97/0xd0
[27188.889278]  do_syscall_64+0x59/0xb0
[27188.889288]  ? do_syscall_64+0x65/0xb0
[27188.889295]  ? syscall_exit_to_user_mode+0x22/0x40
[27188.889304]  ? do_syscall_64+0x65/0xb0
[27188.889312]  ? iterate_dir+0x118/0x170
[27188.889320]  ? __x64_sys_getdents64+0x10a/0x130
[27188.889328]  ? __pfx_filldir64+0x10/0x10
[27188.889336]  ? syscall_exit_to_user_mode+0x22/0x40
[27188.889345]  ? do_syscall_64+0x65/0xb0
[27188.889353]  ? syscall_exit_to_user_mode+0x22/0x40
[27188.889362]  ? do_syscall_64+0x65/0xb0
[27188.889369]  ? syscall_exit_to_user_mode+0x22/0x40
[27188.889378]  ? do_syscall_64+0x65/0xb0
[27188.889385]  ? __irq_exit_rcu+0x3b/0xc0
[27188.889396]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27188.889408] RIP: 0033:0x7f2af91bfc5b
[27188.889417] RSP: 002b:00007ffc63e02fa0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27188.889428] RAX: ffffffffffffffda RBX: 0000000005459810 RCX: 00007f2af91bfc5b
[27188.889433] RDX: 00007ffc63e03020 RSI: 0000000000005a05 RDI: 000000000000001d
[27188.889438] RBP: 00007ffc63e06610 R08: 00007f2af92953d0 R09: 00007f2af92953d0
[27188.889443] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc63e03020
[27188.889448] R13: 0000000005459810 R14: 00000000058da640 R15: 00007ffc63e06624
[27188.889458]  </TASK>
[27188.889471] INFO: task zpool:18307 blocked for more than 241 seconds.
[27188.890777]       Tainted: P           OE      6.6.44-production+truenas #1
[27188.892075] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27188.893401] task:zpool           state:D stack:0     pid:18307 ppid:1      flags:0x00004006
[27188.893414] Call Trace:
[27188.893419]  <TASK>
[27188.893425]  __schedule+0x349/0x950
[27188.893439]  schedule+0x5b/0xa0
[27188.893447]  io_schedule+0x46/0x70
[27188.893457]  cv_wait_common+0xaa/0x130 [spl]
[27188.893519]  ? __pfx_autoremove_wake_function+0x10/0x10
[27188.893531]  txg_wait_synced_impl+0xc0/0x110 [zfs]
[27188.894678]  txg_wait_synced+0x10/0x40 [zfs]
[27188.895790]  spa_load_impl.constprop.0+0x3ff/0x5b0 [zfs]
[27188.896910]  spa_load+0x73/0x120 [zfs]
[27188.898028]  spa_load_best+0x54/0x250 [zfs]
[27188.899193]  spa_import+0x231/0x690 [zfs]
[27188.900331]  zfs_ioc_pool_import+0x15b/0x180 [zfs]
[27188.901409]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27188.902531]  ? __kmalloc_node+0xc6/0x150
[27188.902551]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27188.903528]  __x64_sys_ioctl+0x97/0xd0
[27188.903542]  do_syscall_64+0x59/0xb0
[27188.903552]  ? syscall_exit_to_user_mode+0x22/0x40
[27188.903560]  ? do_syscall_64+0x65/0xb0
[27188.903568]  ? __irq_exit_rcu+0x3b/0xc0
[27188.903578]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27188.903588] RIP: 0033:0x7fe57247cc5b
[27188.903598] RSP: 002b:00007ffcd7eb4cf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27188.903607] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe57247cc5b
[27188.903612] RDX: 00007ffcd7eb4db0 RSI: 0000000000005a02 RDI: 0000000000000003
[27188.903616] RBP: 00007ffcd7eb94a0 R08: 00007fe572552440 R09: 00007fe572552440
[27188.903621] R10: 0000000000000000 R11: 0000000000000246 R12: 0000558b46d82dd0
[27188.903625] R13: 00007ffcd7eb4db0 R14: 00007fe558003df0 R15: 0000558b46e266c0
[27188.903634]  </TASK>
[27188.903658] INFO: task zpool:18510 blocked for more than 120 seconds.
[27188.904894]       Tainted: P           OE      6.6.44-production+truenas #1
[27188.906111] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27188.907379] task:zpool           state:D stack:0     pid:18510 ppid:1      flags:0x00000006
[27188.907392] Call Trace:
[27188.907396]  <TASK>
[27188.907402]  __schedule+0x349/0x950
[27188.907415]  schedule+0x5b/0xa0
[27188.907425]  cv_wait_common+0xf0/0x130 [spl]
[27188.907481]  ? __pfx_autoremove_wake_function+0x10/0x10
[27188.907492]  spa_lookup+0x51/0x140 [zfs]
[27188.908401]  spa_open_common+0x79/0x440 [zfs]
[27188.909283]  spa_get_stats+0x4e/0x210 [zfs]
[27188.910162]  ? spl_kmem_alloc_impl+0xb4/0xf0 [spl]
[27188.910215]  zfs_ioc_pool_stats+0x40/0x90 [zfs]
[27188.911105]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27188.911923]  ? __kmalloc_node+0xc6/0x150
[27188.911937]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27188.912691]  __x64_sys_ioctl+0x97/0xd0
[27188.912702]  do_syscall_64+0x59/0xb0
[27188.912711]  ? __slab_free.constprop.0+0xe2/0x280
[27188.912722]  ? __alloc_pages+0x1a1/0x350
[27188.912729]  ? __mod_memcg_lruvec_state+0x4e/0xa0
[27188.912737]  ? __mod_lruvec_page_state+0x97/0x130
[27188.912742]  ? folio_add_new_anon_rmap+0x45/0xe0
[27188.912749]  ? set_ptes.constprop.0+0x1e/0xa0
[27188.912755]  ? do_anonymous_page+0x35d/0x410
[27188.912761]  ? __handle_mm_fault+0xbf1/0xd90
[27188.912769]  ? __count_memcg_events+0x4d/0x90
[27188.912774]  ? count_memcg_events.constprop.0+0x1a/0x30
[27188.912782]  ? handle_mm_fault+0xa2/0x370
[27188.912788]  ? do_user_addr_fault+0x323/0x630
[27188.912795]  ? exc_page_fault+0x77/0x170
[27188.912801]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27188.912809] RIP: 0033:0x7fd7ee96fc5b
[27188.912817] RSP: 002b:00007ffda3c974f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27188.912823] RAX: ffffffffffffffda RBX: 000055cc35e38410 RCX: 00007fd7ee96fc5b
[27188.912827] RDX: 00007ffda3c97570 RSI: 0000000000005a05 RDI: 0000000000000003
[27188.912830] RBP: 00007ffda3c9ab60 R08: 000055cc35e3cd50 R09: 00007fd7eea44d10
[27188.912834] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffda3c97570
[27188.912837] R13: 000055cc35e38410 R14: 000055cc35e31dd0 R15: 00007ffda3c9ab84
[27188.912844]  </TASK>
[27309.710999] INFO: task middlewared (wo:17395 blocked for more than 362 seconds.
[27309.712205]       Tainted: P           OE      6.6.44-production+truenas #1
[27309.713419] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27309.714681] task:middlewared (wo state:D stack:0     pid:17395 ppid:935    flags:0x00000002
[27309.714696] Call Trace:
[27309.714701]  <TASK>
[27309.714710]  __schedule+0x349/0x950
[27309.714729]  schedule+0x5b/0xa0
[27309.714740]  cv_wait_common+0xf0/0x130 [spl]
[27309.714803]  ? __pfx_autoremove_wake_function+0x10/0x10
[27309.714816]  spa_lookup+0x51/0x140 [zfs]
[27309.715940]  spa_open_common+0x79/0x440 [zfs]
[27309.717061]  spa_get_stats+0x4e/0x210 [zfs]
[27309.718178]  ? spl_kmem_alloc_impl+0xb4/0xf0 [spl]
[27309.718246]  zfs_ioc_pool_stats+0x40/0x90 [zfs]
[27309.719384]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27309.720475]  ? __kmalloc_node+0xc6/0x150
[27309.720492]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27309.721531]  __x64_sys_ioctl+0x97/0xd0
[27309.721544]  do_syscall_64+0x59/0xb0
[27309.721554]  ? do_syscall_64+0x65/0xb0
[27309.721562]  ? syscall_exit_to_user_mode+0x22/0x40
[27309.721571]  ? do_syscall_64+0x65/0xb0
[27309.721578]  ? iterate_dir+0x118/0x170
[27309.721587]  ? __x64_sys_getdents64+0x10a/0x130
[27309.721594]  ? __pfx_filldir64+0x10/0x10
[27309.721603]  ? syscall_exit_to_user_mode+0x22/0x40
[27309.721611]  ? do_syscall_64+0x65/0xb0
[27309.721620]  ? syscall_exit_to_user_mode+0x22/0x40
[27309.721628]  ? do_syscall_64+0x65/0xb0
[27309.721636]  ? syscall_exit_to_user_mode+0x22/0x40
[27309.721644]  ? do_syscall_64+0x65/0xb0
[27309.721651]  ? __irq_exit_rcu+0x3b/0xc0
[27309.721662]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27309.721674] RIP: 0033:0x7f2af91bfc5b
[27309.721684] RSP: 002b:00007ffc63e02fa0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27309.721694] RAX: ffffffffffffffda RBX: 0000000005459810 RCX: 00007f2af91bfc5b
[27309.721700] RDX: 00007ffc63e03020 RSI: 0000000000005a05 RDI: 000000000000001d
[27309.721705] RBP: 00007ffc63e06610 R08: 00007f2af92953d0 R09: 00007f2af92953d0
[27309.721710] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc63e03020
[27309.721714] R13: 0000000005459810 R14: 00000000058da640 R15: 00007ffc63e06624
[27309.721725]  </TASK>
[27309.721738] INFO: task zpool:18307 blocked for more than 362 seconds.
[27309.723050]       Tainted: P           OE      6.6.44-production+truenas #1
[27309.724353] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27309.725665] task:zpool           state:D stack:0     pid:18307 ppid:1      flags:0x00004006
[27309.725679] Call Trace:
[27309.725683]  <TASK>
[27309.725689]  __schedule+0x349/0x950
[27309.725703]  schedule+0x5b/0xa0
[27309.725711]  io_schedule+0x46/0x70
[27309.725721]  cv_wait_common+0xaa/0x130 [spl]
[27309.725782]  ? __pfx_autoremove_wake_function+0x10/0x10
[27309.725794]  txg_wait_synced_impl+0xc0/0x110 [zfs]
[27309.726940]  txg_wait_synced+0x10/0x40 [zfs]
[27309.727928]  spa_load_impl.constprop.0+0x3ff/0x5b0 [zfs]
[27309.728887]  spa_load+0x73/0x120 [zfs]
[27309.729848]  spa_load_best+0x54/0x250 [zfs]
[27309.730845]  spa_import+0x231/0x690 [zfs]
[27309.731708]  zfs_ioc_pool_import+0x15b/0x180 [zfs]
[27309.732491]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27309.733269]  ? __kmalloc_node+0xc6/0x150
[27309.733282]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27309.734130]  __x64_sys_ioctl+0x97/0xd0
[27309.734152]  do_syscall_64+0x59/0xb0
[27309.734161]  ? syscall_exit_to_user_mode+0x22/0x40
[27309.734168]  ? do_syscall_64+0x65/0xb0
[27309.734174]  ? __irq_exit_rcu+0x3b/0xc0
[27309.734183]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27309.734191] RIP: 0033:0x7fe57247cc5b
[27309.734198] RSP: 002b:00007ffcd7eb4cf0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27309.734204] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fe57247cc5b
[27309.734208] RDX: 00007ffcd7eb4db0 RSI: 0000000000005a02 RDI: 0000000000000003
[27309.734212] RBP: 00007ffcd7eb94a0 R08: 00007fe572552440 R09: 00007fe572552440
[27309.734215] R10: 0000000000000000 R11: 0000000000000246 R12: 0000558b46d82dd0
[27309.734219] R13: 00007ffcd7eb4db0 R14: 00007fe558003df0 R15: 0000558b46e266c0
[27309.734225]  </TASK>
[27309.734245] INFO: task zpool:18510 blocked for more than 241 seconds.
[27309.735313]       Tainted: P           OE      6.6.44-production+truenas #1
[27309.736257] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27309.737193] task:zpool           state:D stack:0     pid:18510 ppid:1      flags:0x00000006
[27309.737200] Call Trace:
[27309.737203]  <TASK>
[27309.737206]  __schedule+0x349/0x950
[27309.737214]  schedule+0x5b/0xa0
[27309.737221]  cv_wait_common+0xf0/0x130 [spl]
[27309.737259]  ? __pfx_autoremove_wake_function+0x10/0x10
[27309.737266]  spa_lookup+0x51/0x140 [zfs]
[27309.737963]  spa_open_common+0x79/0x440 [zfs]
[27309.738689]  spa_get_stats+0x4e/0x210 [zfs]
[27309.739394]  ? spl_kmem_alloc_impl+0xb4/0xf0 [spl]
[27309.739434]  zfs_ioc_pool_stats+0x40/0x90 [zfs]
[27309.740085]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27309.740743]  ? __kmalloc_node+0xc6/0x150
[27309.740755]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27309.741409]  __x64_sys_ioctl+0x97/0xd0
[27309.741419]  do_syscall_64+0x59/0xb0
[27309.741426]  ? __slab_free.constprop.0+0xe2/0x280
[27309.741435]  ? __alloc_pages+0x1a1/0x350
[27309.741441]  ? __mod_memcg_lruvec_state+0x4e/0xa0
[27309.741447]  ? __mod_lruvec_page_state+0x97/0x130
[27309.741451]  ? folio_add_new_anon_rmap+0x45/0xe0
[27309.741456]  ? set_ptes.constprop.0+0x1e/0xa0
[27309.741461]  ? do_anonymous_page+0x35d/0x410
[27309.741466]  ? __handle_mm_fault+0xbf1/0xd90
[27309.741473]  ? __count_memcg_events+0x4d/0x90
[27309.741477]  ? count_memcg_events.constprop.0+0x1a/0x30
[27309.741483]  ? handle_mm_fault+0xa2/0x370
[27309.741488]  ? do_user_addr_fault+0x323/0x630
[27309.741493]  ? exc_page_fault+0x77/0x170
[27309.741499]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27309.741505] RIP: 0033:0x7fd7ee96fc5b
[27309.741511] RSP: 002b:00007ffda3c974f0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27309.741517] RAX: ffffffffffffffda RBX: 000055cc35e38410 RCX: 00007fd7ee96fc5b
[27309.741520] RDX: 00007ffda3c97570 RSI: 0000000000005a05 RDI: 0000000000000003
[27309.741522] RBP: 00007ffda3c9ab60 R08: 000055cc35e3cd50 R09: 00007fd7eea44d10
[27309.741525] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffda3c97570
[27309.741528] R13: 000055cc35e38410 R14: 000055cc35e31dd0 R15: 00007ffda3c9ab84
[27309.741533]  </TASK>
[27430.543730] INFO: task middlewared (wo:17395 blocked for more than 483 seconds.
[27430.544921]       Tainted: P           OE      6.6.44-production+truenas #1
[27430.546111] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[27430.547364] task:middlewared (wo state:D stack:0     pid:17395 ppid:935    flags:0x00000002
[27430.547380] Call Trace:
[27430.547385]  <TASK>
[27430.547393]  __schedule+0x349/0x950
[27430.547413]  schedule+0x5b/0xa0
[27430.547424]  cv_wait_common+0xf0/0x130 [spl]
[27430.547486]  ? __pfx_autoremove_wake_function+0x10/0x10
[27430.547500]  spa_lookup+0x51/0x140 [zfs]
[27430.548646]  spa_open_common+0x79/0x440 [zfs]
[27430.549787]  spa_get_stats+0x4e/0x210 [zfs]
[27430.550907]  ? spl_kmem_alloc_impl+0xb4/0xf0 [spl]
[27430.550975]  zfs_ioc_pool_stats+0x40/0x90 [zfs]
[27430.552118]  zfsdev_ioctl_common+0x680/0x790 [zfs]
[27430.553230]  ? __kmalloc_node+0xc6/0x150
[27430.553246]  zfsdev_ioctl+0x53/0xe0 [zfs]
[27430.554289]  __x64_sys_ioctl+0x97/0xd0
[27430.554304]  do_syscall_64+0x59/0xb0
[27430.554313]  ? do_syscall_64+0x65/0xb0
[27430.554321]  ? syscall_exit_to_user_mode+0x22/0x40
[27430.554330]  ? do_syscall_64+0x65/0xb0
[27430.554337]  ? iterate_dir+0x118/0x170
[27430.554346]  ? __x64_sys_getdents64+0x10a/0x130
[27430.554353]  ? __pfx_filldir64+0x10/0x10
[27430.554362]  ? syscall_exit_to_user_mode+0x22/0x40
[27430.554370]  ? do_syscall_64+0x65/0xb0
[27430.554379]  ? syscall_exit_to_user_mode+0x22/0x40
[27430.554387]  ? do_syscall_64+0x65/0xb0
[27430.554394]  ? syscall_exit_to_user_mode+0x22/0x40
[27430.554403]  ? do_syscall_64+0x65/0xb0
[27430.554410]  ? __irq_exit_rcu+0x3b/0xc0
[27430.554421]  entry_SYSCALL_64_after_hwframe+0x78/0xe2
[27430.554432] RIP: 0033:0x7f2af91bfc5b
[27430.554442] RSP: 002b:00007ffc63e02fa0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[27430.554452] RAX: ffffffffffffffda RBX: 0000000005459810 RCX: 00007f2af91bfc5b
[27430.554458] RDX: 00007ffc63e03020 RSI: 0000000000005a05 RDI: 000000000000001d
[27430.554463] RBP: 00007ffc63e06610 R08: 00007f2af92953d0 R09: 00007f2af92953d0
[27430.554468] R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffc63e03020
[27430.554473] R13: 0000000005459810 R14: 00000000058da640 R15: 00007ffc63e06624
[27430.554483]  </TASK>
[27430.554487] Future hung task reports are suppressed, see sysctl kernel.hung_task_warnings
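
The trace above repeats the same few tasks every two minutes, so it can be easier to read as a summary. Below is a minimal Python sketch that extracts which tasks the kernel reported as hung and the longest blocked time for each. The function name `summarize_hung_tasks` and the regular expression are my own illustration, not part of any TrueNAS or ZFS tool, and the sample lines are copied from the log above.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   [27068.057070] INFO: task zpool:18307 blocked for more than 120 seconds.
# The task name may contain spaces or "(" because the kernel truncates it.
HUNG_RE = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

def summarize_hung_tasks(log_text):
    """Return {(task_name, pid): longest reported blocked time in seconds}."""
    longest = defaultdict(int)
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            key = (m.group("name"), int(m.group("pid")))
            longest[key] = max(longest[key], int(m.group("secs")))
    return dict(longest)

# A few header lines taken from the log in this post.
sample = """\
INFO: task middlewared (wo:17395 blocked for more than 120 seconds.
[27068.057070] INFO: task zpool:18307 blocked for more than 120 seconds.
[27068.071281] INFO: task txg_sync:18443 blocked for more than 120 seconds.
[27188.903658] INFO: task zpool:18510 blocked for more than 120 seconds.
[27309.721738] INFO: task zpool:18307 blocked for more than 362 seconds.
[27430.543730] INFO: task middlewared (wo:17395 blocked for more than 483 seconds.
"""

# Print one line per hung task, ordered by PID.
for (name, pid), secs in sorted(summarize_hung_tasks(sample).items(), key=lambda kv: kv[0][1]):
    print(f"{name} (pid {pid}): blocked > {secs}s")
```

Read this way, the log shows the import itself (zpool, pid 18307) waiting in txg_wait_synced, the txg_sync thread waiting on a read (zio_wait), and middlewared plus a second zpool process queued behind them in spa_lookup.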

First of all - those Barracuda drives are SMR rather than CMR drives. Not good.
Secondly - you were using a PCIe to SATA adapter, which is also almost certainly a bad idea (some are OK, most are not).

Personally I think you have lost the pool, and I guess from what you are asking that you don’t have a backup.

The drives themselves look OK (other than the two SMR ones, which should be not so carefully filed under junk in the round filing cabinet)
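
For anyone wanting to repeat that "look OK" check on their own smartctl attribute table, here is a rough Python sketch of the reasoning. It flags any attribute whose normalized VALUE or WORST has reached its THRESH, and any non-zero raw count on a handful of attributes that are commonly watched. The function name `check_smart` and the `WATCHED_RAW` list are my own assumptions for illustration, not a smartctl feature, and the sample rows are copied from the output posted above.

```python
# Attribute IDs whose raw count is commonly watched for failing media or
# cabling. The choice of this list is an assumption, not a smartctl rule.
WATCHED_RAW = {5, 187, 197, 198, 199}

def check_smart(table_text):
    """Return a list of warnings for a smartctl attribute table."""
    warnings = []
    for line in table_text.splitlines():
        parts = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(parts) < 10 or not parts[0].isdigit():
            continue
        attr_id, name = int(parts[0]), parts[1]
        value, worst, thresh = int(parts[3]), int(parts[4]), int(parts[5])
        raw = parts[9]
        # A threshold of 0 means the attribute can never trip a failure.
        if thresh > 0 and min(value, worst) <= thresh:
            warnings.append(
                f"{name}: value {value} (worst {worst}) at or below threshold {thresh}"
            )
        if attr_id in WATCHED_RAW and raw.isdigit() and int(raw) > 0:
            warnings.append(f"{name}: raw count {raw}")
    return warnings

# Rows copied from the SMART output in this thread.
sample = """\
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   063   060   045    Pre-fail  Always       -       2139589
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
"""

problems = check_smart(sample)
print("\n".join(problems) if problems else "no SMART warnings")
```

The large raw numbers on Seek_Error_Rate and Hardware_ECC_Recovered are deliberately not treated as error counts here. As far as I know, Seagate packs other information into those raw fields, so only the normalized value against the threshold is compared for them.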

Where did you get the LSI from, and have you flashed the correct firmware onto it?
