Hi everyone,
After a power loss, I’m unable to import one of my pools. Here is the error I get:
truenas_admin@truenas[~]$ sudo zpool import
pool: pool2
id: 3548847501627557152
state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
config:
pool2 FAULTED corrupted data
mirror-0 ONLINE
55a4702b-04cc-11eb-8e7a-a0481c79a45e ONLINE
57c996c9-04cc-11eb-8e7a-a0481c79a45e ONLINE
truenas_admin@truenas[~]$
The other two pools are OK.
Here is the output of lsblk:
truenas_admin@truenas[~]$ sudo lsblk -bo NAME,MODEL,ROTA,PTTYPE,TYPE,START,SIZE,PARTTYPENAME,PARTUUID
NAME MODEL ROTA PTTYPE TYPE START SIZE PARTTYPENAME PARTUUID
sda WDC WD30EZRX-00MMMB0 1 gpt disk 3000558944256
├─sda1 1 gpt part 128 2147483648 FreeBSD swap 3503569d-4456-11ea-b4b1-a0481c79a45e
└─sda2 1 gpt part 4194432 2998411374592 FreeBSD ZFS 35491d87-4456-11ea-b4b1-a0481c79a45e
sdb WDC WD120EMFZ-11A6JA0 1 gpt disk 12000105070592
├─sdb1 1 gpt part 128 2147483648 FreeBSD swap 55749886-04cc-11eb-8e7a-a0481c79a45e
└─sdb2 1 gpt part 4194432 11997957500928 FreeBSD ZFS 55a4702b-04cc-11eb-8e7a-a0481c79a45e
sdc WDC WD30EZRX-00MMMB0 1 gpt disk 3000558944256
├─sdc1 1 gpt part 128 2147483648 FreeBSD swap 326af7b1-4456-11ea-b4b1-a0481c79a45e
└─sdc2 1 gpt part 4194432 2998411374592 FreeBSD ZFS 32b17b6a-4456-11ea-b4b1-a0481c79a45e
sdd WDC WD120EMFZ-11A6JA0 1 gpt disk 12000105070592
├─sdd1 1 gpt part 128 2147483648 FreeBSD swap 5799ae68-04cc-11eb-8e7a-a0481c79a45e
└─sdd2 1 gpt part 4194432 11997957500928 FreeBSD ZFS 57c996c9-04cc-11eb-8e7a-a0481c79a45e
sde WDC WD80EMZZ-11B4FB0 1 gpt disk 8001529315328
├─sde1 1 gpt part 128 2147483648 FreeBSD swap f8513eae-78fc-11ee-b0f4-a0481c79a45e
└─sde2 1 gpt part 4194432 7999381745664 FreeBSD ZFS f895589c-78fc-11ee-b0f4-a0481c79a45e
sdf WDC WD80EMZZ-11B4FB0 1 gpt disk 8001529315328
├─sdf1 1 gpt part 128 2147483648 FreeBSD swap f8727f08-78fc-11ee-b0f4-a0481c79a45e
└─sdf2 1 gpt part 4194432 7999381745664 FreeBSD ZFS f8b3e3d3-78fc-11ee-b0f4-a0481c79a45e
sr0 HL-DT-ST DVD+/-RW GU90N 0 rom 1073741312
nvme0n1 PM9B1 NVMe Samsung 512GB 0 gpt disk 512110190592
├─nvme0n1p1
│ 0 gpt part 4096 1048576 BIOS boot 0020a027-5d98-4191-bcfe-0d0da7732eee
├─nvme0n1p2
│ 0 gpt part 6144 536870912 EFI System 60634f8b-c3b9-4632-a792-19ad460b1190
└─nvme0n1p3
0 gpt part 1054720 511570157056 Solaris /usr & Apple ZFS 07df006c-fe15-427d-b25f-ac41e4c7dcdf
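For reference, this is roughly how the two pool2 member PARTUUIDs from the zpool import output can be matched to device names. It is only a sketch: `find_part` is a hypothetical helper name, and it assumes two space-separated columns per line, as printed by `lsblk -rno NAME,PARTUUID`.

```shell
#!/bin/sh
# find_part: hypothetical helper (not part of any tool).
# Reads "NAME PARTUUID" pairs on stdin and prints the NAME whose
# PARTUUID equals the first argument.
find_part() {
    awk -v want="$1" '$2 == want { print $1 }'
}

# Intended use on the live system (assumption, not run here):
#   lsblk -rno NAME,PARTUUID | find_part 55a4702b-04cc-11eb-8e7a-a0481c79a45e
```

Going by the lsblk table above, the two members are sdb2 and sdd2.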
Here is the output of lspci:
truenas_admin@truenas[~]$ lspci
00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 07)
00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #5 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0)
00:1f.0 ISA bridge: Intel Corporation H370 Chipset LPC/eSPI Controller (rev 10)
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9B1 (rev 02)
truenas_admin@truenas[~]$
The following commands also didn’t work:
sudo zpool import -f -o readonly=on pool2
sudo zpool import -F -n pool2
sudo zpool import -fFX pool2   # Unfortunately, using -X didn't work either
Here is the output from zdb:
truenas_admin@truenas[~]$ zdb
zsh: command not found: zdb
truenas_admin@truenas[~]$ sudo zdb
boot-pool:
version: 5000
name: 'boot-pool'
state: 0
txg: 1336044
pool_guid: 1713755422868669731
errata: 0
compatibility: 'grub2'
hostname: '(none)'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 1713755422868669731
create_txg: 4
children[0]:
type: 'disk'
id: 0
guid: 2047319333662842187
path: '/dev/nvme0n1p3'
whole_disk: 0
metaslab_array: 256
metaslab_shift: 32
ashift: 12
asize: 511565365248
is_log: 0
DTL: 6964
create_txg: 4
com.delphix:vdev_zap_leaf: 129
com.delphix:vdev_zap_top: 130
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
pool2:
version: 5000
name: 'pool2'
state: 0
txg: 31418510
pool_guid: 3548847501627557152
errata: 0
hostid: 938851210
hostname: 'truenas'
com.delphix:has_per_vdev_zaps
vdev_children: 1
vdev_tree:
type: 'root'
id: 0
guid: 3548847501627557152
create_txg: 4
children[0]:
type: 'mirror'
id: 0
guid: 11631572484601031782
metaslab_array: 40
metaslab_shift: 36
ashift: 12
asize: 11997952606208
is_log: 0
create_txg: 4
com.delphix:vdev_zap_top: 36
children[0]:
type: 'disk'
id: 0
guid: 10626764901365296421
path: '/dev/disk/by-partuuid/55a4702b-04cc-11eb-8e7a-a0481c79a45e'
devid: 'ata-WDC_WD120EMFZ-11A6JA0_Z2KWYLNT-part2'
phys_path: 'pci-0000:00:14.0-usb-0:1:1.0-scsi-0:0:0:0'
whole_disk: 1
DTL: 167
create_txg: 4
com.delphix:vdev_zap_leaf: 37
children[1]:
type: 'disk'
id: 1
guid: 39938950378250001
path: '/dev/disk/by-partuuid/57c996c9-04cc-11eb-8e7a-a0481c79a45e'
devid: 'ata-WDC_WD120EMFZ-11A6JA0_QGHNUJHT-part2'
phys_path: 'pci-0000:00:14.0-usb-0:2:1.0-scsi-0:0:0:0'
whole_disk: 1
DTL: 166
create_txg: 4
com.delphix:vdev_zap_leaf: 38
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
ZFS_DBGMSG(zdb) START:
metaslab.c:1687:spa_set_allocator(): spa allocator: dynamic
metaslab.c:1687:spa_set_allocator(): spa allocator: dynamic
ZFS_DBGMSG(zdb) END
truenas_admin@truenas[~]$
I ran smartctl on both disks that pool2 uses (/dev/sdb and /dev/sdd), and neither report shows any errors.
truenas_admin@truenas[/var/log]$ sudo smartctl -a /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red (CMR)
Device Model: WDC WD120EMFZ-11A6JA0
Serial Number: Z2KWYLNT
LU WWN Device Id: 5 000cca 28df6fba7
Firmware Version: 81.00A81
User Capacity: 12,000,138,625,024 bytes [12.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database 7.3/6028
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Dec 24 05:54:44 2025 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 241) Self-test routine in progress...
10% of test remaining.
Total time to complete Offline
data collection: ( 101) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: (1270) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
2 Throughput_Performance 0x0004 135 135 054 Old_age Offline - 108
3 Spin_Up_Time 0x0007 081 081 001 Pre-fail Always - 400 (Average 374)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 240
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 100 100 001 Old_age Always - 0
8 Seek_Time_Performance 0x0004 133 133 020 Old_age Offline - 18
9 Power_On_Hours 0x0012 094 094 000 Old_age Always - 44052
10 Spin_Retry_Count 0x0012 100 100 001 Old_age Always - 0
12 Power_Cycle_Count 0x0032 097 097 000 Old_age Always - 240
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 2047
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 2047
194 Temperature_Celsius 0x0002 010 010 000 Old_age Always - 59 (Min/Max 20/67)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 44051 -
# 2 Short offline Completed without error 00% 44006 -
# 3 Extended offline Aborted by host 10% 44006 -
# 4 Short offline Completed without error 00% 44004 -
# 5 Short offline Completed without error 00% 43984 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The above only provides legacy SMART information - try 'smartctl -x' for more
truenas_admin@truenas[/var/log]$
And for /dev/sdd:
truenas_admin@truenas[/var/log]$ sudo smartctl -a /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Model Family: Western Digital Red (CMR)
Device Model: WDC WD120EMFZ-11A6JA0
Serial Number: QGHNUJHT
LU WWN Device Id: 5 000cca 29bd78ec5
Firmware Version: 81.00A81
User Capacity: 12,000,138,625,024 bytes [12.0 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 5400 rpm
Form Factor: 3.5 inches
Device is: In smartctl database 7.3/6028
ATA Version is: ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Dec 24 05:55:21 2025 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 101) seconds.
Offline data collection
capabilities: (0x5b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: (1196) minutes.
SCT capabilities: (0x003d) SCT Status supported.
SCT Error Recovery Control supported.
SCT Feature Control supported.
SCT Data Table supported.
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000b 100 100 001 Pre-fail Always - 0
2 Throughput_Performance 0x0004 135 135 054 Old_age Offline - 108
3 Spin_Up_Time 0x0007 081 081 001 Pre-fail Always - 405 (Average 373)
4 Start_Stop_Count 0x0012 100 100 000 Old_age Always - 229
5 Reallocated_Sector_Ct 0x0033 100 100 001 Pre-fail Always - 0
7 Seek_Error_Rate 0x000a 100 100 001 Old_age Always - 0
8 Seek_Time_Performance 0x0004 133 133 020 Old_age Offline - 18
9 Power_On_Hours 0x0012 094 094 000 Old_age Always - 44051
10 Spin_Retry_Count 0x0012 100 100 001 Old_age Always - 0
12 Power_Cycle_Count 0x0032 097 097 000 Old_age Always - 229
22 Helium_Level 0x0023 100 100 025 Pre-fail Always - 100
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 2035
193 Load_Cycle_Count 0x0012 100 100 000 Old_age Always - 2035
194 Temperature_Celsius 0x0002 016 016 000 Old_age Always - 55 (Min/Max 19/65)
196 Reallocated_Event_Count 0x0032 100 100 000 Old_age Always - 0
197 Current_Pending_Sector 0x0022 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0008 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x000a 100 100 000 Old_age Always - 0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 44048 -
# 2 Short offline Completed without error 00% 44005 -
# 3 Extended offline Aborted by host 10% 44005 -
# 4 Short offline Completed without error 00% 44002 -
# 5 Short offline Completed without error 00% 43982 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
The above only provides legacy SMART information - try 'smartctl -x' for more
truenas_admin@truenas[/var/log]$
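Since the full reports are long, here is a sketch of how the attributes I was mainly looking at (5, 197, 198 and 199) could be pulled out of them. `smart_raw` is a hypothetical helper name, and it assumes the attribute table layout shown above, where the raw value is the tenth column.

```shell
#!/bin/sh
# smart_raw: hypothetical helper (not part of smartmontools).
# Reads `smartctl -a` output on stdin and prints ID, name and raw value
# for attributes 5, 197, 198 and 199. The NF check skips the selective
# self-test table, whose rows also start with a bare number.
smart_raw() {
    awk '$1 ~ /^(5|197|198|199)$/ && NF >= 10 { print $1, $2, $10 }'
}

# Intended use on the live system (assumption, not run here):
#   sudo smartctl -a /dev/sdb | smart_raw
```

In both reports above, all four of these raw values are 0.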
I still get the same error message:
truenas_admin@truenas[~]$ sudo zpool import
pool: pool2
id: 3548847501627557152
state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
config:
pool2 FAULTED corrupted data
mirror-0 ONLINE
55a4702b-04cc-11eb-8e7a-a0481c79a45e ONLINE
57c996c9-04cc-11eb-8e7a-a0481c79a45e ONLINE
truenas_admin@truenas[~]$
I still don’t understand how this could happen. Running mirrored disks is supposed to be safe and robust against the failure of any one drive.
Is it really possible that both disks died at the same time (after a power event) while the other two pools stayed healthy?
I’ve been struggling with this for the last 3 days.
Can someone please help?