Unable to import a pool after power outage PLEASE HELP

Hi everyone,

After a power loss I’m unable to import a pool. Here is the error I get:

truenas_admin@truenas[~]$ sudo zpool import
  pool: pool2
    id: 3548847501627557152
 state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
	The pool may be active on another system, but can be imported using
	the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
config:

	pool2                                  FAULTED  corrupted data
	  mirror-0                                ONLINE
	    55a4702b-04cc-11eb-8e7a-a0481c79a45e  ONLINE
	    57c996c9-04cc-11eb-8e7a-a0481c79a45e  ONLINE
truenas_admin@truenas[~]$ 

The other two pools are OK.

Here is the output of lsblk

truenas_admin@truenas[~]$ sudo lsblk -bo NAME,MODEL,ROTA,PTTYPE,TYPE,START,SIZE,PARTTYPENAME,PARTUUID

NAME       MODEL                    ROTA PTTYPE TYPE   START           SIZE PARTTYPENAME             PARTUUID
sda        WDC WD30EZRX-00MMMB0        1 gpt    disk          3000558944256                          
├─sda1                                 1 gpt    part     128     2147483648 FreeBSD swap             3503569d-4456-11ea-b4b1-a0481c79a45e
└─sda2                                 1 gpt    part 4194432  2998411374592 FreeBSD ZFS              35491d87-4456-11ea-b4b1-a0481c79a45e
sdb        WDC WD120EMFZ-11A6JA0       1 gpt    disk         12000105070592                          
├─sdb1                                 1 gpt    part     128     2147483648 FreeBSD swap             55749886-04cc-11eb-8e7a-a0481c79a45e
└─sdb2                                 1 gpt    part 4194432 11997957500928 FreeBSD ZFS              55a4702b-04cc-11eb-8e7a-a0481c79a45e
sdc        WDC WD30EZRX-00MMMB0        1 gpt    disk          3000558944256                          
├─sdc1                                 1 gpt    part     128     2147483648 FreeBSD swap             326af7b1-4456-11ea-b4b1-a0481c79a45e
└─sdc2                                 1 gpt    part 4194432  2998411374592 FreeBSD ZFS              32b17b6a-4456-11ea-b4b1-a0481c79a45e
sdd        WDC WD120EMFZ-11A6JA0       1 gpt    disk         12000105070592                          
├─sdd1                                 1 gpt    part     128     2147483648 FreeBSD swap             5799ae68-04cc-11eb-8e7a-a0481c79a45e
└─sdd2                                 1 gpt    part 4194432 11997957500928 FreeBSD ZFS              57c996c9-04cc-11eb-8e7a-a0481c79a45e
sde        WDC WD80EMZZ-11B4FB0        1 gpt    disk          8001529315328                          
├─sde1                                 1 gpt    part     128     2147483648 FreeBSD swap             f8513eae-78fc-11ee-b0f4-a0481c79a45e
└─sde2                                 1 gpt    part 4194432  7999381745664 FreeBSD ZFS              f895589c-78fc-11ee-b0f4-a0481c79a45e
sdf        WDC WD80EMZZ-11B4FB0        1 gpt    disk          8001529315328                          
├─sdf1                                 1 gpt    part     128     2147483648 FreeBSD swap             f8727f08-78fc-11ee-b0f4-a0481c79a45e
└─sdf2                                 1 gpt    part 4194432  7999381745664 FreeBSD ZFS              f8b3e3d3-78fc-11ee-b0f4-a0481c79a45e
sr0        HL-DT-ST DVD+/-RW GU90N     0        rom              1073741312                          
nvme0n1    PM9B1 NVMe Samsung 512GB    0 gpt    disk           512110190592                          
├─nvme0n1p1
│                                      0 gpt    part    4096        1048576 BIOS boot                0020a027-5d98-4191-bcfe-0d0da7732eee
├─nvme0n1p2
│                                      0 gpt    part    6144      536870912 EFI System               60634f8b-c3b9-4632-a792-19ad460b1190
└─nvme0n1p3
                                       0 gpt    part 1054720   511570157056 Solaris /usr & Apple ZFS 07df006c-fe15-427d-b25f-ac41e4c7dcdf

Here is the output of lspci:

truenas_admin@truenas[~]$ lspci 


00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 07)
00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1c.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #5 (rev f0)
00:1d.0 PCI bridge: Intel Corporation Cannon Lake PCH PCI Express Root Port #9 (rev f0)
00:1f.0 ISA bridge: Intel Corporation H370 Chipset LPC/eSPI Controller (rev 10)
00:1f.3 Audio device: Intel Corporation Cannon Lake PCH cAVS (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)
02:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller PM9B1 (rev 02)
truenas_admin@truenas[~]$ 

The following commands also didn’t work:

sudo zpool import -f -o readonly=on pool2   # force a read-only import
sudo zpool import -F -n pool2               # dry run of a recovery-mode (rewind) import
sudo zpool import -fFX pool2                # forced extreme rewind; unfortunately using -X didn't work either

Here is the output from zdb:

truenas_admin@truenas[~]$ zdb
zsh: command not found: zdb
truenas_admin@truenas[~]$ sudo zdb
boot-pool:
    version: 5000
    name: 'boot-pool'
    state: 0
    txg: 1336044
    pool_guid: 1713755422868669731
    errata: 0
    compatibility: 'grub2'
    hostname: '(none)'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 1713755422868669731
        create_txg: 4
        children[0]:
            type: 'disk'
            id: 0
            guid: 2047319333662842187
            path: '/dev/nvme0n1p3'
            whole_disk: 0
            metaslab_array: 256
            metaslab_shift: 32
            ashift: 12
            asize: 511565365248
            is_log: 0
            DTL: 6964
            create_txg: 4
            com.delphix:vdev_zap_leaf: 129
            com.delphix:vdev_zap_top: 130
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data
pool2:
    version: 5000
    name: 'pool2'
    state: 0
    txg: 31418510
    pool_guid: 3548847501627557152
    errata: 0
    hostid: 938851210
    hostname: 'truenas'
    com.delphix:has_per_vdev_zaps
    vdev_children: 1
    vdev_tree:
        type: 'root'
        id: 0
        guid: 3548847501627557152
        create_txg: 4
        children[0]:
            type: 'mirror'
            id: 0
            guid: 11631572484601031782
            metaslab_array: 40
            metaslab_shift: 36
            ashift: 12
            asize: 11997952606208
            is_log: 0
            create_txg: 4
            com.delphix:vdev_zap_top: 36
            children[0]:
                type: 'disk'
                id: 0
                guid: 10626764901365296421
                path: '/dev/disk/by-partuuid/55a4702b-04cc-11eb-8e7a-a0481c79a45e'
                devid: 'ata-WDC_WD120EMFZ-11A6JA0_Z2KWYLNT-part2'
                phys_path: 'pci-0000:00:14.0-usb-0:1:1.0-scsi-0:0:0:0'
                whole_disk: 1
                DTL: 167
                create_txg: 4
                com.delphix:vdev_zap_leaf: 37
            children[1]:
                type: 'disk'
                id: 1
                guid: 39938950378250001
                path: '/dev/disk/by-partuuid/57c996c9-04cc-11eb-8e7a-a0481c79a45e'
                devid: 'ata-WDC_WD120EMFZ-11A6JA0_QGHNUJHT-part2'
                phys_path: 'pci-0000:00:14.0-usb-0:2:1.0-scsi-0:0:0:0'
                whole_disk: 1
                DTL: 166
                create_txg: 4
                com.delphix:vdev_zap_leaf: 38
    features_for_read:
        com.delphix:hole_birth
        com.delphix:embedded_data

ZFS_DBGMSG(zdb) START:
metaslab.c:1687:spa_set_allocator(): spa allocator: dynamic
metaslab.c:1687:spa_set_allocator(): spa allocator: dynamic
ZFS_DBGMSG(zdb) END
truenas_admin@truenas[~]$ 


I ran smartctl on both disks that pool2 is using, /dev/sdb and /dev/sdd, and still found nothing wrong.

truenas_admin@truenas[/var/log]$ sudo smartctl -a /dev/sdb
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red (CMR)
Device Model:     WDC WD120EMFZ-11A6JA0
Serial Number:    Z2KWYLNT
LU WWN Device Id: 5 000cca 28df6fba7
Firmware Version: 81.00A81
User Capacity:    12,000,138,625,024 bytes [12.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/6028
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Dec 24 05:54:44 2025 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 241)	Self-test routine in progress...
					10% of test remaining.
Total time to complete Offline 
data collection: 		(  101) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (1270) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   001    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   135   135   054    Old_age   Offline      -       108
  3 Spin_Up_Time            0x0007   081   081   001    Pre-fail  Always       -       400 (Average 374)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       240
  5 Reallocated_Sector_Ct   0x0033   100   100   001    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   001    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   133   133   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   094   094   000    Old_age   Always       -       44052
 10 Spin_Retry_Count        0x0012   100   100   001    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       240
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2047
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       2047
194 Temperature_Celsius     0x0002   010   010   000    Old_age   Always       -       59 (Min/Max 20/67)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     44051         -
# 2  Short offline       Completed without error       00%     44006         -
# 3  Extended offline    Aborted by host               10%     44006         -
# 4  Short offline       Completed without error       00%     44004         -
# 5  Short offline       Completed without error       00%     43984         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

truenas_admin@truenas[/var/log]$ 

and

truenas_admin@truenas[/var/log]$ sudo smartctl -a /dev/sdd
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Red (CMR)
Device Model:     WDC WD120EMFZ-11A6JA0
Serial Number:    QGHNUJHT
LU WWN Device Id: 5 000cca 29bd78ec5
Firmware Version: 81.00A81
User Capacity:    12,000,138,625,024 bytes [12.0 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5400 rpm
Form Factor:      3.5 inches
Device is:        In smartctl database 7.3/6028
ATA Version is:   ACS-2, ATA8-ACS T13/1699-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Wed Dec 24 05:55:21 2025 PST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82)	Offline data collection activity
					was completed without error.
					Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0)	The previous self-test routine completed
					without error or no self-test has ever 
					been run.
Total time to complete Offline 
data collection: 		(  101) seconds.
Offline data collection
capabilities: 			 (0x5b) SMART execute Offline immediate.
					Auto Offline data collection on/off support.
					Suspend Offline collection upon new
					command.
					Offline surface scan supported.
					Self-test supported.
					No Conveyance Self-test supported.
					Selective Self-test supported.
SMART capabilities:            (0x0003)	Saves SMART data before entering
					power-saving mode.
					Supports SMART auto save timer.
Error logging capability:        (0x01)	Error logging supported.
					General Purpose Logging supported.
Short self-test routine 
recommended polling time: 	 (   2) minutes.
Extended self-test routine
recommended polling time: 	 (1196) minutes.
SCT capabilities: 	       (0x003d)	SCT Status supported.
					SCT Error Recovery Control supported.
					SCT Feature Control supported.
					SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000b   100   100   001    Pre-fail  Always       -       0
  2 Throughput_Performance  0x0004   135   135   054    Old_age   Offline      -       108
  3 Spin_Up_Time            0x0007   081   081   001    Pre-fail  Always       -       405 (Average 373)
  4 Start_Stop_Count        0x0012   100   100   000    Old_age   Always       -       229
  5 Reallocated_Sector_Ct   0x0033   100   100   001    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000a   100   100   001    Old_age   Always       -       0
  8 Seek_Time_Performance   0x0004   133   133   020    Old_age   Offline      -       18
  9 Power_On_Hours          0x0012   094   094   000    Old_age   Always       -       44051
 10 Spin_Retry_Count        0x0012   100   100   001    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   097   097   000    Old_age   Always       -       229
 22 Helium_Level            0x0023   100   100   025    Pre-fail  Always       -       100
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       2035
193 Load_Cycle_Count        0x0012   100   100   000    Old_age   Always       -       2035
194 Temperature_Celsius     0x0002   016   016   000    Old_age   Always       -       55 (Min/Max 19/65)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0022   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0008   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x000a   100   100   000    Old_age   Always       -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     44048         -
# 2  Short offline       Completed without error       00%     44005         -
# 3  Extended offline    Aborted by host               10%     44005         -
# 4  Short offline       Completed without error       00%     44002         -
# 5  Short offline       Completed without error       00%     43982         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

truenas_admin@truenas[/var/log]$ 

I still get the same error message:

truenas_admin@truenas[~]$ sudo zpool import
  pool: pool2
    id: 3548847501627557152
 state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
	The pool may be active on another system, but can be imported using
	the '-f' flag.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
config:

	pool2                                  FAULTED  corrupted data
	  mirror-0                                ONLINE
	    55a4702b-04cc-11eb-8e7a-a0481c79a45e  ONLINE
	    57c996c9-04cc-11eb-8e7a-a0481c79a45e  ONLINE
truenas_admin@truenas[~]$ 

I still don’t understand how this could happen. Running mirrored disks is supposed to be safe and robust against issues in any one of the drives.

Is it really possible that both disks died at the same time (after a power event), while at the same time the other two pools are healthy…

I’ve been struggling with this for the last three days…

Can someone please help???

Welcome to the TrueNAS forums!

I can’t really answer how to fix this; it seems you’ve already done a lot to attempt recovery.

Perhaps someone else will have a suggestion.

However, if I am reading the ZDB output correctly, both your disks in “pool2” are wired via USB:

                phys_path: 'pci-0000:00:14.0-usb-0:1:1.0-scsi-0:0:0:0'
                phys_path: 'pci-0000:00:14.0-usb-0:2:1.0-scsi-0:0:0:0'

This is likely the root cause of the metadata corruption.

Many times USB-attached disks seem to work, except when they don’t, especially during a power failure.
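
If you want to double-check which transport each disk is on, lsblk can show it directly; this is just a quick sanity check, not something taken from your output above:

sudo lsblk -o NAME,MODEL,TRAN
# TRAN reads "usb" for the USB-attached disks and "sata"/"nvme" for the rest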


Here is something I wrote up on ZFS and power fails:

Hi Arwen,

Thank you for sharing the above. Very interesting and indeed logical.

To answer your questions: yes, both disks are USB external drives, but so are the other four, which work perfectly.

Would it be safe to plug the disks into different USB ports? Would that rule out the possibility that two USB ports became faulty due to the power loss?

Should I shut down the machine, swap the ports, and boot again? Is that considered safe enough?

I’m honestly running out of ideas.

Hi everyone.

I managed to make some progress.

I disabled data and metadata verification and imported the pool:

# skip the block-verification traversal that an extreme-rewind import normally performs
echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data

echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata

# forced import with recovery mode and extreme transaction rewind
sudo zpool import -fFX pool2

After this the pool became available and it showed up in the GUI, but when I click on its one and only dataset I get the following error:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/api/base/server/ws_handler/rpc.py", line 323, in process_method_call
    result = await method.call(app, params)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/api/base/server/method.py", line 52, in call
    result = await self.middleware.call_with_audit(self.name, self.serviceobj, methodobj, params, app)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 911, in call_with_audit
    result = await self._call(method, serviceobj, methodobj, params, app=app,
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 731, in _call
    return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 624, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/api/base/decorator.py", line 101, in wrapped
    result = func(*args)
             ^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/filesystem.py", line 393, in stat
    raise CallError(f'Path {_path} not found', errno.ENOENT)
middlewared.service_exception.CallError: [ENOENT] Path /pool2/TV not found


A scrub is under way… 48% done.
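
Progress can be watched with the usual status command:

sudo zpool status -v pool2
# shows scrub progress, an estimate of time remaining, and any errors found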

But what is very strange… when I list the contents of the pool:

ls /mnt/pool2

I get NOTHING.

Empty!!!

That’s the problem with a lot of non-recommended setups: they’ll work just fine initially. The recommendations are not just about what works, but about what has the best chance of recovery when something goes wrong.

Completely useless information in terms of helping you recover anything right now, but maybe consider an HBA in the near future to reduce the chance of this happening again.


Connecting drives via USB is generally considered risky for ZFS, especially in the event of a power loss.
Many USB-to-SATA bridge chips use internal write buffering and may acknowledge write completion before the data has actually been committed to stable storage. In other words, they do not always strictly honor flush/FUA semantics. In a power-loss scenario, this can result in inconsistent on-disk metadata.

This behavior effectively breaks the assumptions ZFS relies on. ZFS’s copy-on-write design depends on the guarantee that completed writes are truly durable. If the underlying device reports writes as complete when they are still buffered, ZFS is misled about the on-disk state, which undermines one of its core data-integrity mechanisms.

In theory, connecting each drive of a mirror to a different USB controller or hub could reduce the likelihood of both sides being affected simultaneously, and in some cases it might help. However, this is not something I would rely on for a ZFS pool, as the fundamental reliability issues of USB storage bridges remain.
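
If you want to see how a given bridge exposes the drive’s write cache, you can query it from the host; whether a USB bridge actually passes these commands through varies by chipset, so treat this only as a rough check:

sudo smartctl -g wcache /dev/sdb   # reports whether the drive's volatile write cache is enabled
sudo hdparm -W /dev/sdb            # same check via hdparm, if it is installed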


Sorry, but saying they worked perfectly is meaningless. Yes, they can work fine, even for years. Some people never have trouble with ZFS on USB-attached disks. I personally use USB-attached disks, WITH ZFS, for my backups. However, those backup disks are not permanently connected, only for the duration of the backup.

This is all about chance events, including writes that may have been occurring during the power failure. Using reliable hardware (i.e., not USB) means you get reliable behavior.

Normally ZFS won’t care which ports on a server its storage is attached to. However, the word “safe” does not really apply to ZFS on USB-attached disks at all.

This is normal because you imported the pool from the command line without the additional option of -R /mnt. Now you should wait until the scrub is complete before making any further changes.

The various errors you listed are likely a result of leaving off -R /mnt for the import. In general, using the command line for major changes, like a pool import, does confuse the TrueNAS GUI. General practice is to import the pool from the command line, fix any problems, and then export the pool from the command line. After that, the GUI pool import will work fine.
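
In other words, the command-line sequence would look roughly like this, using the pool name from this thread:

sudo zpool import -f -R /mnt pool2   # -R sets the altroot so the datasets mount under /mnt, where the GUI expects them
# ... check the data, let the scrub finish ...
sudo zpool export pool2              # export cleanly, then import again through the GUI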

Sorry I don’t have better news. It is just that USB-attached disks were never designed to be data-center reliable, but ZFS was designed to be fully Enterprise Data Center reliable.


Looks like the scrub did not find any issues at all.

I rebooted the machine

and everything works as before.

For anyone experiencing the same problem, the trick was to:

echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data

echo 0 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata

sudo zpool import -fFX pool2

let the scrub run (in my case about 4.5 hours for a 12 TB pool) and then export:

sudo zpool export pool2

echo 1 | sudo tee /sys/module/zfs/parameters/spa_load_verify_data

echo 1 | sudo tee /sys/module/zfs/parameters/spa_load_verify_metadata

and reboot!

After that, everything should work.

PS

Don’t forget to set the validation flags back to 1.
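
A quick way to confirm they are back to their defaults:

cat /sys/module/zfs/parameters/spa_load_verify_data
cat /sys/module/zfs/parameters/spa_load_verify_metadata
# both should print 1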


Congrats man, with all of our doom & gloom, you still got it working!
