I’m running FreeNAS-11.3-U5 with 4x 2 TB drives; only 5.6 TB was in use.
One drive failed and is reported as removed on the pool. I tried to replace it with a new clean drive already in the system, but I’m getting the error "Error: [EZFS_POOLUNAVAIL] pool I/O is currently suspended".
I tried rebooting the system; the failed drive comes online for about a minute and then the pool goes UNAVAIL again.
If I keep the failed drive connected I lose access to the UI, which is weird.
I powered off the server and tried to replace the failed drive with the new one, and now I get a pool UNKNOWN message; status shows nothing.
I’ve been trying to fix it for over a day now. I tried everything I could find on this and other forums, so far no luck.
Funny thing is that if I connect the failed drive to my Windows machine I can see the disk in Disk Management, and using a data recovery tool I can see and browse the ZFS partition (it appears empty).
Any suggestions will be greatly appreciated; I’m running out of ideas.
zpool status -v output when failed disk is connected:
root@freenas:~ # zpool status -v
pool: NasStorage
state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
scan: resilvered 0 in 0 days 00:00:08 with 0 errors on Sat Dec 27 11:32:46 2025
config:
It would appear as if you have a 4-disk stripe, i.e. a pool with no redundancy.
With a striped pool, if one drive fails it’s all over. The pool is lost and any data on it is unusable.
The only way you can recover from this is if you somehow manage to get your failed drive back up and running again. Tinkering with the drive at this stage is risky because it’s easy to make things worse, but if you do get it running, your priority should be to make an exact clone of it before it dies permanently.
Thank you for your answer; I was afraid that would be the case.
The data wasn’t too important, which is why I set it up this way a few years back. But recently I temporarily shifted some files there that I would like to get back, and the disk failed before I could copy them back.
I will try to get the failed disk online; perhaps replacing the drive’s electronics (PCB) will help.
Replacing the electronics is beyond my skill set. I recommend trying a different cable and maybe a different port first; those are the relatively easy troubleshooting steps. Typically you would see checksum errors if it’s cable-related, but you won’t know until you try…
I think it’s not a good idea to connect a ZFS disk, or a disk that was part of a ZFS pool, to a Windows system. Windows might write metadata like System Volume Information or $Recycle.Bin, which can overwrite blocks ZFS uses for pool metadata and potentially corrupt the pool.
If it is not a cable-related issue but a failing disk, you could try to clone it first using ddrescue on a Linux system and work with the clone instead of the original.
The total time depends on the drive size and read speed, but the rate you’re seeing is quite low.
If 13 percent took 14 hours, a full copy could easily take 100-plus hours.
If you use ddrescue with a logfile, for example by creating it first with
sudo touch /root/ddrescue.log
and then running (for example)
sudo ddrescue -d -r3 -f /dev/sda /dev/sdb /root/ddrescue.log
you can stop the process at any time with Ctrl+C. When you restart with the same command and logfile, ddrescue should continue where it left off, only retrying the remaining or problematic blocks. But as @neofusion already stated, let it run.
Note:
If possible, try to ensure good cooling for the HDD. If the disk is really failing, you want to provide the best conditions possible to avoid further damage during the rescue.
That’s great, I will leave it running for now. The HDD I’m trying to recover is 2 TB and I’m cloning it to another drive; perhaps cloning to an image would be faster? This is the command I used.
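For reference, cloning to an image file is also possible; a hedged sketch (the device name and paths below are examples, not taken from this thread):

```shell
# Clone the failing disk to an image file instead of a second disk.
# /dev/sda is the failing source here; double-check yours before running.
# The destination filesystem needs at least as much free space as the
# disk being cloned (2 TB in this case).
sudo ddrescue -d -r3 /dev/sda /mnt/rescue/failed-disk.img /mnt/rescue/failed-disk.map
```

Note that -f is only required when the destination is a block device, so it can be dropped when writing to an image. Throughput is usually limited by the failing source drive, so an image is not necessarily faster than a disk-to-disk clone.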
The logfile should not become very large; its size mainly depends on the number of errors found. The -d parameter enables direct disk access. As far as I know, -d is recommended for failing or dying disks, as it bypasses the OS cache and lets ddrescue read blocks more accurately, at the cost of making the overall process slower, since caching and readahead are skipped.
Let’s see what kind of miracle ddrescue can achieve… in the end, it all comes down to what can be rescued 1:1 and whether ZFS can reassemble the pool from it.
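If the clone succeeds, one cautious next step would be a read-only import, so ZFS writes nothing to a possibly damaged pool. A sketch, assuming the cloned disk is attached in place of the failed one (the pool name comes from the zpool status output earlier in the thread):

```shell
# Import the pool read-only so nothing is written to it.
zpool import -o readonly=on NasStorage
zpool status -v NasStorage
# If the import works, copy the important files off first,
# then decide whether to rebuild the pool from scratch.
```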