TrueNAS Core: Pool Offline and Disks automatically removed from the Pool

Hi,
My technical skills (specifically in Linux/Unix type systems) are almost zero. Therefore, I’ll need you to be very patient with me, as I may ask questions that seem very basic :-).

Original Issue: I have 4 (Seagate 4TB) disks in my TrueNAS system. A few weeks ago I got errors about a degraded pool. After reading a few posts, I felt that changing the SATA cable would be the way to go, so I bought a new cable and replaced it. The system started to work fine, and all 4 disks (and the pool) were online. However, for the last few days I haven’t been able to access my data (from my Windows laptop). So I logged in through the web interface and saw that the pool is offline and one of the disks is missing from the disk list. The other 3 disks are also shown as not being part of any pool. After that, I did a lot of Google searching and reading, and tried a few solutions (ones that seemed unlikely to cause data loss). After trying many of those solutions, the pool status is still offline, and the disks are still not part of the pool.

Current Status of the Issue: Among all the solutions, one by @Arwen worked for me to some extent: when I run the command below, the missing (4th) disk shows up. However, after a while the disk disappears from the list again and the console shows the same error 6 messages.
The pool still remains offline, and the disks still show as not being part of any pool (see the screenshots).
Below is the command that helped bring the 4th disk back into the list of disks:

zpool import -Fn -R /mnt All4TBDisks

When the above command is executed, the console shows a message like
solaris: an uncorrectable I/O failure …

About the system: This is a PC-based system that was built by an IT person in my area. It is almost a year old and had been working perfectly fine. The person is still contactable, but he wants to charge too much to help me fix the issue without data loss. My data is important and I need it recovered, so I need to get this fixed. I thought I’d give it a try myself before paying him.

Please note that I don’t have any backup either :-).

System Specs: I am not sure how to get the detailed specs of the system, but whatever I could get from the GUI is either in the screenshots or pasted below:

It is a normal tower PC.
CPU: Intel(R) Core™ i5-6500 CPU @ 3.20GHz
RAM: 8 GB

TrueNAS:
TrueNAS Core
Version: TrueNAS-13.0-U6.1

Pool screenshot attached
Disks screenshot attached: To my knowledge, the OS is installed on a SATA 2.5 inch SSD, while the 4 NAS disks are Seagate RED 3.5 inch SATA disks.

root@HomeNAS[~]# zpool status -v
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:15 with 0 errors on Tue Apr 23 03:45:15 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE    

errors: No known data errors
root@HomeNAS[~]# zpool import
   pool: All4TBDisks
     id: 14329411509399635582
  state: DEGRADED
status: One or more devices are missing from the system.
 action: The pool can be imported despite missing or damaged devices.  The
        fault tolerance of the pool may be compromised if imported.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-2Q
 config:

*** for some reason copy from shell isn't copying the pool and disks list, added it as screenshot***

I also see the following 2 alerts in the GUI (whenever I restart the TrueNAS system):

### CRITICAL
#### Pool All4TBDisks state is OFFLINE: None

### WARNING
#### New ZFS version or feature flags are available for pool(s) All4TBDisks. Upgrading pools is a one-time process that can prevent rolling the system back to an earlier TrueNAS version. It is recommended to read the TrueNAS release notes and confirm you need the new ZFS feature flags before upgrading a pool.

Looking forward to any help from this forum. Please let me know if you need additional information. I’d appreciate it if you could send me specific commands to run (for collecting the information you require) or guide me through the GUI to get the required information.

Thank you.





Adding a screenshot of the disk list. This is how the GUI shows the disks. The 4th disk only appears after running the command I mentioned in my original post.

That command won’t actually import the pool. The “n” option only lets you check whether the pool could be imported.

ZFS does not automatically import pools with missing / UNAVAIL disks, thus the need for a force option. However, you may try the lower case “f” force option, which won’t discard the most recent writes.
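
To make the difference between the options concrete, here is a minimal sketch of the sequence, using the pool name and `/mnt` altroot from this thread. The `RUN` variable and the `import_steps` function are only a convenience added for this example, not part of ZFS: by default the script just prints the commands, so nothing is changed by accident.

```shell
#!/bin/sh
# Sketch only. By default every command is just printed (RUN=echo).
# Run as root with RUN="" in the environment to really execute them.
POOL="All4TBDisks"      # pool name taken from this thread
RUN="${RUN-echo}"       # hypothetical safety switch, not a ZFS feature

import_steps() {
    # 1. List pools that are visible but not imported. This changes nothing.
    $RUN zpool import

    # 2. Dry run of a recovery import: -F is recovery mode (it may discard the
    #    last few transactions), and -n only reports whether that would work.
    $RUN zpool import -Fn -R /mnt "$POOL"

    # 3. Forced import with lower case -f: no rewind, so recent writes are kept.
    $RUN zpool import -f -R /mnt "$POOL"
}

import_steps
```

Reading the output of step 2 before attempting step 3 is the cautious order, especially with no backup.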

As for why your disk is giving you trouble, I don’t know. Drives do fail, and on rare occasions they fail early in their normal life span, just like there are outliers that live way past 5 years without trouble.

Hi @Arwen ,
Thank you for taking the time to help. Before I proceed with this option, may I know what will happen if the 4th disk still doesn’t come up? Will my data become available? Given that the pool will be running on 3 disks, will that cause any issues?

Another question: how can I remove this faulty disk and replace it with another one? If I just shut down the system, swap the faulty disk for another 4TB disk (I have a spare 4TB SATA SSD that I can use) and turn the system back on, will the system take care of everything automatically (formatting, writing data to this disk as it should, etc.), or do I need to follow certain steps? Any guide, link or list of steps would be a great help.

I’ll probably start another thread, once the data is accessible, on how to move these disks to another NAS (maybe Synology or QNAP). I like TrueNAS because I am running it on a PC, which gives me excellent performance for a fraction of the price of those out-of-the-box NAS solutions. But given that I don’t have much of a clue about the TrueNAS OS (or Linux/UNIX type operating systems), I feel I am taking an unnecessary risk for the sake of higher performance and flexibility.

Your pool is RAIDZ1, so your data will still be intact.
You should be able to do the following:

  • Swap out the faulty disk
  • Boot into TN and see if you are able to import/see the pool via GUI
  • If yes, go into “Manage Devices” select the missing disk and click “Replace”
  • If not, force it via CLI, zpool import -f -R /mnt All4TBDisks and check again.
  • If you still can’t see it via GUI, replace the disk via CLI, zpool replace All4TBDisks [OLD_ID] [NEW_DISK (/dev/sdx)] - You should be able to get the ID from zpool status All4TBDisks -v after the disk has been disconnected.
  • After the resilver process has finished, export the pool and reimport it via GUI (zpool export All4TBDisks) so that middleware can do its thing.
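
The CLI part of the steps above can be sketched roughly as follows. This is an illustration under assumptions, not a confirmed procedure for CORE: `OLD_ID` and `NEW_DISK` are placeholders that have to be filled in from the real `zpool status` output, and the `RUN` switch and `replace_steps` function are invented for this example so that the commands are only printed by default.

```shell
#!/bin/sh
# Sketch only. By default every command is just printed (RUN=echo).
# Run as root with RUN="" in the environment to really execute them.
POOL="All4TBDisks"      # pool name taken from this thread
OLD_ID="OLD_ID"         # placeholder: the failed member as shown by zpool status
NEW_DISK="NEW_DISK"     # placeholder: the replacement device
RUN="${RUN-echo}"       # hypothetical safety switch, not a ZFS feature

replace_steps() {
    # Find the identifier of the missing or faulted member.
    $RUN zpool status -v "$POOL"

    # Ask ZFS to rebuild the old member's data onto the new disk (resilver).
    $RUN zpool replace "$POOL" "$OLD_ID" "$NEW_DISK"

    # Check progress; repeat this until the resilver is reported as finished.
    $RUN zpool status -v "$POOL"

    # Export the pool so it can be imported again through the GUI.
    $RUN zpool export "$POOL"
}

replace_steps
```

Replacing through the GUI, when it works, is usually preferable because the middleware then knows about the new disk from the start.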

This should work for SCALE at least, @Arwen does this look right for CORE? I’ve no experience with it.

EDIT: I wouldn’t run with this straight away, just in case. Let someone more knowledgeable with CORE confirm first :)

Thank you @essinghigh for such detailed steps. As you mentioned, I’ll wait for Arwen or someone else to also confirm before I try this.

The documentation should have details on how to replace a disk in TrueNAS (Core or SCALE). I’ve not done it in TrueNAS before, only with ZFS in Solaris or Linux via the command line. If you have trouble with the documentation, that in itself is a good reason to work through it: you can make it better for your future self and others by requesting updates to the disk replacement documentation.

But, yes, you should be able to both import your pool with a failed disk and access your data even without that failed disk (the whole reason for RAID-Z1).

However, you will be running with no redundancy until the disk is replaced. Your plan to replace the failed HDD with an SSD might work. On rare occasions a replacement drive, like your SSD, might be a tiny bit smaller and thus not suitable for replacing an HDD. You’d have to run a command to check the sizes.
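
As a rough illustration of such a size check on TrueNAS CORE (which is FreeBSD based), the sketch below reads the raw size of a disk with `diskinfo -v` and compares two byte counts. The helper names `disk_bytes` and `fits`, and the device names in the example, are made up for this sketch. Whole-disk size is only a first check, since the pool members are partitions on the disks. If the failed disk cannot be read, one of the healthy 4TB disks can stand in for it.

```shell
#!/bin/sh
# disk_bytes DEVICE: print the raw size of a disk in bytes.
# Relies on FreeBSD's diskinfo, so it only works on TrueNAS CORE / FreeBSD.
disk_bytes() {
    diskinfo -v "$1" | awk '/mediasize in bytes/ { print $1 }'
}

# fits OLD_BYTES NEW_BYTES: say whether the new disk is at least as large
# as the old one.
fits() {
    if [ "$2" -ge "$1" ]; then
        echo "ok"
    else
        echo "too small"
    fi
}

# Example on the NAS itself (/dev/ada1 and /dev/ada4 are placeholder names
# for an existing pool disk and the new SSD):
#   fits "$(disk_bytes /dev/ada1)" "$(disk_bytes /dev/ada4)"
```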

@Arwen,
I tried with the “f” option, but I get an error saying that the pool cannot be imported.


root@HomeNAS[~]# zpool import -f -R /mnt All4TBDisks
cannot import 'All4TBDisks': one or more devices is currently unavailable

I checked the disks and it shows only 3 of them. The 4th disk isn’t visible in the GUI.

After restarting the system and running the command with “f” again, now all 4 disks are visible. However, they are still not attached to the pool (that is, the pool still appears as “N/A”).

Any ideas what else I can try?

I am thinking of temporarily setting up a new PC with a fresh install of TrueNAS and plugging the NAS disks into it. Will the new TrueNAS pick up the disks automatically, or will I still need to do some configuration there to use the same disks and data?

@Arwen an update: after restarting a few times, the zpool import command with “f” seems to work.

In the shell, it shows everything as online.

However, in GUI, the pool is still offline and the disks are still not attached to the pool.

This may be because you’re importing via CLI, so middleware may not be aware of the pool import.
It’s interesting that all disks are showing online. Can you show the output of zpool status -v?

@essinghigh here is the output of zpool status -v

root@HomeNAS[~]# zpool status -v
  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:15 with 0 errors on Tue Apr 23 03:45:15 2024
config:

not sure why the copy doesn’t copy entire output :-(. Attaching screenshot of the output.

Oh I completely missed this, the pool isn’t actually imported.
action: The pool can be imported using its name or numeric identifier

Can you try importing via the GUI (or if this doesn’t work, zpool import -R /mnt All4TBDisks)? If it doesn’t work, try forcing it again, though it shouldn’t be needed if all disks are reporting online.

GUI does not show the pool during import, so I cannot select anything.

I tried with the command line and am getting the error
“cannot import 'All4TBDisks': I/O error”

Screenshot attached: truenas_zpool_import_error_202405190955

I tried the numeric ID with -R, which didn’t work, so I am now trying with the -f option. It is doing something; I will wait for a bit to see if there is any message.

I’d suggest giving it all the time it wants if it’s doing something.

For sure give it time. If you manage to get it imported first thing I’d do is try and move any critical data off via SFTP (as you don’t have any backups).
Then you can try replacing the disk that appears to be having issues (7374a1f2).

zpool offline All4TBDisks gptid/xxxx
zpool replace All4TBDisks gptid/xxx gptid/xxx_new

I plugged a monitor into the TrueNAS PC and am checking there, but nothing (new) is being displayed. Is there a way to find out if something is happening, like a log somewhere?

You could ssh in and run top.
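
Beyond top, a few read-only commands can hint at whether the import is still making progress. This is a sketch that assumes TrueNAS CORE (FreeBSD) and a root shell over SSH. None of these commands report import progress directly, they only show whether the process exists and whether the disks are busy.

```shell
# Is a "zpool import" process still alive? The [z] stops grep from
# matching its own command line.
ps -axww | grep '[z]pool import'

# Live I/O statistics for the physical disks only (-p). Steady reads on the
# pool disks suggest the import is still working. Quit with q or Ctrl-C.
gstat -p

# Recent kernel messages. New disk or controller errors show up at the end.
dmesg | tail -n 20
```

If the disks show no activity at all for a long time and `dmesg` keeps printing new errors for one drive, that points back at the suspect disk or its cabling.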

Thank you. I see lots of python3 commands, and nothing I can make out that is related to the zpool import command.

I am hoping something is happening :)