I’m quite new to TrueNAS and Linux in general, and I’ve run into a very serious problem.
Sorry if I’m posting in the incorrect area - I’m a noob here. Thanks in advance.
I run TrueNAS-SCALE-23.10.2
CPU: i5-9400
RAM: 16GB DDR4
GPU: GT 1030
Pool: 2x 500GB disks and 1x 1TB disk
It was configured as RAID5.
Last night I went to access my Samba server and found that I couldn’t. I thought the server was off, but it was on, so I went to the web panel and it said there was no pool created. In the Storage dashboard my pool appeared, but under “offline VDEVs”. I kept looking and saw that someone had solved a similar problem by exporting the pool and then importing it. That didn’t work for me; I get the error “[EZFS_IO] Failed to import ‘data’ pool: cannot import ‘data’ as ‘data’: I/O error”. Trying more commands, I remember one told me that the pool metadata was corrupt and sent me to an OpenZFS link about the error “ZFS-8000-72”. I haven’t been able to do much more. I need to recover the data, since there are very important memories on it for me.
Problems like this have been reported with the very latest point releases of 24.04 and 24.10, but not, as far as I am aware, with 23.10.2.
The error message you got was probably from running sudo zpool import and TBH it is NOT a good sign. If your pool has metadata errors, the chances are that you will need to destroy and recreate it and then restore from backup. That said, it may be possible to revert to an earlier commit point or revert to an earlier snapshot for the datasets that have metadata errors or (most likely) to mount it read-only so you can copy your data elsewhere. But we will need to understand the details of your issue before we can help you.
Please run the following diagnostic commands and copy and paste the results (in between lines containing just ``` which will preserve the formatting):
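Something along these lines (the exact lsblk options are not critical, as long as the partition UUIDs are included):

```
# exact lsblk columns are flexible, as long as partition UUIDs are shown
lsblk -bo NAME,MODEL,PTTYPE,TYPE,START,SIZE,PARTTYPENAME,PARTUUID
lspci
sas2flash -list
sas3flash -list
sudo zpool status -v
sudo zpool import
cli -c "system version"
```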
Whatever you do, do NOT run any commands that will make any changes to the disks in question, as this could make things worse and reduce the chances of recovering your data.
Thanks.
P.S. If you are new to TrueNAS, roughly speaking when did you install it and why did you install 23.10?
```
NAME MODEL PTTYPE TYPE START SIZE PARTTYPENAME PARTUUID
sda TOSHIBA MQ01ABD100 gpt disk 1000204886016
├─sda1 gpt part 128 2147483648 FreeBSD swap b22cef3e-ce76-11ee-9fb8-d45d64208664
└─sda2 gpt part 4194432 998057316352 FreeBSD ZFS b2639d11-ce76-11ee-9fb8-d45d64208664
sdb WDC WD5000AAKX-22ERMA0 gpt disk 500107862016
├─sdb1 gpt part 128 2147483648 FreeBSD swap b226e7fc-ce76-11ee-9fb8-d45d64208664
└─sdb2 gpt part 4194432 497960292352 FreeBSD ZFS b25df2bc-ce76-11ee-9fb8-d45d64208664
sdc EMTEC X250 256GB gpt disk 256060514304
├─sdc1 gpt part 40 272629760 EFI System 3bcb9985-ce6f-11ee-b719-d45d64208664
├─sdc2 gpt part 34086952 238605565952 FreeBSD ZFS 3bd3eff9-ce6f-11ee-b719-d45d64208664
└─sdc3 gpt part 532520 17179869184 FreeBSD swap 3bd0a529-ce6f-11ee-b719-d45d64208664
└─sdc3 crypt 17179869184
sdd Hitachi HDS721050CLA362 gpt disk 500107862016
├─sdd1 gpt part 128 2147483648 FreeBSD swap b2112368-ce76-11ee-9fb8-d45d64208664
└─sdd2 gpt part 4194432 497960292352 FreeBSD ZFS b24ce596-ce76-11ee-9fb8-d45d64208664
sde Portable SSD gpt disk 1000204886016
└─sde1 gpt part 2048 1000202788864 Microsoft basic data 92d20b67-e82f-4944-abca-2da5e20fd9f3```
```root@truenas:/# lspci
00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 0d)
00:01.0 PCI bridge: Intel Corporation 6th-10th Gen Core Processor PCIe Controller (x16) (rev 0d)
00:14.0 USB controller: Intel Corporation 200 Series/Z370 Chipset Family USB 3.0 xHCI Controller
00:16.0 Communication controller: Intel Corporation 200 Series PCH CSME HECI #1
00:17.0 SATA controller: Intel Corporation 200 Series PCH SATA controller [AHCI mode]
00:1c.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #5 (rev f0)
00:1c.7 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #8 (rev f0)
00:1d.0 PCI bridge: Intel Corporation 200 Series PCH PCI Express Root Port #11 (rev f0)
00:1f.0 ISA bridge: Intel Corporation Device a2ca
00:1f.2 Memory controller: Intel Corporation 200 Series/Z370 Chipset Family Power Management Controller
00:1f.3 Audio device: Intel Corporation 200 Series PCH HD Audio
00:1f.4 SMBus: Intel Corporation 200 Series/Z370 Chipset Family SMBus Controller
01:00.0 VGA compatible controller: NVIDIA Corporation GP108 [GeForce GT 1030] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GP108 High Definition Audio Controller (rev a1)
03:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 15)```
```root@truenas:/# sas2flash -list
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved
No LSI SAS adapters found! Limited Command Set Available!
ERROR: Command Not allowed without an adapter!
ERROR: Couldn't Create Command -list
Exiting Program.```
```root@truenas:/# sas3flash -list
Avago Technologies SAS3 Flash Utility
Version 16.00.00.00 (2017.05.02)
Copyright 2008-2017 Avago Technologies. All rights reserved.
No Avago SAS adapters found! Limited Command Set Available!
ERROR: Command Not allowed without an adapter!
ERROR: Couldn't Create Command -list
Exiting Program.```
```root@truenas:/# sudo zpool status -v
pool: boot-pool
state: ONLINE
status: One or more features are enabled on the pool despite not being
requested by the 'compatibility' property.
action: Consider setting 'compatibility' to an appropriate value, or
adding needed features to the relevant file in
/etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d.
scan: scrub repaired 0B in 00:00:26 with 0 errors on Fri Nov 8 03:45:27 2024
config:
NAME STATE READ WRITE CKSUM
boot-pool ONLINE 0 0 0
ata-EMTEC_X250_256GB_A2205CW03290-part2 ONLINE 0 0 0
errors: No known data errors```
```root@truenas:/# sudo zpool import
pool: data
id: 15103091714514370022
state: FAULTED
status: The pool metadata is corrupted.
action: The pool cannot be imported due to damaged devices or data.
The pool may be active on another system, but can be imported using
the '-f' flag.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-72
config:
data FAULTED corrupted data
b25df2bc-ce76-11ee-9fb8-d45d64208664 ONLINE
b24ce596-ce76-11ee-9fb8-d45d64208664 ONLINE```
I don't know how serious it is
Bare metal installation?
As reported, the pool ‘data’ is a stripe of two drives (RAID0 equivalent), and you can try to import it with sudo zpool import -f data
But the condition is potentially serious.
IMO you should not yet try to import it with the -f flag - we need to be a little more thoughtful about recovery before doing something that might make the data corruption worse.
The pool data appears to me to be a stripe and NOT a mirror or RAIDZ - and if that is the case you do not have ANY redundancy for your data, and an error on either of the drives can lose you the data on both of them.
The pool data consists of only 2 drives and not 3 as you told us. The UUIDs in the zpool import point us to the WDC WD5000AAKX 500GB drive and the Hitachi HDS721050CLA362 500GB drive. The TOSHIBA MQ01ABD100 1TB drive does not appear to be in use.
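To spell out the matching between the zpool import output and the lsblk listing above, as I read it:

```
b25df2bc-ce76-11ee-9fb8-d45d64208664 -> sdb2 (WDC WD5000AAKX, 500GB)          ONLINE
b24ce596-ce76-11ee-9fb8-d45d64208664 -> sdd2 (Hitachi HDS721050CLA362, 500GB) ONLINE
b2639d11-ce76-11ee-9fb8-d45d64208664 -> sda2 (TOSHIBA MQ01ABD100, 1TB)        not listed in the pool
```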
Can you please run the cli -c "system version" command that I added earlier and report the results?
Can you also please run sudo zpool import -F -n data, which tells us what would happen if we ran a recovery-mode import without actually doing it. NOTE: this uses a capital -F, whereas @etorix’s suggestion used a lowercase -f, and these attempt very different types of import.
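To make the difference concrete (and please do NOT run the forced one yet):

```
# Dry-run recovery import: with -n, ZFS only reports whether discarding the
# last few transactions would let the pool import - nothing is written.
sudo zpool import -F -n data

# For contrast, the earlier suggestion was a plain forced import, which skips
# the "may be active on another system" check but attempts no rewind:
# sudo zpool import -f data
```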
Do you have any idea how much data you had on this pool? If we mount the pool read-only, do you have anywhere to copy your data to?
My thoughts:
IMO you have a non-redundant striped 1TB pool. We should probably try to get you to a situation where you have a redundant pool instead as you have enough drives to achieve a 1TB redundant pool.
Depending on what we can do to get this pool mounted, there are various routes to a redundant pool, but much depends on whether you have disk space elsewhere that you can copy the data to. Worst case, we seem to have a spare 1TB drive we can use temporarily as a staging area.
IMO we should first try to mount it read-only and take a backup copy of the data. Then we can try to do a recovery mode import of the pool which will try to roll back the ZFS transactions to an earlier point in time where the pool wasn’t corrupt.
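Roughly what I have in mind for the read-only copy - treat the mount point and destination as examples only:

```
# Import the pool read-only under an alternate root so nothing on it gets
# modified (/mnt/recovery is just an example path):
sudo zpool import -o readonly=on -R /mnt/recovery data

# Then copy everything off to whatever spare space you have, e.g. a disk
# mounted at /mnt/external (again, an example path):
sudo rsync -avx /mnt/recovery/ /mnt/external/data-backup/
```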
TrueNAS is intended to make pools from similarly sized drives, not drives of different sizes. Assuming we move the data off somewhere else, we seem to have a choice of using the 2x 500GB drives and the 1x 1TB drive either to create a standard 3x 500GB RAIDZ1, or to create 2x 500GB mirrors using the 1TB drive split in two as a mirror partner for each of the 500GB drives (which is not really supported) - either way we end up with 1TB of usable space. (If we can’t move the data off elsewhere, we can use the 1TB drive as a temporary staging post, create a degraded RAIDZ1 on the 2x 500GB drives, and once we have moved the data back, resilver the 1TB drive in as the 3rd member to bring the pool back to being non-degraded and redundant.)
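On the degraded-RAIDZ1 idea: the usual trick (only once the data is safely copied elsewhere, because this wipes whatever is on the disks it is given) is to use a sparse file as a stand-in third member. A rough sketch with placeholder device names - I would want to double-check every step before you actually ran any of it:

```
# Sparse file sized like a 500GB member; it takes no real space until written.
truncate -s 480G /root/placeholder.img

# Create a 3-wide RAIDZ1 from the two 500GB disks plus the placeholder file
# (zpool may insist on -f because the vdev mixes a file with real disks):
sudo zpool create newdata raidz1 /dev/sdX /dev/sdY /root/placeholder.img

# Take the placeholder offline; the pool is now DEGRADED but usable:
sudo zpool offline newdata /root/placeholder.img

# ...copy the data back in, then swap the real 1TB drive in for the placeholder
# and let it resilver:
# sudo zpool replace newdata /root/placeholder.img /dev/sdZ
```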
P.S. As far as I can tell, all drives are CMR and not SMR which is a bonus.
I don’t understand - I remember creating the pool and selecting the raid5 option in the “layout” dropdown. I don’t understand anything, I’m very sorry, but I’m sure it used the 3 disks; it showed 1.69TB in total.
I connected a 1TB SSD that I had at home because I read that someone recovered their data by connecting an extra disk, but that doesn’t seem to be my case.
What you say about mounting the pool read-only and copying the data to other disks seems like the best option right now. I can make space on my computer’s disks and copy everything important there.
Ok. If you think you used 2x 500GB and 1x 1TB to make a pool which was > 1TB in size, then it has to have been a stripe. Indeed 1.69TiB (TiB = 2^40 bytes) is approximately 2TB (TB = 10^12 bytes), so this would be a stripe across all three drives.
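Rough arithmetic, ignoring partitioning and overhead:

```
500 GB + 500 GB + 1000 GB = 2000 GB = 2.0 x 10^12 bytes (raw stripe capacity)
2.0 x 10^12 bytes / 2^40 bytes per TiB ≈ 1.82 TiB
```

ZFS will report somewhat less than that once the swap partitions and its own overhead are taken off, which is in the same ballpark as the 1.69TB figure you remember.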
(Note: You could not have selected RAID5 in a dropdown because that is not an option in TrueNAS.)
So now we have a much more serious problem, i.e. the pool is broken because one of the striped drives is somehow no longer part of the pool.
Just adding the Portable SSD would NOT have caused this - regardless of what you read, adding a new 1TB drive to the system would do nothing by itself - you would have to tell ZFS to replace one drive with another using the UI or a shell command.
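For what it’s worth, a membership change always needs an explicit command (or the equivalent UI action) - shown here only as an illustration, do NOT run it:

```
# Hypothetical example only: replacing a pool member has to be requested
# explicitly; it never happens just because a new disk is plugged in.
# sudo zpool replace data <old-device> <new-device>
```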
Unfortunately I have absolutely no idea how a drive might spontaneously become detached from a pool, nor how to reattach it once it has happened. Whilst we can still try to mount the pool with increasingly desperate and unlikely commands, I fear that the pool is irretrievably lost and (unless you pay a data recovery company a huge amount of money) your data is gone forever.
At this point I have reached the limit of my ZFS knowledge - if there is anyone more expert who might know how to recover the detached drive and recover pool to the point that you can even read the data please chip in now.
If you decide that your data has indeed gone, and you want to create a new pool from the existing disks, then do come back for more advice on how to achieve this (and, more importantly, how to ensure that your data is safeguarded in the future).
We’ll need @HoneyBadger here… If it were a stripe of 3 drives, ZFS would report a missing drive.
Please provide the output of
sudo zdb -l /dev/sda1
sudo zdb -l /dev/sdb1
sudo zdb -l /dev/sdd1
so we can see what’s in there.
It does indeed look like the data is still there, and a very expensive data recovery firm can probably retrieve it for you, but if you want to avoid that and get the data yourself then, as far as my own knowledge goes, there is no way to reattach the disk to the pool and then import it, nor to import a pool which now only has 2 of the 3 drives attached.
I am doing a little research though, and here are a few more commands to try (but I have no idea whether I have these right or not):
sudo zpool history data | tail -n 50
sudo zdb -e data
sudo zdb -eC data
sudo zdb -eh -AAA data | tail -n 50
As far as I can tell, these are readonly and will not make things worse.
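My reading of what each of these does, for what it’s worth (please correct me if I have any of them wrong):

```
# zpool history data   - prints the pool's command history; I suspect this needs
#                        the pool to be imported, so it may simply error out here
# zdb -e data          - examine a pool that is exported / not currently imported
# zdb -eC data         - also dump the pool configuration from the on-disk labels
# zdb -eh -AAA data    - dump the pool history from the labels, with -AAA telling
#                        zdb to press on past assertion failures on a damaged pool
```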
Ok - so this tells us that there is a disk missing, which we already knew. But it suggests to me that ZFS actually knows there is a drive missing and that this is not just a 2-drive pool, so if we can somehow recreate the data for the missing hard drive here (no idea how), then the pool might come back to life.
P.S. It would sure be helpful if the zpool import gave us something like this when ZFS knows that a device is missing:
data FAULTED corrupted data
b2639d11-ce76-11ee-9fb8-d45d64208664 MISSING
b25df2bc-ce76-11ee-9fb8-d45d64208664 ONLINE
b24ce596-ce76-11ee-9fb8-d45d64208664 ONLINE
Do I have to change the /dev/disk/by-id/ to, for example, /dev/disk/by-partuuid/b25df2bc-ce76-11ee-9fb8-d45d64208664? Sorry for the silly question, but I don't want to execute something and break everything.
You are right to be cautious - let’s try the following first (which should only tell us what can be imported and shouldn’t try to do an actual import):
sudo zpool import -d /dev/disk/by-id/
sudo zpool import -d /dev/disk/by-partuuid/
Let’s see what both of these say and then we can decide which we can use to try an actual import.
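As I understand it, with -d and no pool name these only scan and list - nothing gets imported:

```
# Scan the given directory for ZFS labels and list any importable pools found;
# without a pool name, nothing is actually imported.
sudo zpool import -d /dev/disk/by-partuuid/

# An actual import attempt would add the pool name (and at that point we would
# probably also want it read-only), for example:
# sudo zpool import -d /dev/disk/by-partuuid/ -o readonly=on data
```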