Pool degraded with very little usage

Please, be careful with that! It is in a RAIDZ1 config!
Doing this might ruin all his data!

My first suggestion is to get physical access to the HDD while it is powered on and showing as missing. Put your hand on it and check whether the disk is spinning at all.
If not, check the SATA power cable too.

OK, I’m ready with a brand-new SATA cable.
Waiting for the NAS to finish shutting down.
Yeah, the disk was spinning; the vibration was regular and the temperature too, like the other disks.
I’ll be back after the cable replacement and power-on. Hope to see disk 3 alive again.

Nothing… disk 3 is still missing from the pool after the cable replacement, but it spins up and rotates like the others, with no unusual sound for now.

Then I would replace that disk,
even if it is a totally OK one.
2TB HDDs are so cheap to buy.
Also, can you detail what is between the MoBo and the HDDs?
Do you use the motherboard SATA channels or an HBA?

:scream: noo… uff… here it costs like 80€… not exactly cheap, but it’s OK…
The motherboard has 6 SATA ports on board, so there are 6 direct SATA cables going to the respective HDDs. The first is the little one with the TrueNAS CORE OS.

Sorry, I do not understand…
Now ada5 is missing and ada3 is online?

It’s the same disk - you can see the same GPTID, which is unique. ZFS uses GPTID to track the physical partitions that are part of the storage pool.

So, the conclusion is that this drive failed. Therefore:

  1. Replace it with a new one.
  2. Resilver
  3. Perform long S.M.A.R.T. test, and conveyance if possible.
  4. Perform S.M.A.R.T. tests and scrub on regular time intervals (e.g. monthly).

The short device name (ada5, sda, and so on) can change between boots; it comes down to tiny differences in device timings and how they play with device enumeration.

Because device names are volatile, it’s always best to double-check that other identifiers, like the GPTID or serial number, are what you expect before you do anything, like removing a drive or even running tests.
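
To illustrate the idea, here is a minimal Python sketch of that double-check. It assumes you have noted down the "device name -> GPTID" mapping on two boots (on CORE, `glabel status` lists it). The helper name `find_renamed` and all the names and IDs below are made up for the example; they are not taken from this pool.

```python
def find_renamed(before, after):
    """Return {gptid: (old_name, new_name)} for partitions whose device name changed."""
    # Invert both snapshots so the stable GPTID becomes the lookup key.
    by_id_before = {gptid: name for name, gptid in before.items()}
    by_id_after = {gptid: name for name, gptid in after.items()}

    renamed = {}
    for gptid, old_name in by_id_before.items():
        new_name = by_id_after.get(gptid)
        # Skip partitions that vanished and partitions that kept their name.
        if new_name is not None and new_name != old_name:
            renamed[gptid] = (old_name, new_name)
    return renamed


# Hypothetical snapshots from two boots (illustrative values only).
boot1 = {"ada3p2": "gptid/aaaa-example", "ada4p2": "gptid/bbbb-example"}
boot2 = {"ada5p2": "gptid/aaaa-example", "ada4p2": "gptid/bbbb-example"}

print(find_renamed(boot1, boot2))
# {'gptid/aaaa-example': ('ada3p2', 'ada5p2')}
```

The point is only that the GPTID stays the key: the same partition shows up under a different short name, so anything you do to a disk should be decided by the ID, not by "ada3" or "ada5".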

Why would TrueNAS get confused at boot reading each drive to know what goes where? Is that a bug?

Well, I recommend you buy an HBA.
I have an IBM 1015. It’s around 50 USD used.
You have to flash it to IT mode and then it will be nice.

No, it is totally normal Linux behavior.

That’s a non-answer. Why would TrueNAS get confused? Why would it be dangerous?

It is totally normal behaviour for GRUB to rely on drive letters and lose its marbles when a drive is added or removed. So Linux heads expect it to be normal behaviour and get confused when they hear that ZFS does not care about drive reshuffles, nor does the BSD bootloader (or the macOS bootloader whatever that is), nor does… anything smarter than GRUB.

Hmm, in my head it’s WTF?!
OK, say Linux doesn’t like the shuffle. Then TrueNAS takes over and handles the rest. TrueNAS will do its ZFS thing, and the only problem would be booting, due to the Linux boot thing.
None of that should break a pool. Or am I wrong?
What would be the procedure?

Side note: In a Proxmox box, I bypassed the HBA and loaded SCALE, which totally destroyed the pool. Hence I loaded CORE and it behaved as expected. Pulling a drive out and putting it back in wasn’t an issue.

Three out of five disks in your pool (the Seagate ones) are SMR (Shingled) - the “disconnecting and resilvering” behavior is typical of an SMR disk under load that is “re-shingling” itself.

Unfortunately the best solution here is also rather expensive - you’d need to replace those Seagate drives with “CMR” or “Conventional” drives, one at a time.

It’s possibly cheaper to build a new pool with 2-3 larger drives.

ZFS might.
If you read the ZFS manual, it says that the usual /dev/sdX naming is not the best method, and it recommends using UUIDs instead to avoid any future issues during boot.
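
For what it’s worth, on a Linux-based system the stable names are just symlinks under `/dev/disk/by-id` that point at whatever `/dev/sdX` node the disk got on this boot. A rough Python sketch to list them (the function name `stable_ids` is my own, and it only reads the directory, it changes nothing):

```python
import os


def stable_ids(by_id_dir="/dev/disk/by-id"):
    """Map each stable ID (symlink name) to the device node it currently points at."""
    mapping = {}
    # On systems without this directory (e.g. FreeBSD-based CORE) return nothing.
    if not os.path.isdir(by_id_dir):
        return mapping
    for name in sorted(os.listdir(by_id_dir)):
        path = os.path.join(by_id_dir, name)
        if os.path.islink(path):
            # realpath follows the symlink to the current /dev/sdX style node.
            mapping[name] = os.path.realpath(path)
    return mapping


if __name__ == "__main__":
    for disk_id, node in stable_ids().items():
        print(f"{disk_id} -> {node}")
```

Run it on two different boots and the left-hand side should stay the same for a given disk, even if the right-hand side moves around.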

…I agree with that.

Then again, say my box broke, just died. Will those drives be understood in a new box/PC?
I remember in CORE it was not a problem. If that is a problem in SCALE… I guess it’d take tweaking the kernel and a custom build just for TrueNAS to address this. Because that is one ugly caveat.

Yes, they will be. You have to import the pool through the web interface
(and NOT from the CLI). If you connect all the HDDs, then under “Storage” the pool should appear as an importable one.