Question about how ZFS RAID works

Hello there TrueNAS Community. I’m new to TrueNAS and to how its RAID system works.

I want to test the system alerts for when a drive goes down and the RAID that contains that drive becomes DEGRADED. I noticed a feature that lets me turn off a drive from the web UI. In theory, this should work to test the alert. However, I’m unsure whether, if I bring the drive back online, the RAID will rebuild automatically or whether I will have to configure it again.

I know other RAID systems rebuild as soon as they detect a compatible drive to complete the array again, but I don’t know whether ZFS RAID works the same way or does something different.

I’ll be grateful if someone can clarify this for me. Thank you for reading!

I don’t think there is such a feature. Are you talking about taking it offline, perhaps?

Yes! That one. If I take a drive offline and then bring it back online, will the RAID rebuild by itself, or will I have to configure it manually?

As far as I understand it, you need to tell TrueNAS that you are replacing the disk, and if there is already data on the new disk, tell it to discard that data. See Replacing Disks.
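On TrueNAS the replacement is normally done through the web UI, but as a rough sketch of what happens underneath (the pool name `tank` and the device names here are placeholders, not your actual setup):

```shell
# Tell ZFS to replace the failed member with the new disk:
zpool replace tank /dev/sdf /dev/sdg

# If the new disk carries leftover labels from a previous pool,
# ZFS refuses unless you force it, which discards that old data:
zpool replace -f tank /dev/sdf /dev/sdg

# Watch the resilver run to completion:
zpool status tank
```

The `-f` force flag is roughly the CLI equivalent of telling the UI to discard whatever is on the new disk.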

A tip: when I was considering moving to TrueNAS, I created a virtual machine with a few virtual disks to play with it without risking any data.
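Alongside a full VM, you can also experiment with ZFS itself using file-backed vdevs on any machine with ZFS installed; nothing below touches real disks (the pool name and file paths are just examples):

```shell
# Create six 1 GiB sparse files to act as fake disks.
for i in 1 2 3 4 5 6; do truncate -s 1G /tmp/zfs-test-disk$i.img; done

# Build a 6-wide RAIDZ2 pool on top of them.
sudo zpool create testpool raidz2 /tmp/zfs-test-disk{1..6}.img

# Simulate a failure and recovery:
sudo zpool offline testpool /tmp/zfs-test-disk6.img
sudo zpool status testpool   # pool reports DEGRADED
sudo zpool online testpool /tmp/zfs-test-disk6.img
sudo zpool status testpool   # resilver runs, pool returns to ONLINE

# Clean up when done.
sudo zpool destroy testpool
rm /tmp/zfs-test-disk*.img
```

This lets you trigger the DEGRADED state and watch the automatic resilver without risking a byte of real data.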


Not when you’re putting a disk back online that had been part of the pool; in that case it will resilver automatically.

Thanks for your answer dan! Just to be cautious, I’ll follow the advice of numo68 and test this on a TrueNAS VM first.

Hardware RAID and ZFS RAIDZ work a bit differently.

With hardware RAID, no matter what, as soon as a drive is removed and inserted back into the array, the entire disk is rewritten, even if only a few MB of data are present in the array.

By comparison, ZFS only cares about how much data is actually in use on a disk. So if you have a few MB of data, only the few MB needed to restore redundancy are written to the drive.

When ZFS sees that a disk was removed and added back, it first checks whether the disk is the missing one, and if so it should start resilvering the pool. I think it may compare the metadata to figure out whether a resilver is needed at all, or it may start resilvering but only recover the missing bits. Either way, it won’t touch valid blocks.

ZFS will tell you how much data was resilvered (actively rewritten) in the process. It could be just a few blocks.
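You can see that figure in the `scan:` line of `zpool status` (the pool name is a placeholder, and the exact wording varies between ZFS versions):

```shell
zpool status tank
# The "scan:" line reports roughly how much was resilvered,
# something along the lines of:
#   resilvered 12.3M in <time> with 0 errors
```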

If you decided to replace the drive and chose to wipe the data on the new drive, resilvering will take care of recreating the redundancy on that drive, based on the amount of data in use.

Hi Apollo, thanks for your reply.

In my case, the RAIDs are made of 6 drives. If I’m understanding your explanation correctly, this means that if I take, for example, the ‘sdf’ drive offline, the RAID that uses that drive will become DEGRADED, and when I bring the exact same drive back online, the RAID will resilver by itself; and because the RAID has 4 storage drives and 2 spares, I shouldn’t lose data, right?

Please confirm or correct me if I’m wrong.


You should be able to evaluate the different scenarios based on the different …

If you offline “sdf”, then “the RAID that uses that drive will get DEGRADED” would be better worded as: the pool with the missing “sdf” drive will become degraded.

If your pool is RAIDZ2 and 6 drives wide, then you get 4 data disks with 2 disks’ worth of redundancy (parity, not spares). But that is not the issue I described in your case. With RAIDZ2 you will be fine; the issue I described relates to bringing in a replacement drive for a faulted one.

So read my post again, and search and learn from the internet. You’ll need to corroborate my findings against the noise.

There are a lot of subtleties and nuances in ZFS terminology that you need to be aware of.