ELI5: Can you describe a use case for hot spares?

I recently looked into the great resource post about dRAID from jro.

That made me realize that I am not sure I understand the actual use case for hot spares.

First of all, if you have access to the system, maybe because it is in your home or in your office, there is no actual use case for this, right? I could just get an email alert and insert a new disk by hand.

But let’s say the server is in a colocation facility and they only swap HDDs once a week. One advantage I can see for a hot spare is that the resilver starts immediately.

But then I ask myself

  • For a single RAIDZ2 vdev, would it not be better to just move to RAIDZ3 instead of using the additional disk as a hot spare? Same storage, same number of disks, but more parity?

  • And if you have multiple vdevs, the situation gets even worse, because now you waste multiple hot spares. Would it not be better to leave the drive unassigned and, after I get the warning email, manually replace the affected disk with that cold spare (remotely, in the web GUI)?
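To make the first bullet concrete, here is a small sketch of the disk arithmetic. The 9-disk count and the `raidz_summary` helper are assumptions made up for illustration; the only point is that both layouts end up with the same number of data disks while RAIDZ3 tolerates one more simultaneous failure.

```python
def raidz_summary(total_disks: int, parity: int, hot_spares: int) -> dict:
    """Rough per-vdev arithmetic for a single RAIDZ vdev plus optional hot spares.

    Ignores padding and metadata overhead; it only counts disks.
    """
    vdev_width = total_disks - hot_spares  # disks that actually sit in the vdev
    return {
        "data_disks": vdev_width - parity,
        "simultaneous_failures_tolerated": parity,
    }


# Hypothetical 9-disk chassis: RAIDZ2 plus one hot spare versus plain RAIDZ3.
z2_plus_spare = raidz_summary(total_disks=9, parity=2, hot_spares=1)
z3_no_spare = raidz_summary(total_disks=9, parity=3, hot_spares=0)

print("RAIDZ2 + 1 spare:", z2_plus_spare)
print("RAIDZ3, no spare:", z3_no_spare)
```

What the sketch does not capture is the time dimension: a hot spare only restores redundancy after the resilver finishes, whereas the third parity disk is protecting the data the whole time.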

Now with dRAID it seems like a lot of things change.

Hot spares no longer sit around empty like they used to; they are “virtual hot spares”. The empty space is shuffled across the disks?

So instead of having multiple hot-spares for each vdev, you can have them now for the whole pool?

But even then, would it not be better in many situations, to use these drives as parity instead of hot-spares?

There is one use case I was able to come up with, but I’m pretty sure it is flawed.
I want dRAID over RAIDZ for the faster resilver.
My workload is mostly 16k, that is why I want 4d.
I would create a:
draid2:4d:48c:0s

Now instead of going to get 8 additional drives and go with
draid3:4d:56c:0s

I could only buy 2 drives and go with
draid2:4d:50c:2s

But then again, going for draid3 would be much better according to the calculator.
But if my case only allows for 50 drives, using draid2:4d:50c:2s instead of draid2:4d:48c:0s would be a little bit safer. This is the only use case I can come up with. What is it I don’t understand?
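For anyone following along, the three layouts above can be compared with a rough back-of-the-envelope script. This is only a sketch: `DraidLayout` and `parse_draid` are hypothetical helpers, the parser only accepts the fully spelled-out `draidP:Nd:Nc:Ns` form used in this post, and the capacity formula is the usual approximation (non-spare children times data / (data + parity)) that ignores padding and metadata.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class DraidLayout:
    parity: int
    data: int
    children: int
    spares: int

    @property
    def group_width(self) -> int:
        # One redundancy group holds `data` data disks plus `parity` parity disks.
        return self.data + self.parity

    @property
    def data_disk_equivalents(self) -> float:
        # Approximate usable capacity, expressed in whole-disk units.
        return (self.children - self.spares) * self.data / self.group_width


def parse_draid(spec: str) -> DraidLayout:
    """Parse a fully explicit spec such as 'draid2:4d:48c:0s'."""
    match = re.fullmatch(r"draid([123]):(\d+)d:(\d+)c:(\d+)s", spec)
    if match is None:
        raise ValueError(f"unsupported dRAID spec: {spec!r}")
    parity, data, children, spares = map(int, match.groups())
    if children < data + parity + spares:
        raise ValueError("children must be at least data + parity + spares")
    return DraidLayout(parity, data, children, spares)


for spec in ("draid2:4d:48c:0s", "draid3:4d:56c:0s", "draid2:4d:50c:2s"):
    layout = parse_draid(spec)
    print(spec, "->", layout.data_disk_equivalents, "data-disk equivalents")
```

Under this approximation all three layouts come out to 32 data-disk equivalents, so the comparison is purely about drive count versus protection: 8 extra drives buy a third parity level, 2 extra drives buy two distributed spares.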

Yes.

But hot spares are per pool, so for multiple vdevs, having one or two hot spares catering for >2 raidz2 vdevs may make sense.

dRAID integrates “hot spares” as part of the data-holding drives to speed up resilver, BUT it is even less flexible than raidz#, as expansion now means adding another full dRAID shelf. dRAID only makes sense if you have tens (plural!) of drives and are considering a raidz# layout with spares.

Ahh, I thought it was per vdev and not per pool! Thanks for the clarification!

I run hotspares. As mentioned, definitely per pool.

I also have 10 mirrors striped. When a mirror degrades, there’s a total failure of redundancy, and I want that spare to jump in straight away.

Why not triple mirrors? Budget

Why not raidz2? Performance :wink:

This system will do >10 Gbps

Also, Hot Spares can be sort of shared between pools. You can add the same Hot Spare to multiple, imported pools, (on the same server obviously). The first pool wanting to use a Hot Spare gets it.

There are probably pool export problems when a Hot Spare is in use. But likely not serious ones.

Last, while a large HDD Hot Spare for data vDevs using HDDs makes sense, a Hot Spare can still activate for a Special vDev. For example, an 8 HDD RAID-Z2 with a 3 way Mirror Special vDev using SSDs can certainly benefit from a HDD Hot Spare.

Remember, loss of a Special vDev, (regardless if based on SSD or HDD), means loss of the entire pool. So a slow, huge HDD stepping in for a smaller, faster but failed SSD in a Special vDev also makes sense. Especially if standard redundancy was not followed, (aka RAID-Z2 data vDev(s), but a simple 2 way Mirror for the Special vDev).

Did not know that.

How do you do it? CLI? I would’ve expected the GUI to not allow selecting the disk once it’s a member of a pool.

I’ll just say, it is my opinion that hot spares don’t seem worth it for most home users, because of multiple reasons:

  1. You’re already physically present around/near your server, so if a drive needs to be replaced or swapped, you can do it manually yourself.
  2. You won’t need to use up one (or more) ports, simply for drives that will see very little use in terms of reading/writing data.
  3. Less complexity in terms of pool management, exporting, moving to a different server, etc.

My alternative, instead of “hot” spares:

I have two 8 TiB Red Plus drives that I purchased when they were on sale, on which I’ve run badblocks and SMART selftests. I will occasionally spin them up in a USB bay just to give them “exercise” and run long selftests every few months or so. If I ever need to replace a drive in my pool, I’ll have one ready.

Granted.

In fact, I don’t even use a cold spare on my home raidz2.

If a drive fails, I order one, burn it in, then replace.

(And double check backups are up to date)

You might as well pair them and use as an offline backup :wink:

That’s what my secret agent requirements are for. :wink: Of course, the Jonsbo N1 is no longer available. Just my luck.

EDIT: Just checked, and it’s available again? But the price seems higher than I remember.

Thank you guys! That helped a lot.

Only thing I still don’t get is hot-spares in dRAID.
If the hot-spare is not really a hot-spare anyway but a virtual one, why not just use it as a parity drive to begin with?

You can’t add more than 3 [parity drives].

I almost spit my coffee on my laptop, imagining you serving me this sentence ice cold :joy:

Thanks a lot, I totally overlooked that!

I largely agree with your point, however sometimes I’m on vacation etc.

I already had one drive more than I needed so I use it as a hotspare for my main system at home.

For my remote system (single 8TB mirror) I don’t even have a cold spare, because as long as my main system is operational with a hotspare in place I think I have the time to RMA / reorder a drive and burn it in.

It’s unclear what path I will choose in a few years when my storage needs expand, if both machines use 8TB drives at that point.

However, after all the money spent on equipment and drives, I think I can justify buying one more drive to be absolutely sure I would wake up to a resilvered pool after a drive failure. If I understand you correctly, you would also still advocate for a cold spare. Although electricity isn’t that cheap around here, the 20-25 bucks a year for a hot spare vs a cold spare is something I can live with.

I also run striped mirrors; with RAIDZ2 I’d be content with a cold spare.

I don’t know if the GUI limits Hot Spare usage to 1 pool. Have not needed / bothered looking. But the CLI does not limit usage. Here is what the manual page for zpoolconcepts has to say:

Spares can be shared across multiple pools, and can be added with the zpool add command and removed with the zpool remove command. Once a spare replacement is initiated, a new spare vdev is created within the configuration that will remain there until the original device is replaced. At this point, the hot spare becomes available again, if another device fails.

If a pool has a shared spare that is currently being used, the pool cannot be exported, since other pools may use this shared spare, which may lead to potential data corruption.

The last paragraph points out a problem, (not serious), of using shared Hot Spare(s). By “not serious” I mean that it is trivial to remove the in-use Hot Spare from the other pools not actively using it, before exporting a pool. Yes, it’s manual intervention, but easy enough.
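As a rough illustration of the CLI side, here is a dry-run sketch that only builds and prints the `zpool add` / `zpool remove` commands the manual page mentions; it does not execute anything. The pool names (`tank`, `backup`), the device path `/dev/sdx`, and both helper functions are assumptions made up for the example.

```python
from typing import List


def share_spare_cmds(spare: str, pools: List[str]) -> List[List[str]]:
    """Commands that add the same spare device to every listed pool."""
    return [["zpool", "add", pool, "spare", spare] for pool in pools]


def release_spare_cmds(spare: str, active_pool: str, pools: List[str]) -> List[List[str]]:
    """Commands that drop the spare from every pool except the one using it.

    This mirrors the manual step described above, to be done before
    exporting the pool whose shared spare is in use.
    """
    return [["zpool", "remove", pool, spare] for pool in pools if pool != active_pool]


pools = ["tank", "backup"]  # hypothetical pool names
spare = "/dev/sdx"          # hypothetical device path

for cmd in share_spare_cmds(spare, pools):
    print(" ".join(cmd))
for cmd in release_spare_cmds(spare, active_pool="tank", pools=pools):
    print(" ".join(cmd))
```

Printing rather than running the commands is deliberate: check the device path and pool names against `zpool status` before typing anything like this on a real system.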

One option instead of a Hot Spare for RAID-Zx or Mirror pools, is to have a warm spare. This is a disk that is installed, (spinning if HDD), and available if the need arises. (Plus, passed the various SMART tests and bad block tests.)

The advantage of a “warm spare” is that you can have it as an additional backup pool. Perhaps not for all your data. But, lots of people have critical data that can fit on to a single, large disk.

Of course, this extra single disk pool should not have any shares on it. And it should definitely NOT be considered your only backup. (Because on any pool disk failure, you destroy this pool and use the disk for replacement.)

On the topic of “In a home environment you will be present to fix the system”

Keep in mind this golden rule. Any time you leave something up to human intervention you’re opening yourself up to human error. Just food for thought.

There is absolutely a reason why most enterprise solutions have a plethora of hot spares in a storage solution, regardless of whether they’re in the customer’s office location, a co-location, etc.
