Special Metadata VDEV

Hello Guys,

So, I’m building a new NAS with 16x 16TB Seagate Exos drives in RAID-Z2 (two VDEVs of 8 disks each). The disks will be connected via an LSI 9400-16i flashed to IT mode.

This machine can fit some SSDs, and I wanted to check with you guys whether it would be worth adding a metadata VDEV for faster directory traversal and search. I’ve been considering it because the pool is all HDD based. The SSDs would probably be Intel 900P/905P U.2 drives, but I’m not sure what capacity I need for my setup.

I’m aware that if the metadata VDEV dies or is lost, the whole pool is lost along with the data, so I plan to use 4 drives in mirrors (2 mirrors of 2 drives each).

I’m not sure what drive would best fit. Any suggestions on this are welcome!

Thanks

What are you storing?
A safer bet might be L2ARC (metadata only), which is not pool critical.

This will basically be an archive for my raw video footage, which I record in FHD, and for the final rendered output. It also holds libraries and financial information, which get stored on a weekly/monthly basis and accessed only when needed. Sometimes I have to write a few terabytes of data to this NAS. The previous one was 5xRAID-Z1, and I’m building a new one for more capacity, as the previous one has served for roughly 3 years and I need more.

Are L2ARC and L2ARC metadata two different things? Last year, I did some experiments with spare parts I had, and I liked the speedup from adding the metadata drive (an SK Hynix PC801 SSD, if I’m not wrong). It was only a test pool, though. I need to re-test it.

Also, can you tell me whether adding a metadata VDEV is beneficial only for HDD pools, or for SSD/NVMe pools too?

Special VDEV (sVDEV) Planning, Sizing, and Considerations

It definitely is. I added a special vdev to my 8-wide RAIDZ2 and it improved things a lot.

A mirror is perfectly safe.

But I would use only two SSDs from different vendors as the special vdev, or one 3-way mirror.
4 seems like overkill.

Will have a read. Thanks!

I thought so. Did you also see an improvement in transfer speeds, or just in directory traversal?

Yes, it is fine for non-critical builds.

Hmm. I read that everyone recommends using drives of the same capacity and brand. Is the special vdev an exception here?

I can set that up to have extra redundancy and peace of mind :blush:

Transfer speeds for my needs (big sequential reads and writes) were unaffected.
But listing 20k files improved a lot.

I never saw that recommendation anywhere.
The reason for using two different brands is that the chance of both failing or sharing a bug is smaller.
Remember the Samsung TBW firmware bug?
Or the Samsung drives that shut down instead of throttling when overheating?
Or the Patriot drives with broken firmware for the Phison controller that lied about sync writes?

Yeah, that is why, even for RAIDZ, I would recommend getting drives from multiple vendors if that is possible at a decent price, just in case you get a bad batch. You could argue that this makes mirrors even safer than RAIDZ2, but that is a little off topic.

But you don’t have more redundancy. You have more storage.
You get more redundancy by using a 3-way mirror instead of 2-way mirrors.

[My View]
Metadata on SSD can be beneficial: the larger the metadata, the more beneficial it is.
If you have folders with only a few files/folders I wouldn’t bother. If you have folders with 100,000+ files/folders then I would.

I do, as I have one set of folders with 250,000+ subfolders and half a million almost exclusively small files. So I also use the small-files feature of a special metadata vdev to put the small files onto SSD. This is a significant improvement and fundamentally the only reason I use a special metadata vdev.

On my backup server I use an L2ARC in metadata only mode to cache the metadata without being pool critical.

Thought so!

Oh, yeah. Don’t ask about Samsung. I stopped using them a couple of years ago due to overheating and firmware issues.

You have a point here. Will have a read.

4 disks (2 VDEVS, mirrored)

Umm, yes.

So, what drive are you using for the metadata VDEV?

Cool cool. What SSD do you use for the metadata VDEV?

I’ve heard a lot about L2ARC metadata, which is non-critical. How do I set that up, and how does the performance of a real metadata VDEV compare with L2ARC metadata?

It wasn’t my intention to trash Samsung, these were just the incidents I knew of.
Others are also bad. And it is not like there are a lot of them if you don’t count the ones that just use a Phison controller.
My conclusion is that all vendors can be bad, which is why you need different vendors/controllers :grinning:

Yeah. That will get you the capacity of two disks.
But if two disks in the same vdev fail, your pool is gone.
If you use 3 disks in one single 3-way mirror vdev, you get the capacity of one disk (which should be more than enough for metadata, even if you have small 128GB SSDs), but now two drives can fail without you losing the pool.

That is why instead of 4 SSDs, I personally would use 3 SSDs but with 3 way mirror instead of 2 way mirror.
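
To make the comparison concrete, here is a rough sketch of the arithmetic in Python. The 128 GB drive size is just the example figure from above, `layout_summary` is a made-up helper name (not part of any ZFS tool), and ZFS overhead is ignored.

```python
def layout_summary(mirror_vdevs: int, drives_per_mirror: int, drive_gb: int) -> dict:
    """Summarize a special vdev built from one or more N-way mirrors.

    Assumes equally sized drives and ignores ZFS overhead.
    """
    return {
        "drives": mirror_vdevs * drives_per_mirror,
        # Each mirror vdev contributes the capacity of a single drive.
        "capacity_gb": mirror_vdevs * drive_gb,
        # This many failures are survivable no matter which drives they hit.
        "guaranteed_failures_survived": drives_per_mirror - 1,
        # Losing every drive of one mirror vdev takes the whole pool with it.
        "min_failures_to_lose_pool": drives_per_mirror,
    }


two_by_two = layout_summary(mirror_vdevs=2, drives_per_mirror=2, drive_gb=128)
three_way = layout_summary(mirror_vdevs=1, drives_per_mirror=3, drive_gb=128)
four_way = layout_summary(mirror_vdevs=1, drives_per_mirror=4, drive_gb=128)

for name, layout in (("2x 2-way", two_by_two), ("1x 3-way", three_way), ("1x 4-way", four_way)):
    print(name, layout)
```

With these inputs, the 2x 2-way layout gives 256 GB but is only guaranteed to survive one failure, while the single 3-way mirror gives 128 GB and survives any two failures with one drive fewer.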

The trashiest, baddest SSD you can find :smile:
No seriously, there are basically no requirements for special vdev.
You don’t write much data, so you can even use trash like QLC and you don’t need to care about TBW.

Nobody will be able to answer that for you, because it depends.
L2ARC can only serve data that has already passed through ARC.
L2ARC also can’t absorb metadata writes.
The pool not having to do a metadata write thanks to special vdev could even help for read performance, because the HDDs are not distracted by that task.

I’ve had some bad experience with Samsung and no longer use them.

Other than Samsung, the one I know is Sabrent. Is Phison not good?

Yea, true that.

Hmm. Can you please explain it more? I’m always confused about what X-way means ;(

What do you mean by 3 way? Sorry to say, but I’m always confused about what an X-way mirror or VDEV means.

Gosh. The bad ones will not even have PLP. Is PLP a requirement for a special VDEV?

Yes, but someone mentioned L2ARC (metadata only), and I’ve heard of it before as well.

A plain (2-way) mirror is two drives holding the same data.
A 3-way mirror is a mirror over 3 drives. All 3 drives have the same data.

Gotcha. So, maybe I’ll go with a 4-way mirror.

That will give you awesome read performance for metadata, the capacity of one single drive, and you can lose 3 disks :+1:

Yes, I hope so. It sounds like the safest option. Even if 3 disks fail in a row, I can replace them quickly and resilver.

Now, a question: if a disk dies in the special VDEV (say it’s an X-way mirror), how do I do the resilver? I’m not too worried about the resilvering itself, because with SSDs it would be way faster than with HDDs, so there is less stress on the remaining drive(s).

Another question: how can I see the details of the special VDEVs, or of the disks in the special VDEVs?

Sure. I still think that more than a 3-way mirror is overkill.

I don’t understand the question. You just resilver.

That is a little bit annoying currently since there are no GUI options.
zpool list -v will at least show you how much storage is used.

Your main pool is raidz2, correct? Why have a higher redundancy on the special VDEV?

In that case you should also set the main pool to raidz3. In any case, please keep in mind that RAID (and the corresponding redundancy features in ZFS) is not a backup; it is there to improve the availability of the pool. So you’d still need a backup (including an offsite backup) to cover other threat vectors (fire, flood, theft, etc.).

There should be very little stress on the remaining SSDs during the resilver: there are no moving parts, and there is no wear either, since they would only serve read requests.

Would having enough RAM (in conjunction with a high enough recordsize, say 1MiB) also potentially be sufficient to hold the metadata? If so, it might be worth starting with that and then adding the L2ARC (metadata only) drive later, if it is still needed.
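
As a rough illustration of why recordsize matters here, the sketch below counts only the lowest-level block pointers (128 bytes each in ZFS) needed to address a given amount of data. The 100 TiB figure is a made-up example, and the helper names are hypothetical. It ignores compression, redundant metadata copies, dnodes, directories and higher-level indirect blocks, and it assumes large files where every record is full, so treat it as an order-of-magnitude estimate rather than a sizing rule.

```python
KIB = 2**10
MIB = 2**20
GIB = 2**30
TIB = 2**40

BLKPTR_BYTES = 128  # size of one ZFS block pointer


def data_blocks(data_bytes: int, recordsize_bytes: int) -> int:
    """Number of records needed to hold the data (integer ceiling division)."""
    return -(-data_bytes // recordsize_bytes)


def pointer_bytes(data_bytes: int, recordsize_bytes: int) -> int:
    """Bytes of lowest-level block pointers, one pointer per record, uncompressed."""
    return data_blocks(data_bytes, recordsize_bytes) * BLKPTR_BYTES


data = 100 * TIB  # hypothetical amount of stored video

for label, recordsize in (("128K", 128 * KIB), ("1M", MIB)):
    size_gib = pointer_bytes(data, recordsize) / GIB
    print(f"recordsize={label}: ~{size_gib:.1f} GiB of block pointers")
```

Under these assumptions the pointers come to about 100 GiB at a 128K recordsize and about 12.5 GiB at 1M, which is why a large recordsize makes it much more plausible for the metadata to fit in RAM.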

Yes, me too a bit, but I’m also thinking more about pool safety.

What I meant to say is: if a disk in the metadata VDEV dies, how do I replace it and resilver? Just like a regular disk in a data VDEV?

This is so sad, really. How about disk health, temps and all that stuff?