ZFS cache idea - would it even be possible?

I’ve just had an idea.
It’s probably not a very good one but I still wanted to share it and ask if it would even be possible or remotely useful.
I know it most likely won’t be implemented by anyone.

In my NAS I have a RAIDZ1 consisting of 4 HDDs and an SSD mirror.

The SSD mirror is less than 50% full and mostly used for app configs, VM disks and similar stuff.

The HDD RAIDZ1 has no SSD caches of any kind.
Now my question is: would it be technically possible to use the free part of the SSD mirror as a dynamically sized cache (probably only L2ARC, maybe also SLOG depending on the use case) for the HDD mirror?
My idea would basically be a hidden zvol with a dynamic size that gets added to the HDD pool as L2ARC; it could grow when space on the SSD pool becomes available and shrink when that space is needed for regular data.

The space would still need to be viewed as “freeish” by ZFS for writes to continue (and still be fast) of course.
With this you could have two independent pools and still get a nice performance boost for the slower pool from the faster one.

I know - it would be pretty complicated, it would mix pools, and it probably wouldn’t be very interesting to enterprise customers, who would maybe rather spec their machines with a few more SSDs as cache, but I still kind of liked the idea.

Could anyone here tell me if it’s even remotely plausible? :sweat_smile: Maybe @HoneyBadger or @Arwen? :sweat_smile:

With ZFS as it currently exists? No.

It would be technically possible, though strongly discouraged, to partition the SSDs, use one partition of each for your data (whatever you’re currently using it for), and the other for cache. But to have that cache dynamically sized isn’t possible AFAIK.
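For illustration only, the (discouraged) partitioning variant could look roughly like this. All device names, pool names, and sizes here are made-up placeholders, not taken from the thread, and repartitioning destroys existing data:

```shell
# WARNING: illustrative sketch only. Repartitioning wipes the SSDs.
# Assumes /dev/sda and /dev/sdb are the SSDs and "tank" is the HDD pool.

# Carve each SSD into a data partition and a cache partition (example sizes):
sgdisk -n1:0:+400G -t1:BF01 /dev/sda   # partition 1: SSD-pool data member
sgdisk -n2:0:0     -t2:BF01 /dev/sda   # partition 2: L2ARC for the HDD pool
sgdisk -n1:0:+400G -t1:BF01 /dev/sdb
sgdisk -n2:0:0     -t2:BF01 /dev/sdb

# Add both cache partitions to the HDD pool. Note that ZFS distributes
# entries across multiple cache devices; it does not mirror them:
zpool add tank cache /dev/sda2 /dev/sdb2

# Removing cache devices later is non-destructive:
zpool remove tank /dev/sda2 /dev/sdb2
```

The partition sizes are fixed at creation time, which is exactly why the dynamically sized part of the idea doesn’t work this way.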

Your wording was a bit confusing, as you said 4 HDD in RAID-Z1 then later say HDD Mirror. I will assume you meant HDD RAID-Z1.

Layering a zVol on top of an SSD Mirror as a RAID-Z1’s L2ARC / Cache device is an odd way to do things. L2ARC / Cache devices are not normally capable of being Mirrored or RAIDed, just partially striped if you have more than one. By partially striped I mean that a single entry in the L2ARC / Cache exists on only one device, but all devices are used, similar to round robin.

Using a zVol to Mirror a L2ARC / Cache simply adds 2 layers of overhead that were not in the original design of the L2ARC / Cache vDev.

Lots of people want better performance and think either SLOG or L2ARC / Cache devices are the way to go. Unfortunately, ZFS was designed more with data integrity in mind, than absolute performance.


Now what ZFS really needs is an async Write Back Cache. I wrote up a long and complicated, but reasonably complete, idea for one. However, I am in no position to implement it, nor to encourage others to implement it.

To sum up:

  • Assume a RAID-Zx and/or HDD pool
  • Have an SSD or NVMe special allocation vDev available
  • Assign both a Quota and Reservation for the Write Back Cache on the special vDev
  • When writes pile up too fast, allow them to be temporarily assigned to the Write Back Cache, for later flushing.

Now obviously this does not solve all problems. But when someone is asynchronously writing a working set that fits within the Write Back Cache size, yet is larger than memory, this is a win.
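The Quota / Reservation part of the summary above maps onto dataset properties that already exist in ZFS; only the write-back staging behavior is hypothetical. A sketch of those existing mechanics, with a made-up dataset name:

```shell
# "tank/wbcache" is a hypothetical dataset placed on the special vDev.
# Nothing like an async Write Back Cache exists in ZFS today; this only
# shows the quota/reservation accounting such a feature could reuse.
zfs create tank/wbcache
zfs set reservation=64G tank/wbcache   # guarantee the space is held back
zfs set quota=64G tank/wbcache         # and cap it at the same size
```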

I know. I rather meant this as a highly theoretical future ZFS feature.

You assumed correctly, I just mixed things up while writing.

Of course, I didn’t think about the overhead this would come with.

That’s a really interesting idea :D


If the cache hit rate of the ARC is above 99%, which is the case for many installations, an L2ARC does not improve anything with respect to performance. It can even hurt, because the L2ARC needs memory for its own index, leaving less for the (L1) ARC.
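A rough back-of-the-envelope for that index overhead. The per-record header size is an approximation (it varies by OpenZFS version; tens of bytes per record is the usual ballpark), and the sizes are made-up examples:

```shell
# Estimate the ARC memory consumed by L2ARC headers.
# Assumption: ~80 bytes of ARC per L2ARC record (approximate figure).
L2ARC_BYTES=$((512 * 1024 * 1024 * 1024))   # 512 GiB of L2ARC
RECORDSIZE=$((128 * 1024))                   # 128 KiB records
HDR_BYTES=80

RECORDS=$((L2ARC_BYTES / RECORDSIZE))
RAM_BYTES=$((RECORDS * HDR_BYTES))
echo "$((RAM_BYTES / 1024 / 1024)) MiB of ARC spent indexing the L2ARC"
# prints: 320 MiB of ARC spent indexing the L2ARC
```

With small recordsizes the header count, and therefore the RAM cost, grows proportionally, which is why a huge L2ARC can backfire on a RAM-starved box.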

And even if it works, L2ARC is not (!) hierarchical storage management. ZFS will not place “hot” files on the SSD and “cold” ones on the HDD vdevs.

Anything resembling HSM does not exist in ZFS. The thing that comes closest (IMHO) is a metadata and small blocks special device or a metadata only L2ARC.
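For reference, both of those closest-thing options look like this; pool and device names are placeholders:

```shell
# A special allocation vDev keeps metadata (and optionally small blocks)
# on SSD. It is pool-critical: losing it loses the pool, so mirror it.
zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1
zfs set special_small_blocks=16K tank   # records <=16K also land on SSD

# Alternatively, restrict an existing L2ARC to caching metadata only:
zfs set secondarycache=metadata tank
```

Neither moves whole “hot” files to SSD; they only steer metadata and small records there.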
