I saw statements like “if the ARC hit ratio is above 90%, there is no need for L2ARC” in multiple sources (1, 2).
My average ratio is very close to the 90–100% range. However, looking at the reporting page, it can drop to something like 50% under load (backups in my case, but I’m asking about the general case).
My questions are:
1. Does this rule of thumb apply to the overall hit ratio, or to the hit ratio during a particular load?
2. Could the performance during those high-load periods be improved with an L2ARC?
3. Does the hit ratio during periods of no or low activity even mean that much, after all? It makes sense in an enterprise environment with a nearly 24/7 load, but for home use it hardly seems relevant.
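For anyone who wants to put a number on the during-load ratio rather than the lifetime average, here is a minimal sketch (assuming OpenZFS on Linux, e.g. TrueNAS SCALE, where the raw counters live in /proc/spl/kstat/zfs/arcstats; on CORE/FreeBSD the same counters are exposed as kstat.zfs.misc.arcstats sysctls):

```python
# Minimal sketch: measure the ARC hit ratio over a fixed window (e.g. while a
# backup is running) instead of the cumulative boot-to-now average.
# Assumes OpenZFS on Linux; counters are in /proc/spl/kstat/zfs/arcstats.
import time

ARCSTATS = "/proc/spl/kstat/zfs/arcstats"

def read_counters():
    """Return the cumulative 'hits' and 'misses' counters."""
    stats = {}
    with open(ARCSTATS) as f:
        for line in f.readlines()[2:]:      # skip the two kstat header lines
            parts = line.split()            # format: name  type  value
            if len(parts) == 3:
                stats[parts[0]] = int(parts[2])
    return stats["hits"], stats["misses"]

def hit_ratio_over(seconds=60):
    """Hit ratio (%) for the next `seconds` of ARC activity, or None if there was none."""
    h0, m0 = read_counters()
    time.sleep(seconds)
    h1, m1 = read_counters()
    hits, misses = h1 - h0, m1 - m0
    total = hits + misses
    return None if total == 0 else 100.0 * hits / total

if __name__ == "__main__":
    ratio = hit_ratio_over(60)
    print("no ARC activity in that window" if ratio is None else f"hit ratio: {ratio:.1f}%")
```

Running it once while idle and once during a backup gives you the two numbers the questions above are really about.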
I’d argue the hit ratio under load has more impact on your experience than the overall average for the general use case. So I’d focus on ARC stats that are representative of your user experience, i.e. what do you actually care about? For example, I care about how long it takes to back up to my DAS, so an L2ARC or sVDEV made a huge difference for me when using rsync. My other users don’t notice the difference.
Depends really on what your use case is. For example, your NAS already has an sVDEV, so the benefit of an L2ARC should be close to zero unless said L2ARC is configured to ignore metadata AND you have a bunch of read-only files that are not stored in the sVDEV. That use case may exist (Plex catalog?), but I have no experience with it.
Really depends on what your system is trying to do, no? If the system is idle but a task takes a long time, who cares? At the same time, you may want certain tasks executed during idle times and finished by the time the high-use hours start, so even idle time may be bounded by other usage considerations.
I gave it some more thought. Question 3 is not really relevant: the average ratio is calculated from all hits and misses, and there are not many of either when the system is idle. So the average hit ratio is a useful characteristic despite the lower ratio during loads.
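To put some made-up numbers on that: an hour of backup traffic with 400,000 hits and 400,000 misses (50%) plus a whole idle day with 10,000 hits and 100 misses (99%) still averages out to 410,000 / 810,100 ≈ 51%, because the average is weighted by request counts rather than by wall-clock time, so the idle periods barely move it.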
The devil is in the details and I always suggest testing under the use case that matters to the end user.
For example, my NAS’ ARC stats looked perfectly fine, but adding a 512GB metadata-only, persistent L2ARC to a 32GB-RAM C2xxx system sped up rsync operations by 12x once I had run three rsync operations in a row and gotten the cache “hot”.
The key is to figure out how to test the system under “use case conditions”. That’s not always easy, but I would try to test objectively rather than rely on stats that may not reflect the impact on the workflows that matter to the end user.
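As an illustration of that kind of objective test, a rough sketch that just times the same rsync job a few runs in a row to see whether a warm cache changes anything (the paths and flags are placeholders; --dry-run keeps it read-only while still walking all the metadata):

```python
# Rough benchmark sketch: time the same rsync job several runs in a row to see
# whether a warm ARC/L2ARC actually changes the workflow you care about.
# The paths and flags below are placeholders -- substitute your real job.
import subprocess
import time

RUNS = 3
CMD = ["rsync", "-a", "--dry-run", "/mnt/tank/data/", "/mnt/backup/data/"]

for run in range(1, RUNS + 1):
    start = time.monotonic()
    subprocess.run(CMD, check=True, stdout=subprocess.DEVNULL)
    print(f"run {run}: {time.monotonic() - start:.1f}s")
```

If run 2 and run 3 are dramatically faster than run 1, the cache is doing work your stats page may not be showing you.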