Understanding the SLOG function with needed recommendations

etorix · April 16, 2024, 11:34am

Mostly, yes.
Sync + SLOG is still closer to sync than to async in terms of performance.
ZIL is built-in: It is a stripe of little areas in each drive of the pool. A SLOG is a dedicated device serving ZIL purpose; it is optional—and only makes sense if the SLOG is faster than the pool. If your pool is made of data centre-grade SSD with PLP, even Optane may not make sense as a SLOG over the built-in ZIL.
Mirrorred SLOG depends how paranoid you are, really. You correctly understand that a SLOG is only ever read in the event of an unclean shutdown, so the scenario for data loss is an unclean shutdown or crash AND the SLOG not coming back up on reboot.
You need a SLOG per pool (not per vdev), but if you do have two pools, and want a mirrorred SLOG for each, it would be possible (though not supported by the GUI) to partition two Optane drives and stripe a partition from each to make two mirrored SLOGs. Optane has enough IOPS to tolerate double duty.
I suppose the “industry standard” would be Optane DC P4800X/4801X/5800X over the consumer variant 900p/905p. And a Radian RMS should be faster than Optane, by virtue of being genuine RAM, backed by battery.
Pool capacity is irrelevant. ZFS will cache at most two “transaction groups”; default txg is 5 s, so that is 10 seconds worth of transactions. 10 s * 10 Gb/s = 100 Gbits = 12.5 GBytes (but more like around 10 GB because not every bit will make a data byte)
With a fast SSD pool behind, a 8 GB RMS-200 is possibly enough but you’d want a 16 GB RMS-300 to be sure that SLOG capacity is never going to be a throttle.

Some more reading: