OpenZFS 2.3, UX Engineer Damian Szidiropulosz, and Gaming on TrueNAS? | TrueNAS Tech Talk (T3) E011

On today’s episode of TrueNAS Tech Talk, Kris and Chris welcome their first guest, TrueNAS UX Engineer Damian Szidiropulosz. OpenZFS 2.3 is released, Damian dishes about building on UI and UX design feedback from the Community, and we dig into GPU passthrough for running games directly on your TrueNAS system, even VR titles!


@HoneyBadger, starting at the 3:42 mark, you said that the new “direct IO” feature (i.e., the direct dataset property) will read directly from disk, skipping the “read to memory” portion, and then copy the data into the ARC later, on the side.

From what I understand, isn’t this new feature meant to bypass the ARC? It skips the ARC and instead reads directly from the drive into working non-ARC memory, to be used by whatever user tools, applications, or network transfers need it. (It skips the step of adding the blocks that were just pulled from disk to the ARC.)

In other words, it favors skipping the ARC for new writes (and reads) if the feature is enabled (and whenever it is possible to do so).
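If I’ve got the knob right, that’s the new direct dataset property. A sketch with made-up pool/dataset names (and assuming standard is the 2.3 default, where only I/O explicitly requesting O_DIRECT bypasses the ARC, while always tries to go direct for everything eligible):

```
# Honor explicit O_DIRECT requests on this dataset:
zfs set direct=standard tank/scratch

# A read that explicitly asks for O_DIRECT (GNU dd's iflag=direct):
dd if=/mnt/tank/scratch/bigfile of=/dev/null bs=1M iflag=direct
```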

I believe one of the rationales for this new feature is not just “increased performance” (by eliminating the overhead of writing/pruning entries in the ARC), but also sparing the ARC from unnecessary pressure, since loading data directly from a fast NVMe is almost “just as good” as pulling it from the ARC itself. Thus, there’s no need to copy recently pulled blocks[1] of data into the ARC and add pressure to it; it’s better to reserve the ARC for data that will be read from HDDs.

I might (likely) be wrong.


EDIT: To give a scenario.

Let’s say a user has an NVMe pool with a multimedia dataset. They set this dataset’s property to direct=always.

Their rationale is not strictly “better performance”, since they might not notice much of a difference when reading a 4-GiB video file over SMB. They actually don’t want large chunks of data (even if frequently requested from the NVMe storage) to have anything to do with the ARC.

“Hey, ZFS. Leave my ARC alone, so that it can focus only on requests from the HDD pool. With my NVMe dataset(s), don’t bother determining whether or not these blocks of data should ever be copied into the ARC, no matter how often they are requested. Only my HDD pool will really benefit from that, and I want as little pressure as possible triggered by read/write requests to my NVMe multimedia dataset.”
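In zfs terms, something like this (fast/media is a made-up dataset name; arcstat ships with OpenZFS and prints ARC stats every interval):

```
zfs set direct=always fast/media
zfs get direct fast/media

# In another shell, stream the video over SMB and watch whether the ARC
# grows; with direct=always it should stay roughly flat:
arcstat 1
```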


EDIT 2: It’s still not clear how this affects metadata. :thinking:

Even for NVMe, I would still want metadata loaded and held into the ARC, even if I set direct=always.
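(For what it’s worth, ARC admission is also governed per-dataset by the separate primarycache property, which can restrict a dataset to caching metadata only. A sketch, again with a made-up dataset name:)

```
# primarycache=all | metadata | none controls what the ARC may cache for
# this dataset, independently of the direct property:
zfs get primarycache fast/media
zfs set primarycache=metadata fast/media   # cache metadata only, never file data
```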


  1. Recently pulled blocks of data from a “very fast” device, such as NVMe. ↩︎

Actually, I think I was wrong here. Aligned IO from a dataset with direct in one of the two enabled states (standard or always) will bypass the ARC and won’t end up copied there. Misaligned or undersized reads will still go through the regular buffered/ARC path; it’s not going to EIO or otherwise fail if you try to read 16K of a 128K record, for example. Instead, the whole larger record will be buffered into the ARC and served from there for any subsequent reads.
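A quick way to poke at that fallback (hypothetical paths, GNU dd, and a dataset left at the default recordsize=128K; if it behaves as described above, the first run bypasses the ARC and the second lands whole records in it):

```
# Full-record, aligned O_DIRECT reads: should go direct, skipping the ARC.
dd if=/mnt/tank/direct/bigfile of=/dev/null bs=128K iflag=direct

# Sub-record O_DIRECT reads: no EIO; ZFS quietly falls back to the
# buffered path and caches the whole 128K records in the ARC.
dd if=/mnt/tank/direct/bigfile of=/dev/null bs=16K iflag=direct
```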

Aligned writes that update a buffered record in the ARC will also evict that stale record from the ARC and write directly to disk, so there’s still the potential to gain some wins in one direction if you’re writing aligned but doing partial reads.

> Hey, ZFS. Leave my ARC alone

“All in all, it’s just a-nother noop’d syscall?” :musical_note:

I think metadata being stored separately by the SPA means it would always be a buffered read/write. Worth a quick test if I can get my hands on a bunch of NVMe … now, how to distract the Platform team so I can, uh, “borrow” that lab F-Series?
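Sketch of the test, once the F-Series is, uh, liberated (made-up mountpoint; fio’s --direct flag toggles O_DIRECT per job):

```
# Buffered baseline:
fio --name=buffered --filename=/mnt/fast/bench/file --rw=read --bs=128k \
    --size=8G --direct=0

# Same file with O_DIRECT; run arcstat 1 alongside to confirm the ARC
# stays out of the data path:
fio --name=direct --filename=/mnt/fast/bench/file --rw=read --bs=128k \
    --size=8G --direct=1
```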
