"Best" usage of my 3 types of disks? Also, what's the deal with swap on SCALE?

I have done a lot of reading here (I actually just read a thread with hardware somewhat similar to mine), but I'm still not really sure what to do.

Hardware is an iXSystems TrueNAS Mini X+ system (I have physical space constraints and the small but capable chassis was ideal). I’ve got the following:

  • Atom C3758 (8 cores, 2.2GHz)
  • 128GB ECC DDR4 - 4x32GB (self-upgraded from the stock 2x16GB)
  • Factory 250GB WD Blue NVMe (boot)
  • 5x Seagate Exos X18 18TB [ST18000NM000J] - SATA
  • 2x Crucial MX500 4TB [CT4000MX500SSD1] - SATA
  • 1x Radian Memory 8GB RMS-200 in PCIe slot (gift from a friend)

If you're not familiar with the RMS-200, these are basically 8GB RAM drives backed by capacitor-protected flash, designed for massive write abuse. The RAM is automatically copied to flash in the event of power loss, so the actual flash on the card sees very few writes. It presents to the OS as an NVMe drive. Speed is somewhat limited here because it's an x8 card in an x4 slot, but it's still really quick.

Planned usage:

  1. macOS Time Machine backups from my laptop - it's on Wi-Fi 6E, so I expect it to hit ~1Gbps even though it's wireless most of the time
  2. SMB (or NFS??) file access for the Mac and my wife’s Windows laptop - Apple Lossless music stored here from all of my CDs; personal home folders/document storage; planning on ripping BluRays and DVDs to here as well (hence the relatively large amount of storage for a small system)
  3. Some sort of media/video sharing server (likely Jellyfin)? Due to the lack of transcoding hardware on the TrueNAS box, I may instead just use it as storage space for the media and run Jellyfin on a different system I have… meaning that the media would likely just be an NFS or SMB mount to a different Linux system.
  4. Backups from one or two Proxmox systems running other VMs (e.g. Jellyfin, maybe a PiHole VM, my virtualized OPNsense router, etc)

I might consider running some smaller TrueNAS “Apps”/utility containers on the TrueNAS system (likely something like sonarr/radarr/etc to yarrr some media), but the lack of transcoding hardware means that a “media server” probably needs to run elsewhere and the NAS will mostly just be a NAS.

I was thinking:

  • RAIDZ2 of the 5 spinners - main pool (Z1 seems unwise with 18TB drives, no?)
  • Mirror of the two SATA SSDs - flash pool - “app” / VM storage?
  • Radian Memory as SLOG for the main pool? But I don’t know how much benefit I’ll get out of the scenarios above since I doubt there are a lot of sync-writes…

TL;DR: I’m mostly concerned with whether I should do Z1 or Z2 on the main spinning pool of 18TB drives, and then how to best utilize the pair of 4TB SATA SSDs and what to do with the 8GB Radian card.
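
For concreteness, the layout I had in mind would look roughly like this if built from the shell (device names are placeholders; in practice I'd create the pools through the UI):

    # Placeholder device names, just to illustrate the intended layout:
    zpool create tank raidz2 sda sdb sdc sdd sde   # 5x 18TB Exos, main pool
    zpool create flash mirror sdf sdg              # 2x 4TB MX500, app/VM pool
    zpool add tank log nvme1n1                     # Radian RMS-200 as SLOG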

Additionally, what's the deal with swap on SCALE? I actually created the main pool and flash pool already (there's no data yet, so I can redo them), and all of those disks got 2GB partitions created for swap, yet swap isn't actually turned on. Having swap on the spinners feels kind of pointless, so maybe I should redo the pool to reclaim that 10GB of storage? Carving 16-32GB off each of the SATA SSDs would make more sense as swap… I doubt that I need 128GB of swap to match the 128GB of RAM? Chris Down, who has worked on Linux kernel memory management, suggests that Linux machines should have swap…

I had a fairly similar setup.

You have it right re: allocation of drives, though you could add another SSD and then consider an sVDEV to host metadata and small files. If you set the small-file cutoff on a dataset equal to that dataset's record size, you can force the entire dataset to be held by the sVDEV.

That way, the datasets for your VMs and so on can live in the same pool as your data, and you also get the sVDEV speed boost for small files and metadata. That should all work since you have 8 slots: 5 HDDs, 3 SSDs. See my resource page re: sVDEV planning and implementation.
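
As a rough sketch of that (pool/dataset names are placeholders, and it assumes the special vdev is already part of the pool):

    # Blocks at or below special_small_blocks are written to the sVDEV, so
    # setting the cutoff equal to the dataset's recordsize pushes the whole
    # dataset onto the SSDs:
    zfs set recordsize=64K tank/vms
    zfs set special_small_blocks=64K tank/vms

    # For bulk-data datasets, a lower cutoff keeps only metadata and genuinely
    # small files on the sVDEV:
    zfs set special_small_blocks=16K tank/media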

The big downside of an sVDEV is that if it goes, so does your pool. Hence my suggestion re: a 3-way mirror for your sVDEV. I use a 4-way mirror sVDEV because I run a Z3 pool here.

I'd likely ditch the Radian and keep that slot empty for the future in case you ever want to go SFP+. Time Machine does use sync writes, but the usage is so light that I doubt you'll ever see a benefit unless your VMs are sync-write heavy.

I have 7 slots total and they're all full: 5x 3.5" and 2x 2.5". The Radian is internal. I'd prefer to keep anything that may fail accessible from the hot-swap bays… The other limitation is that they're all SATA, not SAS.

The system has 2x10Gb RJ45 already; I’m in the process of upgrading my LAN to allow for that too.

Gotcha. The mini XL had 8 slots, IIRC, and some of the SSDs were attached using Velcro?

If you're at all interested in the sVDEV route, I'd still consider securing an SSD inside, since they throw off little heat and usually do not fail.

Otherwise, go as planned: two SSDs for apps and VMs, 5 HDDs for the pool.

Do a scrub and see how high the temps go; if they get too high, consider upgrading the fan in the back to something more performant.

I'd still keep the PCIe slot in reserve; you could use it to house NVMe or whatever in the future. If you want to attach the Radian, it won't hurt, and SLOGs can be removed later without issues, but I doubt you will see any benefit unless you start hosting VMs that do a lot of sync writes.
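
For what it's worth, attaching and later detaching a SLOG is a one-liner either way (pool and device names are placeholders):

    # Attach the Radian as a log vdev:
    zpool add tank log nvme1n1

    # If it turns out not to help, remove it again without touching the data:
    zpool remove tank nvme1n1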

what’s the deal with swap on SCALE?

You did not specify which version of SCALE. In the current Dragonfish release, 24.04.1.1, swap has been disabled, though in certain cases the swap partition will still be there at this time. Swap will undergo additional work in a future version.

From the release notes:

Fixes to address issues involving ZFS ARC cache and excessive swap usage leading to performance degradation (NAS-128988, NAS-128788).

With these changes swap is disabled by default, vm.swappiness is set to 1, and Multi-Gen LRU is disabled. Additional related development is expected in the upcoming 24.10 major version of TrueNAS SCALE.
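
If you want to confirm those defaults on your own box, a quick check from a shell looks something like this (assuming a stock Dragonfish install):

    swapon --show                        # no output means no active swap
    sysctl vm.swappiness                 # should report vm.swappiness = 1
    cat /sys/kernel/mm/lru_gen/enabled   # 0x0000 means Multi-Gen LRU is off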

With 128GB of RAM you wouldn't normally need swap. I have been running without swap since the update came out (the update disables it) without issues, and the cache system seems to be behaving just fine on my systems.

The deal with SWAP is this:

  1. The OS swaps when it runs out of memory for the OS and apps.
  2. In a ZFS system, you need a lot more memory than the OS and apps require in order to hold the ZFS cache, and when more memory is needed the ZFS cache is released before the OS thinks about swapping. So swapping hardly ever happens. (I have swap enabled, and it has never gone above 512MB.)
  3. You really don't want parts of the OS or apps swapped out on a performance-critical file server.

So iX has (quite reasonably, IMO) decided that swap should be turned off in Dragonfish and later versions of SCALE, and has a phased implementation for turning it off and then recovering the 2GB per drive of space.

But with a new build, you can set the reserved swap space to zero before creating pools, and on Cobia and earlier you can also run swapoff -a as a post-boot command to turn swapping off completely.
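
As a minimal sketch of that post-boot route (added as a post-init command under System Settings > Advanced, or run by hand):

    # Turn off all active swap devices (Cobia and earlier):
    swapoff -a

    # Verify nothing is swapped afterwards:
    swapon --show
    free -h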

P.S. @ZPrime Your proposed disk usage seems sensible.

You won't need a SLOG for Time Machine (because that is a background task) - it will likely be beneficial for SMB (or NFS) writes from your Mac (but not from Windows).
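
If you want to gauge whether a SLOG would actually matter for a given share, one crude test is to force sync writes on a test dataset and compare throughput (dataset name is a placeholder):

    zfs get sync tank/test          # "standard" honours client sync requests
    zfs set sync=always tank/test   # force every write through the ZIL/SLOG
    # ...run a test copy, compare, then put it back:
    zfs set sync=standard tank/test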

The only change I might think about would be to use the PCIe slot for an older GPU for Jellyfin transcoding rather than using it for SLOG and hosting Jellyfin elsewhere.

P.S. I run Plex without GPU transcoding - and for my own use, occasional batch transcoding on the CPU (which is far less powerful than yours) is perfectly fine. IMO you will really only need a GPU for real-time transcoding that needs more CPU than your Atom has, e.g. transcoding an 8K movie down to something smaller to watch on your phone - and if you plan ahead, you can pre-transcode it anyway.

It’s complicated.

In SCALE Cobia and earlier (this doesn’t apply to CORE), ARC was limited to half of your RAM, due to a variety of issues, some really there, some incorrectly understood to be there. In larger systems, this could mean that dozens or even hundreds of gigabytes of RAM were being wasted.

Dragonfish attempted to bring RAM usage more in line with CORE and removed this limit on ARC. This unexpectedly resulted in parts of the system being paged out to swap unnecessarily. To resolve that, 24.04.1 pretty well eliminates the use of swap. It still, by default, creates the 2 GB swap partition on disks when you create a pool, though that was only really intended as "slop" in case you tried to replace a disk with one slightly smaller, so I don't regard this as an inconsistency. What I do regard as an inconsistency is that it still offers to create a 16 GB swap partition on the boot device.

I could always remove the Radian and get an NVMe → PCIe card… but that could only host a single NVMe I assume. (Unless the slot on the Supermicro board that iXSystems used supports bifurcation to turn it into 2x2 instead of 1x4, which seems unlikely?) I would not feel safe with an sVDEV unless it was a 3 or 4-way mirror, since I know how important metadata is to a pool, and since I assume the amount of writes it sees has the potential to kill SSD flash relatively quickly. I do like the idea of being able to have the SSDs help with performance on the main array though, especially with the decent size of my music collection (~15k+ files across several top-level folders, broken down into further folders by artist and album). Storage-wise it’s not a lot of data, but bunches of relatively small files (1-5MB for the lossy library, and 20-50MB for the lossless) and I know directory traversal performance isn’t always ZFS’s strong suit in this scenario. Is a 4TB sVDEV “sane” for a ~45TB spinning array?

It’d be really cool if ZFS supported tiering, to put things with more frequent access on SSD while putting colder data on spinners.

Sorry! I’m on Dragonfish 24.04.1.1 (latest as of when I started the post).

Nothing personal, but when I see a Linux kernel dev who has specifically worked on the memory management architecture say that systems should have swap… let’s just say that I put a lot of weight on the kernel dev’s opinion. :slight_smile: Even if the swap space is far less than the physical memory, it seems as though modern kernels can benefit from having it available rather than not.

Valid suggestion. Due to the physical size constraints of the system, and the fact that it only has an x4 slot, a GPU may be somewhat unrealistic though. Plus, I got the Radian card for free, I don't have a suitable GPU just lying around, and I've already blown way too much money on this system :stuck_out_tongue: I've already got an Intel i5-8xxx passive system (Protectli) with an iGPU that I think could work for transcode, so I would just need to have that system mount the videos from the TrueNAS machine… The only problem is that the i5 doesn't have 10Gb NICs, although it does have 6x 1Gb that could potentially all be used together, either bonded or individually (I don't know if that would help NFS performance, and I'm not sure if Linux can do SMB multichannel…)

OK, this makes a fair amount of sense to me: a reservation to allow for disk-sizing differences. If that is truly one of the intended reasons behind it, I feel like it would be more logical to use a different partition type so it causes less confusion!

This neatly summarizes my thoughts on swap.

A little bit is useful for reclaiming the memory used by long-running allocations that don't need to stay resident right now.

I think it's a shame that swap has been totally disabled in 24.04.1.1, and AFAICT there is no easy way to re-enable it (swapon doesn't seem to work).

As was discovered, the issue was not swap per se, but rather lru_gen's incompatibility with the ARC changes and iX not dogfooding with swap enabled.

Me too - I prefer swap, and while I agree you don't want a lot of paging on a live system, OTOH we've seen some OOM crashes reported here lately, which we knew were coming. Not many that we know of, but I'd rather have a slower-but-running system than a total crash myself. Either way, there is an issue to be solved. I am pretty sure they did not say they would never enable it again. On a correctly functioning system with a reasonable swappiness setting, there should be zero issues and no harm with swap being enabled, only a benefit. I understand completely turning it off for now and deciding later how best to handle it.

As a long-term Linux user I'd like to see a way to enable swap. For example, I partitioned my boot drive so I could create a mirrored mdadm device to use for swap, giving me a bit of additional headroom. My TrueNAS box is basically SOHO grade and only has 32GB of RAM at present, so 8GB of swap can be useful.
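
Roughly what that looks like, as a sketch (partition names are placeholders for spare partitions on two drives):

    # Build a RAID1 md device from two spare partitions and use it as swap:
    mdadm --create /dev/md127 --level=1 --raid-devices=2 /dev/sda4 /dev/sdb4
    mkswap /dev/md127
    swapon /dev/md127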

I just upgraded to Dragonfish and now I can’t get my swap to re-enable.