A newbie attempts optimisation

Hi everyone,

This will be my first post, and I was hoping I could get some advice on how to configure my TrueNAS Community Edition (25.04.1) server for optimal performance.

I'm new to the world of TrueNAS and only set up my first homelab six months ago, so that I can self-host everything I need, control my data, and ditch all those subscriptions. I also moved back into a tech job after a few years out (now in health tech), so I wanted to refresh my knowledge and skills - and I enjoy it as a hobby. I've tried to learn as much as possible from resources like the T3 podcast, Lawrence Systems, NASCompares, etc., as well as the support/forum pages.

My MAIN server is an old gaming PC repurposed with the following config:

Asus ROG Strix Z370F gaming motherboard
Intel i7-8700K Coffee Lake 6-Core CPU
64GB DDR4 RAM non-ECC
750W PSU
Nvidia RTX 4000 8GB GDDR6 (for transcodes)
NICGIGA dual-port 10GBase-T PCIe network card (Intel X540 controller, 2 x 10GbE RJ45)

DISKS:
2 x Motherboard NVMe Slots:
Slot 1: 500GB NVMe - Boot Drive
Slot 2: 1TB NVMe SSD (not yet in use)
6 SATA motherboard ports:
SATA 1-4: 4 x WD Ultrastar DC HC550 recertified 18TB drives
SATA 5&6: 2 x WD Red 2TB SATA SSDs
Additional Storage in ASUS HYPER M.2 Card V2, PCIe 3.0 x16:
2 x 1TB NVMe (not yet in use)
Additional SATA via a 2-port PCIe SATA 6Gbps controller card:
1 x WD Ultrastar DC HC550 recertified 18TB drive

POOLS:
Pool 1 (primary storage):

  • 5-wide 18TB WD Ultrastar HDDs in RAIDZ1
  • I think striped mirrors would be faster, but I have been extending a disk at a time and could switch to that layout later if advisable?

Pool 2:

  • 2 x 2TB SATA WD Red SSDs as a mirrored pair
  • I use this pool for installed app configs and anything I want more rapid access to

Free disks:
3 x 1TB NVMe SSDs

So the first thing I’m trying to work out is how best to deploy the free NVMe drives.

MY USE CASE:

  1. Jellyfin Server - primary use, with about 6 friends and family members using it
  2. Immich - as for Jellyfin
  3. Nextcloud + Collabora - only myself plus 2 users - for all my documents, both work and personal
  4. Audiobookshelf - as for Jellyfin
  5. Cloudflared
  6. I was using MinIO as an S3 store but may switch apps due to recent changes on their side
  7. Probably some companion apps for Jellyfin when I get to it
  8. Calibre library
  9. Syncthing
  10. Probably a Windows VM at some point

Having read/watched as much as I can, I'm thinking the best use of my 3 x 1TB NVMe SSDs might be to set up a mirrored pair as a special metadata VDEV, with the third as a hot spare. I could then set the small-block threshold to perhaps 256KB so that it serves all the metadata / small files from my media libraries (Jellyfin/Audiobookshelf et al.) and my small office documents, and speeds up file listings over SMB. I read the Fusion Pool documentation and think that's how it would work? (A rough CLI sketch of what I mean is below.) What do you folks think? Would this be a good approach?
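
In case it helps to show what I mean, my understanding is that the CLI equivalent would be roughly the following - a sketch only, with pool, dataset and device names as placeholders, and I gather the safer route on TrueNAS is to add the vdev through the web UI so the middleware stays aware of it:

  # Add a mirrored special (metadata) vdev plus a pool-wide hot spare
  zpool add tank special mirror /dev/nvme0n1 /dev/nvme1n1
  zpool add tank spare /dev/nvme2n1

  # Per-dataset threshold for which blocks get stored on the special vdev
  # (needs to be weighed against the dataset's recordsize)
  zfs set special_small_blocks=256K tank/media

Please correct me if I've misunderstood how the pieces fit together.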

I don't see much benefit in adding an L2ARC, as I have a good amount of RAM for the demands (I think) - please correct me if you know better. As for a SLOG, I don't see that writes to the server are likely to be a particular bottleneck, so I'm thinking that may be unnecessary too.

But the main point of this post is to accept that I'm a newbie with a lot to learn, so I would love thoughts from those of you with extensive TrueNAS experience.

Summary of my queries:

  1. Does my hardware configuration look sane, or are there any tips/tweaks you would recommend or things I've missed?
  2. Might it be a good idea to switch to 3 x striped mirrors once I have a sixth 18TB drive?
  3. How would you recommend deploying the 3 x 1TB NVMes for best results as special VDEVs?
  4. If going the metadata VDEV route, do you agree with my suggested config, and are there any checks I can run to determine the optimal small-block threshold (in KB) for what the metadata VDEV handles?

Thanks in advance for any pointers and assistance.

Jonathan

P.S. My other TrueNAS is running on a TerraMaster F6-424 Max with TOS removed and TrueNAS installed (Fangtooth 25.04.1). It has mirrored 22TB Exos HDDs in bays 1&2 and a second pool of 4 x 4TB in RAIDZ1. Over time the 4TB drives will be removed in pairs in favour of 22TB mirrored pairs. This device is mainly to back up the main server and to kick in if the main server were to fail.

Finally, I have a UNAS Pro with 6 x 14TB recertified WD Ultrastar drives + a hot spare. I know from the post "Possible to backup to Unifi UNAS Pro?" that I can back up to it via rsync, but I don't know how to set up the SSH keys to enable this. I really am a novice with SSH, so I would really appreciate some simple step-by-step guidance to establish the connection.

Thanks again!

Faster in what way? Small volume sequential read response times, bulk IOPS or bulk throughput?

Yes - it might be, but you don't say which pool you would attach it to (so I am assuming the HDD pool, as that will see the most benefit).

However, I would personally set it as a 3-way mirror and not have a hot-spare.

With the benefit of hindsight this would be better as RAIDZ2, especially since these are used drives (the issue being that resilvering after a single drive failure could stress another drive into failing, and you would then lose your data).

I am not an expert on these. Can someone else comment on whether these are good cards to use?

Before you spend time on a special metadata vdev, why not just run for a while and add one later if it is needed? Use actual data to make that decision, like the results of the arc_summary command (a quick example of running it is below).
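
If you haven't used it before, arc_summary ships with TrueNAS - just run it from a shell on the box (the web UI shell or an SSH session). Piping it to less makes it easier to scroll through:

  arc_summary | less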

A few stats from my machine:

ARC total accesses:                                                 1.7G
        Total hits:                                    99.4 %       1.7G
        Total I/O hits:                               < 0.1 %     330.0k
        Total misses:                                   0.6 %       9.5M

ARC demand data accesses:                               6.3 %     107.1M
        Demand data hits:                              99.1 %     106.2M
        Demand data I/O hits:                         < 0.1 %      17.8k
        Demand data misses:                             0.8 %     903.2k

ARC demand metadata accesses:                          92.9 %       1.6G
        Demand metadata hits:                         100.0 %       1.6G
        Demand metadata I/O hits:                     < 0.1 %      19.8k
        Demand metadata misses:                       < 0.1 %     203.0k

Clearly, I do not need an L2ARC - it would be totally useless. Look at my demand metadata hits too. And note you can tune the ARC to prefer more metadata, etc. So I see no need in my case for a metadata vdev. My stats are low because we had a half-day power outage, so of course the machine shut down; the UPS didn't last long enough.

I'd prefer to just run and see. Your use case is pretty light. And you'll need some fast storage for your potential Windows VM.

So many people keep asking about “cache” drives (as they call them) of various types. I rarely chime in, but I think it best to just run, and if things are not going well later, use actual data to make that decision - not just because the hardware is there. Simpler is easier. And keep in mind, memory is faster than SSD.


Agreed. Especially since OP wouldn’t be able to remove that vdev later.


Hi @Protopia - thanks for your advice… here are my reflections…

I thought faster for read operations: let's say it's 3 x mirrored pairs, then reading a large file can draw from three sources simultaneously, speeding up the read time compared to a RAIDZ1? Also, from research I did a while ago (memory may be failing in old age), I thought it would be faster for IOPS too?

Yes, I did mean for the HDD pool, as this is where the vast majority of data resides, and I have large folders that take ages to list via SMB. I got the 2-way mirror plus hot spare idea for the metadata VDEV from a recent T3 podcast episode - I think @HoneyBadger mentioned it… or @kris. What would the pros/cons of a 3-way mirror vs a 2-way mirror + hot spare be? I think they said that with RAIDZ1 you should have 1-disk redundancy in the metadata vdev, and with RAIDZ2, 2-disk redundancy?

My thinking was (accepting novice status) that RAIDZ2 is meant to be slower, and I have 3-2-1 backups for everything: ZFS replication every night to the other TrueNAS, plus planned hourly rsync updates to the Unifi UNAS Pro for the smaller document dataset changes. So I'm less worried about losing the whole dataset/pool during a resilver, as I should be able to recover from at least two backup sources.

What do you think?

Hi @sfatula - thanks for your thoughts… I'd love to use data as you suggest to guide the decision, but I don't know the arc_summary command (or how to run it), although I could look that up… Interpreting what the output means for my setup is also something I would need to learn. Could you explain a little further or guide me to an appropriate resource?

Anecdotally, I have been using my setup for at least a month in its current configuration and am definitely slowed down when listing folders via SMB, and when I run a tool like Beyond Compare to compare folders it's very slow. I imagine this would be massively helped by a metadata VDEV. But again, I come to ask and learn, so I could be barking up the wrong tree here.

Would you say the SATA SSDs would be suitable for this (fast storage)?

Thanks for your thoughts

Thanks for your comment @dan - when you say to "run for a while", what would I be looking for as the deciding factors on whether to proceed with the metadata VDEV or not?

Thanks again

There is absolutely no point in being able to read data any faster than you want to process it, i.e. stream it.

In addition ZFS already optimises sequential reads of large files by pre-fetching the next few blocks, so when it is requested it is already in ARC and returned immediately without needing to read it from disk.

So using mirrors increases your cost (because you need a LOT more disks for the same useable storage) and doesn’t improve performance one bit.

If time to return the first block is critical, then you need to store the files on SSD or NVMe or Optane.

You can always run a regular cron job to ls -lR /mnt/hdd-pool to pre-read the directory metadata into ARC and keep it there (a minimal example below). But a) this doesn't include the metadata needed to actually read the files, e.g. where the blocks are on disk, and b) the directory metadata is likely to be kept in a decent-sized ARC anyway.
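
For example, a minimal crontab-style entry would be something like the following (using the same example path as above; on TrueNAS you would normally create this as a Cron Job in the web UI rather than editing a crontab directly):

  # Hourly walk of the pool to pull directory metadata into ARC
  0 * * * * ls -lR /mnt/hdd-pool > /dev/null 2>&1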

SMB directory listings can be slow due to the chattiness of the SMB protocol for this - and a special metadata vDev won’t change this.

The concept of at least matching redundancy is a general idea that says if data is important enough to warrant double redundancy, then you should apply that everywhere. Personally I do subscribe to the idea of having a 3-way mirror special vDev for a RAIDZ2 data vDev. HOWEVER, I don't think it is quite as important, because one of the primary reasons for RAIDZ2 is to mitigate the risk that seek stress during resilvering of HDDs causes a second drive to fail - and SSD/NVMe special vDevs don't do seeks. So I would do a 3-way mirror if I had the slots and a 2-way mirror if I didn't.

RAIDZ2 is not slower than RAIDZ1, and it is only slower than mirrors for specific specialised workloads (random reads and writes of 4KB each - i.e. the profile for virtual disks/zvols/iSCSI or database files), where mirrors are preferred primarily to avoid read and write amplification, but also to give you more IOPS if you need them.

If you are doing sequential reads and writes to/from reasonably sized files (say 128KB or larger), and if you have multiple reads in parallel, then mirrors give you marginally greater throughput, because you can read from all drives simultaneously, whereas RAIDZ can only read from the data drives. But…

  1. You need to be reading a lot of data at once to max out all drives; and

  2. Usual home use is typically only one read process at a time, and

  3. Sequential pre-fetch will normally have the data ready for you when you ask for it; and

  4. You will be spending a lot more on disks for little noticeable performance benefit.

I have a very under-powered TrueNAS box and limited memory for ARC, yet I get a 99.8% ARC hit rate and brilliant performance. And I found that network speed was what improved performance for me - NOT mirrors or special vDevs.

:thinking: I don’t know that I would have suggested a 2-way mirror + hotspare for a special. If I did say this, let me know where and I’ll make sure I put a caption on there to clarify.

Proactive replacement of SSDs due to identical wear in mirrors, absolutely - but a single-vdev mirror + hot spare is worse than a 3-way mirror in pretty much every respect, especially in that, in case of a drive failure, you have zero redundancy during the resilver.

Reading the thread, I find the reports of slow access from the share a bit odd, unless you are pulling thousands of files across the network each time a directory is accessed instead of just reading the directory listing.

I find accessing directories, some containing a large number of photo files, really quick from the SMB share across the network - almost as quick as on the local machine. This is over a 1Gb network. One directory has 6,898 files under 20KB each, and many directories have thousands of 10-50MB files. The server uses spinning-rust drives and no additional cache drives.

I currently run one of these with 4 NVMe drives as my VM NFS share - knock on wood, no problems yet!


I find this highly unlikely myself. OK, how about this: in general, you can keep metadata in two places - on a special vdev on SSD, or in memory. If it's in memory, which do you think is faster, memory access or SSD access? So, memory > SSD. I would personally want to max out my memory before worrying about any SSD for a special vdev. And in my system, I can have vastly more memory. The arc_summary output is fairly readable for the stats I posted. For example, when metadata is needed, I have almost 100% hits, so there would never be a case to read it from SSD. But what about small files? Well, my ARC data hit rate is 99.1%, so again, few files would be read from SSD. And most of those 0.9% misses were from right after my machine was rebooted - which I never do except for updates, or in this case a power failure.

Protopia is right: for sequential reads, ZFS pre-fetches, so for things like Jellyfin you watch the content far slower than the server can read it, of course.

SATA SSDs are great for VMs - vastly better than spinning rust.


Sorry - but a card working in a specific set of hardware in normal operation is not the same as saying that the card has been architected in a way that will be reliable on all hardware under all circumstances, including abnormal conditions such as power cuts, brownouts, surges, OS crashes, or …


Which I never claimed it was, just that for me, in my system, I have not had any issues (and that does include one power outage so far in my house).


Thanks very much @Protopia, that helps a lot. Will factor your advice into my deliberations.

You’re quite right @HoneyBadger and apologies for “taking your name in vain”(!) You didn’t recommend a hot spare for a special. I picked up the idea for using the special metadata VDEV from this section of a recent T3 episode here (“Special VDEVs for small file sizes”): https://youtu.be/pr-u4fs_tXQ?si=NUE2aOCaDRPZ_KLL&t=1308 (link will jump to time-index).

You mentioned that if experimenting with a metadata VDEV, you should first make sure you have good backups, then you might test it with a mirrored metadata VDEV and, if making it permanent, add a further drive. So that was my faulty memory re: the hot spare.

The use case you described in that section, however, did sound applicable to my situation, where I have lots of large photos, videos, documents etc., plus hundreds of thousands of small docs and metadata files - e.g. old Lightroom library files, old archived website assets, .nfo files, old backups and the like - that (I think) might benefit from a metadata VDEV. Does that make sense / sound reasonable to you?

Based on the very helpful feedback from everyone on the forum here, if I were to do it, it sounds like a 3-way mirror with the 1TB SSDs would be the way to go? I understand from @Protopia that I could run a regular cron job (e.g. an ls -lR of the pool) to pre-read directory metadata, which would partially achieve my objective via a cron job and ARC, and from @sfatula that keeping the metadata in RAM would be faster (I think the T3 hosts said reading from RAM is nanoseconds vs. microseconds from an SSD).

Nevertheless, my concern is that my RAM is maxed out at 64GB with no further ability to expand due to motherboard limitations (until I upgrade :)). I think I should keep that RAM available for running apps, and I'm concerned that filling it with potentially substantial metadata may not be the best use of my fastest cache.

I guess where I'm ending up is that, now I have good backups, I could test a 3-way mirror metadata VDEV and see how things go. Whether it improves my performance concerns or I don't notice anything at all, the 3 x 1TB SSDs are in situ already and not doing anything, so unless others think there is a better way to utilise them (e.g. SLOG / L2ARC / dedup or something else), is there any reason folks would advise me against trying it? For the avoidance of doubt, the SATA SSDs have more than enough capacity for my fast data storage needs, and if I notice no improvement with the metadata VDEV, I can destroy the pool and recreate it without the metadata VDEV to reclaim those SSDs.

Any final/additional thoughts from any of you folks are much appreciated :slight_smile:

Thanks for helping me start to navigate the complexities (and power) of TrueNAS

Thanks @MBILC - I haven't had any issues yet, but I can only utilise 2 of the 4 M.2 slots due to lane limitations on my current rig.

On the other hand, this is just a passive adapter for 4 M.2 NVMe drives in a x16 slot with motherboard bifurcation… There’s not much to test.


No offense taken, it just genuinely had me trying to remember what I’d said. Some days I can’t remember what I had for breakfast. My memory is like a Dalmatian - a bit spotty. :slight_smile:

Your use case here sounds almost exactly like the example I laid out in the podcast there, where you have large files for bulk data and a number of small ones. A 3-way mirror with the 1TB SSDs would be the way I'd go here. You could then configure your dataset with recordsize=1M to allow the large files to be stored in 1M chunks, apply a special_small_blocks threshold of 128K (or perhaps slightly higher, to catch your Lightroom/etc. files), and those small blocks would then be stored on the 3 x 1TB NVMe SSDs (rough sketch below).
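
As a rough sketch of the dataset side (the dataset name below is just a placeholder, and both properties only apply to data written after they're set - existing files keep their current record size until they're rewritten):

  zfs set recordsize=1M tank/media
  zfs set special_small_blocks=128K tank/media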

This does concern me mildly, since I never saw the details of this controller specified. An ASMedia is better than a JMicron, but I'd be tempted to have all five of your HDDs on your motherboard ports and split the SATA SSDs between port 6 and one of the ports on the add-in card, since their contents might be a little more fungible (or at least easily backed up to your main pool with a scheduled replication job).

Since you’re willing and able to destroy the pool and start over, this is the time to test with the metadata vdev. Once you have a RAIDZ vdev (for your data) you’re unable to remove any vdevs (including special) without destroying the pool - but it seems like you’re aware of and prepared for this, so experiment away!
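
Once the special vdev is in place, you can also check how much data is actually landing on it (pool name is a placeholder):

  # Capacity and allocation broken out per vdev, including the special mirror
  zpool list -v tank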


Thanks very much @HoneyBadger - once I’ve verified my backups are fully functional, I’ll take the plunge with the metadata 3-way mirror VDEV and see how it performs. Is there a useful benchmark I could run prior to making the changes, that I could then re-run after the implementation?

I’ll take your advice once I’m home and reconnect the drives as described here. Working remotely at present.

Thanks again, and loving the podcasts.

Jonathan