We are a small film post-production house running two Supermicro servers (32-core Xeon, 512 GiB RAM, 10G Ethernet, 24 spinning HDDs), one on TrueNAS Core and one on TrueNAS Scale. The Scale server is our work server with active projects, and the Core server is for backup and archive. The work dataset on the Scale server is backed up to the Core server daily via replication.
While the Scale server delivers OK performance, mostly saturating the 10G network with single video files (e.g. ProRes 4444), it struggles with image sequences, which we have to deal with often. After much research and tuning, we believe this is inherent to our setup of 3 vdevs of 8 disks each in RAIDZ2, since that gives us roughly the IOPS of only 3 single disks, which is just not enough.
The plan is to reformat the pool as 12 mirrored vdevs (2 disks each) to increase performance.
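For a rough sense of what the layout change buys, here is a back-of-the-envelope sketch in Python. It relies on the common rule of thumb that a RAIDZ vdev delivers about the random IOPS of one member disk, while a mirror vdev delivers about one disk's worth for writes and up to one disk per side for reads. The 150 IOPS per disk figure is an assumed placeholder, not something measured on these servers, and the function names are just for illustration.

```python
ASSUMED_DISK_IOPS = 150  # assumption: ballpark for a 7200 rpm HDD, not measured


def raidz_estimate(vdevs, disks_per_vdev, parity, disk_iops=ASSUMED_DISK_IOPS):
    """Rule-of-thumb estimate: each RAIDZ vdev behaves like one disk for random I/O."""
    return {
        "data_disks": vdevs * (disks_per_vdev - parity),
        "write_iops": vdevs * disk_iops,
        "read_iops": vdevs * disk_iops,
    }


def mirror_estimate(vdevs, way=2, disk_iops=ASSUMED_DISK_IOPS):
    """Rule-of-thumb estimate: writes hit every side, reads can be spread across sides."""
    return {
        "data_disks": vdevs,  # one disk's worth of usable space per mirror vdev
        "write_iops": vdevs * disk_iops,
        "read_iops": vdevs * way * disk_iops,
    }


current = raidz_estimate(vdevs=3, disks_per_vdev=8, parity=2)
planned = mirror_estimate(vdevs=12)

print("current 3 x RAIDZ2 (8 disks):", current)
print("planned 12 x 2-way mirror:   ", planned)
```

Under these assumptions the trade is clear: usable space drops from 18 to 12 disks' worth, in exchange for four times as many vdevs to spread random I/O across. Real numbers will depend heavily on caching and the access pattern.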
My checklist:
Make sure the main work dataset has been successfully replicated.
Replicate the other datasets (IX-Applications, etc.) to the Core server.
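One way to double-check the first item before destroying the pool is to compare snapshot lists from both servers. The sketch below is only an illustration: it assumes you have captured the output of `zfs list -H -t snapshot -o name,guid -r <dataset>` on each side as text, and that a replicated snapshot keeps the same guid on the backup. The dataset names, snapshot names and guids in the sample are made up.

```python
def parse_snapshots(listing):
    """Map snapshot guid -> short snapshot name (the part after '@')."""
    snaps = {}
    for line in listing.strip().splitlines():
        full_name, guid = line.split("\t")
        snaps[guid] = full_name.split("@", 1)[1]
    return snaps


def missing_on_backup(source_listing, backup_listing):
    """Return short names of source snapshots whose guid is absent on the backup."""
    source = parse_snapshots(source_listing)
    backup = parse_snapshots(backup_listing)
    return sorted(name for guid, name in source.items() if guid not in backup)


# Hypothetical sample listings (tab-separated name and guid, one snapshot per line).
source_listing = "tank/work@auto-2024-05-01\t111\ntank/work@auto-2024-05-02\t222\n"
backup_listing = "archive/work@auto-2024-05-01\t111\n"

missing = missing_on_backup(source_listing, backup_listing)
print("snapshots not yet on the backup:", missing)
```

An empty result only says that every source snapshot has a counterpart on the backup. It does not replace opening a few files from the backup copy before wiping anything.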
Did I miss something? Am I overlooking some potential catastrophic mistakes? My main worry would be to somehow mess up the backup and lose all the data in the process.
My suggestion, for comment by others, if you have one or more spare drive slots:
Set up a new pool for any containers (you mention IX-Applications, so I assume you have containers running) and then migrate them to that pool. I believe (but have not tried) that this is possible.
Consider SSDs rather than HDDs: you would get far more IOPS (but probably less space).
Thank you. That’s actually a good suggestion. I don’t have any spare slots, but I have 2 TB of NVMe on a PCIe card. (We used it as L2ARC, but that turned out to be counterproductive.) I will try to put all the small datasets for apps and VMs on that and keep them running through the migration of the pool.
And yes, we are considering upgrading to an all-NVMe server but wanted to see whether we can get more performance from spinning disks first.
Do note that for maximum performance you want your working pool to stay below 50% of space utilization; if you work with large media files, be sure to set the dataset’s recordsize to at least 1M.
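To put rough numbers on both points, here is a small Python sketch. The 16 TB disk size and the 50 MiB frame size are assumptions picked for illustration (neither is stated in this thread), the 77 TB figure is the pool size reported in this thread, and the capacity estimate ignores ZFS overhead.

```python
import math

ASSUMED_DISK_TB = 16               # assumption: disk size is not given in the thread
MIRROR_VDEVS = 12                  # planned layout from the thread
USED_TB = 77                       # data size reported in the thread
ASSUMED_FRAME_BYTES = 50 * 1024 * 1024  # assumption: one 50 MiB image-sequence frame


def utilization(used_tb, usable_tb):
    """Fraction of usable pool space that is occupied."""
    return used_tb / usable_tb


def records_per_file(file_bytes, recordsize_bytes):
    """Number of ZFS records needed to hold one file of the given size."""
    return math.ceil(file_bytes / recordsize_bytes)


usable_tb = MIRROR_VDEVS * ASSUMED_DISK_TB  # one disk of usable space per mirror
util = utilization(USED_TB, usable_tb)
print(f"utilization: {util:.0%}, below 50%: {util < 0.5}")

for label, recordsize in (("128K (default)", 128 * 1024), ("1M", 1024 * 1024)):
    count = records_per_file(ASSUMED_FRAME_BYTES, recordsize)
    print(f"recordsize {label}: {count} records per frame")
```

With these assumed sizes the pool would sit at about 40% and each frame would need 50 records at 1M instead of 400 at the 128K default. Keep in mind that a changed recordsize only applies to data written after the change.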
Hi Everyone
Just a quick update: The reformat and restore of the main pool all went flawlessly and the few apps and VMs are also working again. The restore of the 77TB pool took around 3 days but went through without any hiccups.
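For reference, the average restore rate implied by those figures can be worked out in a few lines of Python. This is only a rough sketch: it takes "77TB" as decimal terabytes and "around 3 days" as exactly 72 hours, both of which are assumptions.

```python
def average_throughput_mb_s(size_tb, days):
    """Average rate in MB/s (decimal units) for moving size_tb in the given days."""
    seconds = days * 24 * 60 * 60
    return size_tb * 1e12 / seconds / 1e6


rate = average_throughput_mb_s(77, 3)
print(f"average: {rate:.0f} MB/s, about {rate * 8 / 1000:.1f} Gbit/s")
```

That comes out a little under 300 MB/s on average, so the restore ran well below what a 10G link can carry.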
But performance is not great. I have a weird issue where read speed seems to be capped at around 250MB/s. Writes are as expected above 1GB/s.
I will probably start another thread about that problem.
Thanks for everyone’s advice.