Build requirements for VM backing storage

I’m looking at migrating my XCP-ng cluster from local storage on M.2 drives to shared storage on U.2. I currently use TrueNAS primarily for bulk storage, and I know the constraints for VM backing storage are going to be different. I’m planning on using NFS, but I can go iSCSI or something else if the performance will be better.

I know I need to use mirrors for IOPS, but in terms of CPU, do I want more cores or faster cores?

Do I need a SLOG if I’m using U.2 mirrors? What about a metadata vdev?

How limited will I be with 10G? Will moving to 25G significantly increase my hardware requirements? I assume jumbo frames and a separate storage network are a given.

More RAM is always better, but should I be considering L2ARC?

Thanks.

iSCSI

Resource - The path to success for block storage | TrueNAS Community
Resource - Why iSCSI often requires more resources for the same result | TrueNAS Community

If you do iSCSI you likely want a SLOG: find something with good endurance and strong performance under mixed workloads… Optanes are usually the most common recommendation here.
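To make that concrete, here is a rough sketch of what it looks like at the command line (pool, zvol, and device names are just placeholders, and on TrueNAS you would normally add the log vdev through the UI instead):

```
# attach a fast, power-loss-protected device (e.g. an Optane) as the SLOG
zpool add tank log /dev/nvme0n1

# iSCSI writes are generally treated as async unless the initiator requests otherwise,
# so forcing sync on the zvol is what makes the SLOG actually get used
zfs set sync=always tank/xcp-iscsi
```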

SLOG

Some insights into SLOG/ZIL with ZFS on FreeNAS | TrueNAS Community
SLOG benchmarking and finding the best SLOG | TrueNAS Community

I do not suggest the use of metadata vdevs. Max out the RAM before adding an L2ARC; do not consider L2ARC at all until you have at least 64 GB of RAM.
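Before spending money on an L2ARC device it can also be worth checking whether the ARC is even missing. A quick look (assuming OpenZFS on Linux / TrueNAS SCALE, where the arc_summary tool ships with ZFS):

```
# inspect ARC statistics; a hit ratio already in the high 90s means
# an L2ARC device is unlikely to buy you much
arc_summary | less
```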

About the network, it really depends on how many drives you are going to use… basically it boils down to your use case: you will hardly be limited by a 10 Gbps network in a homelab, but you might be in an enterprise environment. Fiber rather than Base-T helps reduce latency, which in turn helps performance.

Networking

10 Gig Networking Primer | TrueNAS Community
Resource - High Speed Networking Tuning to maximize your 10G, 25G, 40G networks | TrueNAS Community
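On the jumbo frames point from the original question: once MTU 9000 is set end to end on the storage network, it is worth verifying that full-size frames actually pass. A quick check (interface name and address are only examples, assuming a Linux host such as TrueNAS SCALE):

```
# set the MTU on the storage interface (repeat on every hop of the storage network)
ip link set dev enp65s0f0 mtu 9000

# 8972 = 9000 - 20 (IPv4 header) - 8 (ICMP header); -M do forbids fragmentation
ping -M do -s 8972 192.168.10.2
```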

I’ve been running an E3-1271 v3 with 32 GB of RAM and a bunch of enterprise SATA SSD mirrors. I’m using TrueNAS SCALE to provide an NFS mount over 10G to the XCP-ng cluster. It works okay, but I have sync disabled in order to get decent performance.
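For reference, that is just the dataset-level sync property (dataset name is only an example):

```
# what I'm running today -- fast over NFS, but a crash can lose the last few seconds of writes
zfs set sync=disabled tank/vm-nfs

# what I'd like to go back to once the pool (and/or a SLOG) can keep up
zfs set sync=standard tank/vm-nfs
```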

I’m in the process of upgrading to 25G for the nodes and 100G for the server. I can pick up an AMD EPYC 7601 relatively cheaply, which will give me the PCIe lanes to support a bunch of NVMe drives. I’ll also be able to massively increase the amount of RAM available. The 7601 is only PCIe 3.0, but that will still be a huge increase over the existing SATA pool.

Will this be a worthwhile upgrade, or should I make the 7601 an XCP-ng node and look for something different for TN?

This is a tricky question that ultimately only you can answer, but I believe PCIe 3.0 will be a great performance increase provided you run appropriate drives: a PCIe 3.0 x4 NVMe drive can move roughly 3.5 GB/s, several times what any SATA SSD can manage.

I agree that PCIe 3 will be a big improvement over SATA. My concern is whether the 7601 is an appropriate CPU for serving NFS shares in this role. If it’s going to bottleneck me at 25/100G, then I’ll want to consider something else. That’s why I’m not putting NVMe in the existing E3 server.

I would say it depends, but giving you a genuinely useful answer goes beyond my knowledge.