TrueNAS SMB performance significantly worse than other platforms on the same hardware

TrueNAS CE Fangtooth 25.04.1, installed bare metal on server hardware: 256 GB of RAM, dual-socket Intel Xeon CPUs with 8 cores / 16 threads per socket.

Post edited to clarify that the issue appears only in specific use cases, not across all performance metrics.

We are attempting to transition to TrueNAS for our SMB shares, but the performance is significantly worse than that of other systems IN SPECIFIC USE CASES. I’m hoping there are some optimizations or tuning available that we have not discovered, because as is, TrueNAS is failing us.

Hardware has been thoroughly tested and ruled out as the cause. We have tested across multiple server builds and multiple configurations, confirming on all of them that the storage subsystem is NOT the issue. We are on 10 Gb networking, using 100% flash storage, and iostat confirms the storage is mostly sitting idle. CPU differences do not seem to matter either. We are using all server-grade hardware, but we have also conducted some testing on workstation-class systems.

ZFS pool configurations do not affect our tests, as multiple configurations yielded the same results. Storage is not the bottleneck in a 12x SAS SSD system.

What we see is that not only is TrueNAS SMB performance much worse than Windows Server, but TrueNAS also falls off a cliff when doing concurrent directory enumerations of folders containing large numbers of files. The CPU activity suggests that TrueNAS SMB does not spread the workload across multiple cores: we only see one CPU core spike at a time, though which core spikes shifts over time. (A sketch of how to reproduce this kind of test follows the results below.)

Faster base-clock CPUs provide a small performance boost, but the number of cores provides no benefit. When moving the test to higher-clock-rate CPUs, the times decrease slightly, but the percentage gap between Windows Server and TrueNAS holds steady.

I welcome any and all advice from the expert community. We are new to TrueNAS and struggling with this one.

TrueNAS SMB Server Test Results

Single-threaded enumeration

Time : 1,369 ms
Directories: 3
Files : 9,729

Concurrent enumeration: 8 threads

Thread   Time (ms)   Directories   Files
1        3,178       3             9,729
2        4,686       3             9,729
3        5,453       3             9,729
4        5,849       3             9,729
5        6,590       3             9,729
6        6,869       3             9,729
7        6,898       3             9,729
8        6,832       3             9,729

Average per-thread time : 5,794.38 ms
Total concurrent time : 10,931 ms
Directories per run : 3
Files per run : 9,729

Windows Server Shares Test Results

Single-threaded enumeration

Time : 332 ms
Directories: 3
Files : 9,729

Concurrent enumeration: 8 threads

Thread   Time (ms)   Directories   Files
1        270         3             9,729
2        244         3             9,729
3        259         3             9,729
4        280         3             9,729
5        307         3             9,729
6        241         3             9,729
7        231         3             9,729
8        230         3             9,729

Average per-thread time : 257.75 ms
Total concurrent time : 1,955 ms
Directories per run : 3
Files per run : 9,729
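
For anyone who wants to reproduce the general shape of this test, here is a minimal Python sketch (illustrative only, not our exact test harness; the UNC path and thread count below are placeholders):

```python
# Enumerate the same share tree single-threaded, then with 8 concurrent
# workers, reporting per-thread and total wall-clock times.
import os
import time
from concurrent.futures import ThreadPoolExecutor

SHARE = r"\\server\share\testdir"   # placeholder UNC path
THREADS = 8

def enumerate_tree(root):
    """Walk the tree once and return (elapsed_ms, dir_count, file_count)."""
    start = time.perf_counter()
    dirs = files = 0
    for _, dirnames, filenames in os.walk(root):
        dirs += len(dirnames)
        files += len(filenames)
    return (time.perf_counter() - start) * 1000, dirs, files

# Single-threaded baseline
ms, d, f = enumerate_tree(SHARE)
print(f"single-threaded: {ms:.0f} ms, {d} dirs, {f} files")

# Concurrent run: all workers enumerate the same tree at once
wall_start = time.perf_counter()
with ThreadPoolExecutor(max_workers=THREADS) as pool:
    results = list(pool.map(enumerate_tree, [SHARE] * THREADS))
wall_ms = (time.perf_counter() - wall_start) * 1000

for i, (ms, d, f) in enumerate(results, 1):
    print(f"thread {i}: {ms:.0f} ms, {d} dirs, {f} files")
print(f"average per-thread: {sum(r[0] for r in results) / THREADS:.0f} ms")
print(f"total concurrent  : {wall_ms:.0f} ms")
```

The pattern in the tables above is that per-thread times on TrueNAS grow as workers are added, while on Windows Server they stay essentially flat.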

SMB is single threaded, and what you see is probably a shift from core to core because of heat.

Although it is hard to tell with so little information provided, it seems like the CPU really is your bottleneck here. The next questions would be what the workload is, whether the testing scenario is realistic, and why you did not use a cheaper HDD RAIDZ2 with a 3-way mirror as a special vdev.


Our test is an accurate simulation of real-world use of the SMB share in our environment.

Additional research has suggested that ZFS is NUMA-unaware. This is likely contributing to the performance issues, but it is not the only cause. The numbers I posted originally are for reference and are from a single-socket system; our actual numbers for the dual-socket system are worse. We have tried HDD setups as well. We saw no performance difference with special vdevs on a 100% SSD pool, though they do help on the HDD pools. This is all about directory enumeration.

Did you set any compression on your dataset?

I typed up this big thing but realized this is out of my league. The enterprise guys are in here occasionally, and with such a business focus, they should know. Maybe asleep right now. They sell some boxes of their own, I’d expect them to be good at things like this.

It is using the default LZ4; we didn't change it.

While that isn’t an answer to the question at hand, I highly doubt that.

Is one user really opening 8 folders at the same time? Or are 8 users opening one folder each?

Of course not! You are misunderstanding me. I am not saying you should use special vdevs because they perform better; I suggest you use them because they perform the same (for that use case) but are cheaper and bigger.

That is to be expected for a single core application.


While I appreciate the input, I should not need to fully document our workloads and systems to demonstrate I know what I am talking about. Our SMB shares are not accessed exclusively by users. We have legacy systems and websites pulling the data as well. We have a very large infrastructure with 100+ users.

We have fully documented a real-world negative impact on our systems from switching to TrueNAS SMB while attempting to move away from Nutanix Files on their hyper-converged platform. We tested on Windows Server just to see how it performs vs TrueNAS. We have conducted numerous tests, in various ways, using multiple methods. I only shared one set of numbers for reference.

Sure you don't! You can instead ask a question, like you did. But sometimes it is hard to find the right question. You asked whether it is expected for SMB to achieve these results. I can only answer that with:
Yes, for an SMB task bound by poor single-core performance, one that probably does not represent your use case at all (since, as you said, there are 100+ users), on a slow and old (just assuming, you did not specify) Intel Xeon, this performance is to be expected.

Now we could get into the weeds of how this test is not representative of your real-world usage, how a sub-$1k ghetto build outperforms that, what SMB settings you use, and so on. Of course, only if you are happy to fully document your workload :wink:

And please remember that this is a community forum. The better place to ask this kind of question would be iXsystems.


Sara, your assumption that this depends on hardware is flawed. I clearly stated we tested multiple servers, and the results as percentages are consistent across builds even when the absolute numbers change. I am not asking for an opinion on our workloads. We know how the server is accessed, and the numbers presented show the real-world impact switching to TrueNAS had on our systems. The directory enumeration scenario you question is one we encounter multiple times per hour.

Since we are testing Community Edition to see if TrueNAS can work for us, these community forums are where we must turn for input. I would gladly pay iXsystems if they would provide support outside of Enterprise Edition, which they do not (or we couldn't find how they do).

This is a documented problem of TrueNAS running much worse than Windows Server on the same hardware, for the same workload. It is purely a TrueNAS SMB problem which I am looking to solve. The hardware does not matter, and our workload does not matter. The “problem” is confirmed and documented.

I am looking for insight as to whether it is known that TrueNAS SMB performs worse than Windows Server, and whether this can be solved via configuration. If this is a problem that no one else encounters due to different workloads, I can understand that. However, trying to pass this off as a problem due to my hardware or workload is wasting both our time. Neither is going to change.

If you do not believe my numbers represent a real world usage scenario, you do not need to comment on my request.

No, not really. My assumption is that you measure “Mist”.

We have a saying in German:
Wer misst, misst Mist.

Who measures, measures crap.
The words misst (measures) and Mist (crap) are pronounced the same; it is a wordplay.

So one user does 8 directory (each with 10k files) enumerations at the same time?

Then why are you not sharing links to these problems you are referring to, and why are you opening a new thread instead of discussing it in those threads?

Depends. It can be worse or better. One is Samba, the other is Microsoft's SMB implementation.


I agree. Request some paid consultancy from iXsystems to diagnose the issue.

Alternatively, as I have already suggested on Reddit, you should research and ask questions on expert forums relating to Samba tuning rather than ask such questions in a TrueNAS-specific forum.

However, if you do find some tuning advice, the other problem is that up to 24.04, the SMB service had an Auxiliary Parameters configuration field which gave you a little ability to tune Samba. As of 24.10, this field was removed.


Is it an enumeration of the same directories (during the hour)? And how did you exactly measure enumeration times?

Are you using the default smb.conf from TrueNAS? If so, you obviously need to tune it.


I do not have comparisons between TrueNAS and Windows Server but we do have comparisons between SMB clients for directory listings. See SMB Directory List Times | TrueNAS Documentation Hub

I am not aware of any other performance data on directory listings using SMB.


https://wiki.samba.org/index.php/Performance_Tuning

See “Directories with a Large Number of Files” (of course, you are forcing everything to lower case, and you lose the usual case-insensitive behavior).
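
For anyone who wants to experiment, the share-level settings that wiki section describes look roughly like this (a sketch of the wiki's suggestion, not a TrueNAS-specific recommendation; the share name and path are placeholders, and since the Auxiliary Parameters field is gone in recent releases, actually applying it may not be straightforward):

```
[largedir]
    # placeholder dataset path
    path = /mnt/tank/largedir
    # server-side case-sensitive lookups avoid scanning the whole directory per name
    case sensitive = yes
    default case = lower
    preserve case = no
    short preserve case = no
```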

@Duratori It’s probably implied, but were the tests done with the same client? Which client?

Thank you for this. I had not seen it, and it is good info to know. Our clients are mostly Windows, and we are focused on solving the problem for our Windows Servers running legacy code that polls the SMB shares in this way.

A mix of Windows systems was used: Windows Server, Windows 10, and Windows 11. Mostly Windows Server, as this issue impacts some legacy code running on Windows Server 2022.

What do you get if you just time the directory listing locally from the TrueNAS system itself, both with and without SMB in the way?

E.g.: time ls -la and time smbclient //server/share -U username -c "ls" ?
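
If you want to keep the timing method consistent with the client-side runs, a rough Python equivalent of that comparison, run on the TrueNAS host itself (assuming the dataset is mounted at a path like /mnt/tank/share and smbclient is installed), might look like:

```python
# Time the same directory listing locally and then through the SMB stack
# on the TrueNAS host; smbclient will prompt for the account password.
import os
import subprocess
import time

LOCAL_PATH = "/mnt/tank/share"      # placeholder dataset path
SMB_TARGET = "//localhost/share"    # placeholder share

t0 = time.perf_counter()
with os.scandir(LOCAL_PATH) as it:
    local_count = sum(1 for _ in it)
t1 = time.perf_counter()
print(f"local scandir : {local_count} entries in {(t1 - t0) * 1000:.0f} ms")

t0 = time.perf_counter()
subprocess.run(["smbclient", SMB_TARGET, "-U", "username", "-c", "ls"],
               stdout=subprocess.DEVNULL, check=True)
t1 = time.perf_counter()
print(f"smbclient ls  : {(t1 - t0) * 1000:.0f} ms")
```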