I have googled and found the odd highly specialised post from some of the long-term members of the old forum; I’m hoping some of them exist on here.
I’ll try to summarise in short lines as I tend to go on and on otherwise.
I have one large pool consisting of 6x16TB CMR disks, Z2
The NAS has no faults or errors that I’m aware of and has worked for years; this is my 10th year on TrueNAS.
This particular pool is only 6 years old; it was re-created when I got this board in 2018.
One particular filesystem / dataset is frequently really slow to work with, PARTICULARLY when it comes to getting information about the contents of a folder (a directory listing).
Compression is off, dedup is off for this dataset
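For reference, those properties (plus a couple of others that can affect directory listings) can be confirmed over SSH with something like the following; tank/slow is just a stand-in for the real pool/dataset name:

```
# Check the dataset properties (tank/slow is a placeholder for the actual pool/dataset)
zfs get compression,dedup,atime,recordsize,primarycache tank/slow
```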
Clicking on my “P:” drive after I haven’t used it in a while regularly takes 2 to 8 seconds to produce a file listing.
So I just found an old (12-hour-old) Windows Explorer window open inside a folder on a mapped drive.
I clicked one level deeper in the folder structure, into a folder with NO special characters in its name and a single 300MB video file in it.
Simply opening this folder took approximately 10 seconds, for nothing more than a directory listing over SMB in Windows Explorer.
Once the ‘pool has woken up’ I can click around the filesystem quite quickly.
It is genuinely as if the drives ‘spun up’ after sleeping. (I do not believe this to be possible?!)
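If spin-down were somehow involved, I understand it can be checked directly: the HDD Standby setting for each disk in the UI shows whether it is even enabled, and something like the following should report a drive’s current power state, assuming the camcontrol on CORE supports these subcommands (ada2 is just an example device name):

```
# Ask the drive for its current ATA power state (example device)
camcontrol powermode ada2
# Alternative on drives that support EPC: report only the power state
camcontrol epc ada2 -c status -P
```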
Can someone recommend how to really benchmark this baby properly so I can attempt to cleverly isolate the issue? I know how to perform a variety of benchmarks from years of messing with PCs, but I feel like some kind of specialist test is in order here, one that targets something more specific.
Specs: Denverton C3758 (the 8-core one?), 64GB RAM (up from 32GB! I thought this might help)
Seagate 16TB disks which passed a plethora of extensive tests upon purchase, no SMART / ZFS / scrub faults.
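For completeness, a quick re-check of drive and pool health from the shell looks something like this (device and pool names are examples):

```
# SMART health verdict and the usual warning attributes for one disk (repeat per disk)
smartctl -a /dev/ada2 | grep -Ei 'overall-health|reallocated|pending|uncorrect'
# Pool-level error counters and the result of the last scrub
zpool status -v tank
```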
I will note the only interesting thing, though I can’t be sure whether it MEANS anything: I’m running a copy of a 120GB file from DATASET #1 to DATASET #2 (the bad one) on the same pool, which obviously causes the disks to thrash.
gstat -dp is showing me that ada2 and ada5 are generally slightly busier than the others; it might be nothing as it fluctuates around, but they do seem to be consistently above 60% busy during this giant copy.
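If it helps, per-disk latency (rather than just %busy) can be pulled from zpool iostat on reasonably recent OpenZFS; if ada2 and ada5 consistently show higher wait times than their peers during the copy, that would point at those two drives rather than at the dataset. The pool name is a placeholder:

```
# Per-device latency breakdown, refreshed every 5 seconds ('tank' is a placeholder)
zpool iostat -v -l tank 5
```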
Bear in mind, I don’t fully know what I’m doing (I’m mid-tier at best), but I want to identify this issue clearly without spending hours collecting and collating a variety of data when only 1 or 2 specialised tests might reveal it.
(It took 9.5 minutes to copy 120GB from M: to P: on the same pool.)
(It took 10.1 minutes to copy 120GB from P: back to M: on the same pool.) NOTE: this was via an SSH command at the filesystem level, not Explorer, not SMB.
I am assuming this is a Windows 11 PC that is slow?
I think this is a Windows (Microsoft) deficiency.
A lot of it has to do with Windows collecting details about user files that are mostly/only pertinent to Microsoft.
See if any of the caching… is enabled on your Windows PC.
For instance, if I have a mounted SMB share that has gone offline, Windows slows to a staggering crawl when using Explorer.
This occurs on both Windows 10 and Windows 11 and has been occurring for over a year.
Furthermore, the issue only applies to one particular dataset, whereas the others, even at their slowest, are at least 2 or 3 times more responsive.
It just took 7 full seconds to open a folder on that dataset with 434 items in it.
I’d start by looking through some of the browsing suggestions compiled in this thread started by Cyberjock in 2015. Some of the suggestions may be out of date by now or mitigated by features that were added to TrueNAS since then like sVDEVs. That said, they’re still worth looking through.
I would start by seeing how responsive a directory listing is from the TrueNAS console, i.e. is it a disk performance issue or an SMB network performance issue?
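Something as simple as timing a listing of the same folder on the NAS itself, and then again over SMB, would show where the delay lives; the path below is a placeholder:

```
# On the TrueNAS shell: time a cold directory listing of the suspect folder
time ls -la /mnt/tank/slow/somefolder > /dev/null
# Run it a second time to compare against a warm (ARC-cached) listing
time ls -la /mnt/tank/slow/somefolder > /dev/null
```

If the first run is slow on the console too, the problem is on the pool side; if it is only slow through Explorer, look at SMB and the Windows client.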
Also, I would check Reports to see what your cache hit ratio is.
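On CORE the same figures are also available from the shell; depending on the version, one or the other of these should print ARC size and hit/miss statistics:

```
# ARC size and hit/miss statistics (whichever tool your version ships with)
arc_summary
# or, if the zfs-stats package is present
zfs-stats -E
```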
10 minutes for 120GB = 12GB/minute = 200MB/s. That’s 50MB/s per data (HDD) drive (4 data disks in a 6-wide Z2)… that’s very good, considering the read and the write are hitting the same vdev at the same time.
ada1 in the third table is at 77ms per read, but only 4kB/s? I guess it’s only reading metadata and nothing is cached. Was the file written with a small or a large record size?
If I recall correctly, I was copying a 110GB file to produce these results. I always assumed the data was (kind of) evenly distributed across all 6 disks in a Z2 config, but it seems that may not be the case.