Thanks @Constantin, the thought had crossed my mind. I'll need to do some testing on some existing large datasets and see how it is in the real world. ARC may well catch a lot of this.
It does feel like a regression to me, however. IT in general seems to have been regressing in functionality these last few years. Anyway, moan over.
If it helps, have a look at my sVDEV resource page. I have a number of planning steps, results, and so on there. For my use case it has been an incredible improvement: write speeds nearly doubled once I optimized the record sizes for larger files, turned on zstd compression, and used a sVDEV to put small files and metadata on SSDs. (Thanks @winnielinnie!)
Another hidden gem WRT sVDEV is being able to designate specific datasets to reside solely on the sVDEV by setting the dataset recordsize to be equal to or smaller than the sVDEV small-block cutoff. So you can have your cake (mirrored SSDs for databases, VMs, etc.) and eat it too, by also using them for metadata and small files.
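To illustrate, here is a minimal sketch of what that can look like from the shell; the pool/dataset names and the 64K cutoff are just examples, so adjust for your setup:

```
# Route blocks up to 64K on this dataset to the special vdev
zfs set special_small_blocks=64K tank/vmstore

# With recordsize no larger than the cutoff, every block of the
# dataset counts as "small" and therefore lands on the sVDEV SSDs
zfs set recordsize=64K tank/vmstore
```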
You should probably link to it for convenience; it's a good read.
Thank you, should have done that! Hope you are well!
There is an argument to say it's a security issue. Files and their attributes (including their names) may contain sensitive information, both about the file itself and about the existence and intentions of those who are allowed to access it.
Anyone know the incantation to edit/add auxiliary parameters in SMB Global now it's been removed from the UI?
Ok, so it looks like you can drop this into the SMB Global config to cover all shares, if anyone wants to do this:
- From the shell, run `cli`
- `service smb update smb_options="hide unreadable = yes"`
That's about it.
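If you want to sanity-check that the option actually landed in the generated Samba config, something like this from the shell should do it (testparm ships with Samba):

```
# Dump the effective Samba config and look for the auxiliary parameter
testparm -s 2>/dev/null | grep -i "hide unreadable"
```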
I wonder if we should collectively research what SMB configs are and are not legal at this point and consolidate them in a resource page.
In years past, cyberjock and others made many seemingly-good suggestions re: disabling MS-DOS compatibility in return for faster Samba performance.
Said suggestions were discounted as obsolete / bad / etc. by @awalkerix for SCALE and CORE, with the difference being that past CORE Samba servers ignored bad SMB attributes (and per the current Jira notes, so will SCALE eventually).
As Crowdstrike has amply shown, verifying input files is always a good idea.
It's a good idea. Personally I'd rather not play around with them and just work with the defaults, but when there has been a change in functionality from CORE to SCALE, it's a bit of a deal breaker for me and, I'd imagine, other users.
It would be interesting to understand the real performance impact given that there is a setting that behaves well.
In general, SCALE directory listing speed has been improved a lot; with this setting change, it will slow down.
What's the actual impact in a real system vs CORE?
Does a special VDEV solve the problem?
What is the incremental cost in % terms?
Indeed, all good questions, and I don't know the answers to them yet. My first point was to try and get people to recognise that this default behaviour is different from that of CORE and, I personally think, needs addressing. Be it a default fix or at least a UI switch that users can apply. Clearly this can be fixed, but as you say, the concern is the potential performance impact, especially at scale (forgive the pun). I would be inclined to warn any existing CORE user, especially in an Enterprise environment, before upgrading to SCALE, as this alone could spoil their day.
Thanks for pointing out the functional fix.
I’d like your help working out the real performance impact in terms of directory listing speed.
We measured the speed of an all-flash system at 4 million files per minute… if the metadata were in DRAM, it would be fast on a disk system too.
However, a macOS client managed only 6 thousand files per minute…
What file rate do you think you need?
Wow, you've geeked out there, Capt. In my world we don't normally measure like that; it's more practical testing and reacting if/when users complain, which generally they don't (regarding this).
I'll try and find some time to grab a few sample datasets from a CORE system and copy them over to a SCALE system. I'll then apply the 'fix', take a look, and try to identify any obvious difference. The bit I can't easily do is have hundreds of datasets with hundreds of users accessing them, which is what makes me nervous about simply applying this 'fix' and upgrading my existing systems.
You have lots of systems… perhaps test on just one??
But I'd suggest starting with that listing speed quantification. We can't change Linux easily, but if we know the speeds required, we can see what might get to that metric… or whether it's already there?
I also wanted to emphasize this. Especially in the context of HR this is a serious concern.
In addition there are many industries where such a change will trigger the need for a re-certification. I cannot imagine any IT department would be willing to create that level of top-management attention.
Ok, you inspired me, @Captain_Morgan, and after watching your kids play football, what else is there really left to do on a Saturday (other than watch TrueNAS Tech Talk)?
I set up a CORE system with a fresh dataset offered out via SMB, and the same with SCALE (both running the latest versions), with datasets configured the same way. I created two folders on each share, one with 10K empty .txt files and the other with 50K. Edit: added 100K also.
I used a Windows client and ran the following command in PowerShell: `Measure-Command { Get-ChildItem "\\server\share\10K" }`
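For anyone who wants to reproduce the test data, something along these lines would create the empty files server-side from the shell; the dataset path is just an example:

```
# Create 10,000 empty .txt files in the 10K test folder (example path)
cd /mnt/tank/testshare/10K
for i in $(seq 1 10000); do touch "file${i}.txt"; done
```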
The results surprised me.
CORE: 10K = 6 sec
CORE: 50K = 30 sec
Edit: CORE: 100K = 58 sec
SCALE: 10K = 1 sec
SCALE: 50K = 6 sec
I then applied the 'fix', i.e. `hide unreadable = yes`:
SCALE: 10K = 2 sec
SCALE: 50K = 10 sec
Edit: SCALE: 100K = 19 sec
I ran each test about 10 times, one after the other, and used the rough average; I didn't notice any wild differences between runs.
So this suggests that SCALE off the bat is much faster than CORE, and that even when you apply the 'fix' it is still much faster.
Both systems I tested on are recently retired Supermicro servers (same spec), more than capable, and the pools were constructed in exactly the same fashion using the same type of HDDs to keep things fair.
I now feel more confident than I did about applying this fix to SCALE, but I do still think some consideration should be given to making this the default behaviour, or at least providing an easy switch for other users.
Stop it. What are you doing?
See? Core is that much faster than SCALE, where it can list the contents of a directory in negative 30 seconds. You can’t get any faster than “the future is now”.
Sorry, I was very surprised, and you should know me by now: I'm a HUGE CORE FanBoy.
Oops fixed