Very slow WebUI - Login, Apps, etc. SCALE Dragonfish RC

But swap does serve a very important purpose to prevent OOM during extreme situations.

Why would it be okay to just let the system hit OOM without even the possibility of a little swap being used to prevent it, so that the system can continue running normally and eventually hold everything in RAM again once the load is reduced?

I’ll try not to use the “F” word, but it seems this is an issue of the OS and how it handles swapping, rather than the presence of available swap in the first place.

This is why I disagree with outright disabling swap. It could cause issues for users with VMs, Apps, and large data reads, who might run into situations where RAM alone cannot hold everything at certain peak usages.

At least even a tiny bit of swap (such as 1 GiB) can provide a temporary safety cushion to avoid OOM.

EDIT: And are we sure that setting the swappiness to “1” doesn’t provide the same benefits as disabling swap completely? (If swap is still aggressively used even at a value of “1”, then goodness, the Linux kernel developers need to fix that.)

1 Like

We may make the presence of swap configurable; again, that is the debate.

Personally I don’t want that cushion. I’d rather fail early and fail hard, instead of entering a state where the system just starts performing poorly and exhibiting odd behaviors on top of it. Much easier to recognize an OOM and figure out why that is, vs. the alternative, which could be going catatonic in random places :slight_smile:

The legacy reason we had swap in FreeNAS/TrueNAS before was for kernel crash dumps, and less about OOM protection. But we’ve not used that in a LONG time, in case anybody was curious. I’ve been personally running systems here for a long time (BSD/Linux + ZFS both) without swap. ARC tends to just do the right thing and shrink when needed. If it can’t shrink further, then you’ve got major issues elsewhere, i.e. too many apps, or a memory leak in a userspace service that is going to take the system down rudely either way.

1 Like

Does swap actually prevent OOM or just add an additional cushion? If a process manages to leak until it exhausts all RAM and moves to swap, it’ll eventually consume that as well and still OOM, won’t it?

I don’t think disabling swap should be the permanent solution even if I have no desire to use it. I’m curious if something with the new ARC size handling is the cause of the problem, or if the root problem has been around for a long time but never reared its head due to the smaller ARC limit? I’m leaning towards the former – I’d expect a number of people increased ARC prior to the auto-sizing and don’t recall seeing this problem discussed.

[image: kris-vs-ram]

2 Likes

Agreed. But I was referring to “peak” situations for a user with VMs, Apps, high loads, etc., where RAM is temporarily stretched to the point that it might need to swap out rather than OOM.

(Yes, a memory leak would just delay the inevitable, so that requires fixing the culprit rather than relying on swap to save you.)

Hey, I didn’t know they were taking pictures at work that day!

3 Likes

Not to stray too much off topic, but is any of this being communicated to the Linux kernel devs?

Even if we get the best of all worlds for SCALE[1] (i.e., FreeBSD-like ARC behavior + no more issues with swap or crippling slowdowns), I think Linux needs to do some serious rework of their memory management if simply using ZFS + a very high zfs_arc_max causes the system to needlessly swap to disk.

EDIT: To be more clear on this point, it’s silly that removing the very presence of swap allows the kernel to behave more sanely.

It’s like having a dog that tears up all your furniture and attacks your guests because you have a box of treats visibly set on the table. But then if you remove the box of treats from the house, your dog starts behaving properly. (Even though it always could have this entire time!)


  1. It’s looking to be the case that we might have a longterm solution for SCALE soon, without sacrificing the benefits of the ARC. :sunglasses: ↩︎

1 Like

Once we get a bit further and understand the deeper “why”, then perhaps. What’s odd is that swap is still being used even when plenty of free memory is available, ARC or no ARC. That’s the behavior I’d like to understand fully. There’s zero reason to swap at all if you still have plenty of RAM to spare.
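The “swapping while plenty of RAM is free” behavior can be observed directly from the kernel’s /proc interfaces (nothing TrueNAS-specific here, just plain Linux; read-only, safe to run anywhere):

```shell
# Current swap totals and available memory, straight from the kernel:
grep -E '^(SwapTotal|SwapFree|MemAvailable)' /proc/meminfo

# Cumulative pages swapped in/out since boot; sample this twice a minute
# apart and compare. Counters climbing while MemAvailable is still large
# is exactly the odd behavior described above.
grep -E '^(pswpin|pswpout)' /proc/vmstat
```
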



1 Like

See my “misbehaving dog” analogy in my previous post, which I edited. :wink:

I just set my swappiness to 1 from the default of 60 and re-enabled swap. I’ll monitor over the next few days and let y’all know what happens.

Yesterday, before disabling swap, I had also limited my ARC cache to 175 GiB out of my 228 GiB available. I wanted to make sure I didn’t have any OOM issues, so I trimmed it way down. Unlike @kris, I didn’t want to fail early or hard :stuck_out_tongue: Perhaps I can bump up my ARC a bit more if this proves stable.

root@storage01[/sys/module/zfs/parameters]# cat /proc/sys/vm/swappiness
60
root@storage01[/sys/module/zfs/parameters]# sysctl vm.swappiness=1
vm.swappiness = 1
root@storage01[/sys/module/zfs/parameters]# nano /etc/sysctl.conf
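For reference, the line that edit would add looks like this (a plain-Linux sketch; on TrueNAS the supported route is the web UI’s sysctl settings, which persist across upgrades, so treat editing /etc/sysctl.conf by hand as illustrative only):

```shell
# Apply immediately to the running kernel:
sysctl vm.swappiness=1

# Persist across reboots on a stock Debian-style system (assumption: you
# manage sysctl.conf yourself; on TrueNAS prefer the UI's sysctl settings):
echo 'vm.swappiness=1' >> /etc/sysctl.conf
sysctl -p    # reload /etc/sysctl.conf and print the applied values
```
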

No VMs on my TrueNAS scale box, just some NFS and SMB shares, and a few k3s apps like MinIO.

This would be with ARC high-water (zfs_arc_max) reset to the Dragonfish default (that is: RAM minus 1 GiB)?

If so, would be very interesting to see how it behaves.

I can do that. Should I hard-code zfs_arc_max to RAM minus 1 GiB, or should I set it back to 0 like it was before I started fiddling?

Set it back to 0 please; let’s see how it runs with defaults in place.

2 Likes

Technically, I don’t think you need to “set” it to anything. Just remove the custom init command, so that the next reboot will use the default value.

Ok, from cli I did this:

[storage01]> system advanced update kernel_extra_options="zfs_arc_max=0"

I also set it via /sys/module too:

root@storage01[/sys/module/zfs/parameters]# cat /sys/module/zfs/parameters/zfs_arc_max
187904819200
root@storage01[/sys/module/zfs/parameters]# echo 0 >> /sys/module/zfs/parameters/zfs_arc_max
root@storage01[/sys/module/zfs/parameters]# cat /sys/module/zfs/parameters/zfs_arc_max
0

However, arc_summary still has high water at 175:

root@storage01[/sys/module/zfs/parameters]# arc_summary

------------------------------------------------------------------------
ZFS Subsystem Report                            Wed May 08 08:18:24 2024
Linux 6.6.20-production+truenas                                  2.2.3-1
Machine: storage01 (x86_64)                                      2.2.3-1

ARC status:                                                      HEALTHY
        Memory throttle count:                                         0

ARC size (current):                                    50.6 %   88.6 GiB
        Target size (adaptive):                        50.9 %   89.1 GiB
        Min size (hard limit):                          4.1 %    7.1 GiB
        Max size (high water):                           24:1  175.0 GiB

Anything else I should do?

This is a production box for my small business, so I won’t be able to reboot it for a while.

Because you didn’t reboot.

Resetting it to “0” does not immediately apply.

An alternative would be to explicitly set it to the exact amount that is 1 GiB less than total RAM, which should theoretically apply immediately.

But the “cleanest” way to test this is to remove anything “custom” and reboot, which will use the Dragonfish defaults.

Gotcha.

Yeah, unfortunately not in a position right now where I want to reboot this box. It’s not a “mission critical” box for me, but it would cause some headache if I were to reboot right now :stuck_out_tongue:

root@storage01[/sys/module/zfs/parameters]# cat /proc/meminfo | grep MemTotal
MemTotal:       239258352 kB
( 239258352 - ( 1024 * 1024 ) ) * 1024 = 243926810624

Should I set my zfs_arc_max to 243926810624?

Edit- I also removed my cron job and post init script so it would go back to default whenever I can reboot next.
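The arithmetic above can be sanity-checked with shell arithmetic (the MemTotal figure is the one pasted from /proc/meminfo; nothing is written to the system):

```shell
# Recompute the proposed zfs_arc_max = (MemTotal - 1 GiB), converted to bytes.
mem_kb=239258352                   # MemTotal from /proc/meminfo, in kB
gib_kb=$((1024 * 1024))            # 1 GiB expressed in kB
arc_max_bytes=$(( (mem_kb - gib_kb) * 1024 ))
echo "$arc_max_bytes"              # prints 243926810624
```
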

Math + binary vs decimal + large numbers = scared Winnie :fearful:

Yeah? I think that’s correct? :grimacing:

1 Like

Don’t worry, the iXsystems SVP of Engineering has some words of comfort:

5 Likes

Hahahahaha :stuck_out_tongue: Thanks!

I’ll stare at it a bit more before pushing go.

Question: what would theoretically happen if I set zfs_arc_max to more than the available RAM? Would I fail hard, or is ZFS smart enough not to exceed the available RAM?

1 Like