Dragonfish swap usage/high memory/memory leak issues

Have been testing this with zfs_arc_max=0 & zfs_arc_sys_free=0, which I believe are the defaults. ARC does seem to shrink as needed (when testing with less than 10GiB free and running head -c 10G /dev/zero | tail). I’m running under an intensive workload so will keep ARC as is and update if anything goes kaboom :boom:
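
For reference, here’s a minimal way to confirm those tunables and reproduce the same memory-pressure test (a sketch assuming a stock OpenZFS-on-Linux install, where the module parameters are exposed under /sys/module/zfs):

# Current ARC tunables (0 means "use the built-in default"):
cat /sys/module/zfs/parameters/zfs_arc_max
cat /sys/module/zfs/parameters/zfs_arc_sys_free

# Rough memory-pressure test from above: hold ~10 GiB in RAM
# and watch whether ARC shrinks (e.g. in htop or arc_summary):
head -c 10G /dev/zero | tail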

2 Likes

I’ve been watching this thread with interest. I too have had similar issues with slowness of the UI, in SSH sessions, and elsewhere. I installed the nightly build Dragonfish-24.04.0-MASTER-20240507-013907 hoping it would fix the serious lag I had been experiencing in my workloads. For the last 24 hours or so the issue has not returned, and performance is the same as on Cobia 23.10.2. I will say that for both the 24.04.0 release and this nightly I DO have the ‘install-dev-tools’ developer mode enabled, which disqualifies me from formal troubleshooting here. In short, for me the slowness issues seem to have been fixed in the latest nightly build. I’d be happy to help troubleshoot further and share here if necessary.

Sure, have disabled the 50% ARC limit on my SCALE system. If you can confirm the correct way to disable swap entirely (if it’s not this post) I can reboot the system afterwards and test for another day or two.

Have been writing to disk for the last 6 hours with swap disabled, didn’t hit an OOM condition, and ARC looks like it has been resizing itself properly, which is great. No sparks or explosions to speak of.

I just tested this and it doesn’t appear to disable swap after a reboot (which makes sense, as I believe changing this setting only affects new disks; correct me if I’m wrong!). I’d just run swapoff -a as an init script instead.
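
For anyone wanting to copy this, a quick sketch of what I mean (the verification commands are just an extra sanity check, not required):

# Turn off all active swap devices right now:
swapoff -a

# Verify nothing is still listed as swap:
swapon --show
free -h

Running the same swapoff -a as a post-init script should re-apply it after every reboot.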

1 Like

Hmmm, I have been experiencing some puzzling issues with tdarr, but given I’m new to TrueNAS I thought it might have been a TrueNAS thing I just hadn’t worked out yet.

My setup is this: my main TrueNAS box holds the files to be transcoded and hosts a cache disk, and a separate box running openSUSE provides another 4 transcoding nodes, which also transcode to that cache disk over SMB / NFS. I have been getting random transcode failures with not much to go on in the tdarr logs. I moved from SMB to NFS to no avail. Repeat attempts at transcoding then succeed for some of the files while others still fail, until the next run, which is another lottery.

Reading the above, I realise I should have checked the system logs. This issue is extremely repeatable, so if you think this is likely the same bug (how do I check?) then I’m happy to help.

I haven’t noticed any GUI issues and I have 96GB RAM. I did notice one VM wouldn’t start because RAM usage was high under the new “ZFS eats all the RAM” design (to match TrueNAS CORE), which was annoying.

An indicator would be if you’re seeing an unusually high amount of system swapping while a good amount of free memory is available.
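
A rough way to check that from a shell (nothing TrueNAS-specific assumed):

# Overall memory and swap usage:
free -h
swapon --show

# Ongoing swap activity: watch the si/so columns, sampled every 5 seconds:
vmstat 5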

Realistically you should be able to ignore that VM warning if the majority of your memory is in use by ARC, as it should resize itself as needed, though I’ve found it one of the best ways to reproduce the memory/swap issues mentioned above on a system without a fix applied ;).

If you’re looking to test, try disabling swap as suggested by iX and see if that fixes the issue.

Cheers for that.

After seeing this link, I’ve gone ahead and run the command and added it as a post-init script.

I’ve just gone and rebooted my SCALE host, and it looks to persist OK. Will report back in a couple of days. It’s worth noting, though, that up until this point the system had been running for >2 days entirely without any issues.

After this reboot just now, the ARC limit on my system (previously 50%) is gone, so this will now be a test with only swap disabled. I’m wondering if at this point iX could just adjust the ZFS RAM behaviour to limit ARC to 75% instead and call it a day.

1 Like

After deeper digging into the problem, it looks to me to be caused not only by the increased ARC size (reducing which, according to some people, may not fix the problem), but also by the Linux kernel update to 6.6, which enabled the previously disabled Multi-Gen LRU code. That code is written in a way that assumes memory cannot be significantly used by anything other than the page cache (which does not include ZFS). My early tests show that disabling it, as it was in Cobia, with echo n >/sys/kernel/mm/lru_gen/enabled may fix the problem without disabling swap.

10 Likes

:point_up:

This would be the best long-term solution.

It’s good to allow the system to swap for those extreme situations where memory demands are too high, and it needs a safety cushion to prevent a system crash or OOM.

Even though I’m on Core (FreeBSD), I don’t disable swap, even if it’s never (or rarely) needed. The cost is inconsequential, yet it could save my system in those rare (but possible) situations.

Best of both worlds for SCALE: ARC behaves as it was meant to, without restriction and without causing system instability, yet there is swap available if it’s ever needed.

@mav: To complement this fix (disabling lru_gen), wouldn’t it also make sense to set swappiness=1 as the default for SCALE?

3 Likes

The swappiness=1 setting is quite orthogonal to this issue. It balances page cache eviction between swapping anonymous pages and writing back file-backed pages, while the problem here is the balance between the page cache in general and ARC, which this tunable does not affect. That is why it did nothing for this issue by itself when it was tried.

2 Likes

Hence, as something additional to complement the changes/fixes you’ve mentioned. (Since even @kris would go so far as to disable swap entirely, regardless of anything else.)

I think swappiness=1 would make for an ideal default[1], with the above lru_gen fix.


  1. As opposed to leaving it at the default of 60, which I don’t believe anyone considers proper for a NAS. ↩︎
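
For anyone wanting to try that locally in the meantime, a minimal sketch of checking and temporarily changing the tunable (runtime only; it reverts on reboot, and persisting it on SCALE is a separate question):

# Current value (the upstream Linux default is 60):
sysctl vm.swappiness

# Set swappiness=1 for this boot only:
sysctl -w vm.swappiness=1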

Formally, I could say that some never-used memory may be worth swapping out to free up memory for better use. But I have no objections.

2 Likes
root@titan[~]# cat /sys/kernel/mm/lru_gen/enabled
0x0007
root@titan[~]# echo n >/sys/kernel/mm/lru_gen/enabled
root@titan[~]# cat /sys/kernel/mm/lru_gen/enabled    
0x0000

Does that look right?

And I assume this should actually be done as a pre-init script?

EDIT: Reading the docs,

https://docs.kernel.org/next/admin-guide/mm/multigen_lru.html

it appears that "echo 0" (zero) would be more correct.
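
To make it persistent, a sketch of a small script that could be run as an init script (assuming it runs as root; the 0 value is per the kernel docs above, and the session earlier in the thread shows echo n producing the same 0x0000 result):

#!/bin/sh
# Disable Multi-Gen LRU for this boot by clearing all feature bits:
echo 0 > /sys/kernel/mm/lru_gen/enabled

# Confirm it took effect (expect 0x0000):
cat /sys/kernel/mm/lru_gen/enabled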

1 Like

I rebuilt a VM this morning with a fresh install of 24.04, restored my config, moved the HBA over with my datasets, and did this:

echo n >/sys/kernel/mm/lru_gen/enabled

All up and running fine at present. Before, it would fail in a matter of hours, so I should be able to tell pretty quickly. With swap turned off it worked fine for the last few days, so I’m interested to see how this goes. Will report back.

Thanks
CC

4 Likes

It’s survived 6 hours, and current performance is great. The GUI is responsive during large transfers, which it was not before, and htop isn’t showing any swap being used.

So far so good

CC

5 Likes

Has anyone else noticed that, since trying these fixes, Dragonfish still won’t use up to the ARC max? When I downgrade to Cobia it quite happily uses the whole 32GB (50%) and sticks to it, but when I either disable swap in Dragonfish or disable lru_gen, ARC starts high and then the system becomes very reluctant to use it at all.
[screenshot: ARC usage graph]

This is on a machine with 64GB of RAM

Another example

The sections with the flat line are when the system is running Cobia; the sections with ARC all over the place are Dragonfish.
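
For comparing actual numbers rather than graphs, a quick sketch of checking the live ARC size against its target and maximum (standard OpenZFS-on-Linux paths assumed):

# size = current ARC bytes, c = target, c_max = maximum:
awk '$1 == "size" || $1 == "c" || $1 == "c_max"' /proc/spl/kstat/zfs/arcstats

# Or, if available, the friendlier summary:
arc_summary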

2 Likes

Have applied this to my SCALE system just now after a fresh reboot. Will see how it goes. In case it’s of any relevance: after seeing the latest developments in this thread, I was in the process of posting back to the other thread and removing the previous ZFS RAM usage limit I’d put in place, but disabling swap seemingly resolved the issue entirely for my system; it has been stable for roughly 5 days now without issue.

I’ve since re-enabled swap, double-checked there are no ZFS RAM usage limits in place anymore, and will see how the system goes with this latest fix over the coming days.

I’m hoping that, from a retrospective or process-improvement standpoint, especially for major SCALE releases that involve things like large Linux kernel version jumps, this issue results in more thorough testing/QA or more investigation prior to future releases from iX. I won’t dwell on it since it looks like a fix is incoming, but I would like to reiterate that these issues were reported by people in multiple Jira tickets in the days (and in one case, weeks) prior to DF’s release.

Some of these tickets were, IMO, ignorantly closed as being a TrueCharts issue when that wasn’t the case. DF’s release should’ve been pushed back by 1-2 weeks while investigations occurred. This, coupled with Arc GPU drivers ultimately not being included in the release (a large factor in people’s desire to upgrade or migrate to DF), has marred what could’ve been the best SCALE release to date.

Here’s hoping ElectricEel is an electrifying one.

7 Likes

Very well said, @bitpushr. :clap:

Got any links to those Jira tickets where this was reported before we publicly released 24.04.0?

After more thinking, we’ve included both in the 24.04.1 release. They should be in the next Dragonfish nightly build.

4 Likes