Dragonfish swap usage/high memory/memory leak issues

Sure, I've disabled the 50% ARC limit on my SCALE system. If you can confirm the correct way to disable swap entirely (if it's not this post), I can reboot the system afterwards and test for another day or two.

I've been writing to disk for the last 6 hours with swap disabled, didn't hit an OOM condition, and it looks like the ARC has been resizing itself properly, which is great. No sparks or explosions to speak of.

I just tested this and it doesn't look like it disables swap after a reboot (which makes sense, as I believe changing this only affects new disks; correct me if I'm wrong!), so I'd just run swapoff -a as an init script.
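For anyone wanting to do the same, a minimal post-init sketch (assuming a standard SCALE shell environment; the verification commands are just the usual util-linux/procps tools):

```shell
#!/bin/sh
# Post-init script: deactivate every swap area listed in /proc/swaps.
swapoff -a

# Verify: swapon --show prints nothing when no swap is active,
# and free should report 0 total swap.
swapon --show
free -m
```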


Hmmm, I have been experiencing some puzzling issues with tdarr, but given I'm new to TrueNAS, I thought it might have been a TrueNAS thing I just hadn't worked out yet.

My setup is as follows: my main TrueNAS box holds the files to be transcoded and hosts a cache disk, while a separate box running openSUSE runs another four transcoding nodes, which also transcode to that cache disk over SMB/NFS. I have been getting random transcode failures with not much to go on in the tdarr logs. I moved from SMB to NFS to no avail. Repeat attempts at transcoding then succeed for some of the files while others still fail, until the next run, which is another lottery.

Reading the above, I realise I should have checked the system logs. This issue is extremely repeatable, so if you think this is likely the same bug (how do I check?), I'm happy to help.

I haven't noticed any GUI issues, and I have 96GB of RAM. I did notice one VM wouldn't start due to high RAM usage from the new "ZFS eats all the RAM" design (matching TrueNAS CORE), which was annoying.

An indicator would be if you're seeing an unusually high amount of system swapping with a good amount of free memory available.
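One quick way to check for that symptom (standard procps/util-linux tools, nothing SCALE-specific assumed):

```shell
# Swap in use despite plenty of available memory points at this issue.
free -h                                     # compare Swap "used" vs Mem "available"
grep -E 'SwapTotal|SwapFree' /proc/meminfo  # raw numbers straight from the kernel
vmstat 1 5                                  # sustained non-zero si/so = active swapping
```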

Realistically, you should be able to ignore that VM warning if the majority of your memory is in use by the ARC, as it should resize itself as needed. That said, I've found it's one of the best ways to reproduce the memory/swap issues mentioned above without a fix applied ;).

If you're looking to test, try disabling swap as suggested by iX and see if that fixes the issue.

Cheers for that.

After seeing this link, I’ve gone ahead and run the command as well as added it as a post-init script.

I've just gone and rebooted my SCALE host, and the change looks to persist OK. Will report back in a couple of days. It's worth noting that up until this point, the system had been running for >2 days entirely without any issues.

After this reboot just now, the ARC limit on my system (previously 50%) is gone, so this will now be testing with only swap disabled. I'm wondering if at this point iX could just adjust the ZFS RAM behaviour to limit it to 75% instead and call it a day.
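For anyone wanting to try a 75% cap themselves in the meantime, a rough sketch of the arithmetic, assuming 96 GiB of RAM and the standard zfs_arc_max module parameter (value in bytes):

```shell
# 75% of 96 GiB, in bytes; adjust the 96 for your own RAM size.
ARC_MAX=$((96 * 1024 * 1024 * 1024 * 3 / 4))
echo "$ARC_MAX"   # 77309411328

# Apply at runtime (needs root; lasts only until reboot unless set
# via an init script or as a ZFS module parameter):
# echo "$ARC_MAX" > /sys/module/zfs/parameters/zfs_arc_max
```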


After digging deeper into the problem, it looks to me like it's caused not only by the increased ARC size (whose reduction, according to some people, may not fix the problem), but also by the Linux kernel update to 6.6, which enabled the previously disabled Multi-Gen LRU code. That code is written in a way that assumes memory cannot be significantly used by anything other than the page cache (which does not include ZFS). My early tests show that disabling it as it was in Cobia with echo n >/sys/kernel/mm/lru_gen/enabled may fix the problem without disabling swap.



This would be the best long-term solution.

It’s good to allow the system to swap for those extreme situations where memory demands are too high, and it needs a safety cushion to prevent a system crash or OOM.

Even though I'm on Core (FreeBSD), I don't disable swap, even if it's never (or rarely) needed. The cost is inconsequential, yet it could save my system in those rare (but possible) situations.

Best of both worlds for SCALE: ARC behaves as it was meant to, without restriction and without causing system instability, yet there is swap available if it's ever needed.

@mav: To complement this fix (disabling lru_gen), wouldn’t it also make sense to set swappiness=1 as the default for SCALE?


The swappiness=1 setting is quite orthogonal to this issue. It balances page cache eviction between swapping anonymous pages and writing back file-backed pages, while the problem here is in the balance between the page cache in general and the ARC, which this tunable does not affect. That is why it did nothing for this issue by itself when tried.


Hence, as something additional to complement the changes/fixes you’ve mentioned. (Since even @kris would go so far as to disable swap entirely, regardless of anything else.)

I think swappiness=1 would make for an ideal default[1], with the above lru_gen fix.

  1. As opposed to leaving it at the default of 60, which I don't think anyone believes is proper for a NAS. ↩︎
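For anyone wanting to experiment with it anyway, the runtime knob is the standard Linux sysctl (nothing SCALE-specific assumed here):

```shell
# Read the current value (the upstream Linux default is 60):
cat /proc/sys/vm/swappiness

# Set it to 1 for the running system (needs root):
sysctl vm.swappiness=1
# or equivalently:
# echo 1 > /proc/sys/vm/swappiness
```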

Formally, I could say that some never-used memory may be worth swapping out to free up memory for better use. But I have no objections.

root@titan[~]# cat /sys/kernel/mm/lru_gen/enabled
root@titan[~]# echo n >/sys/kernel/mm/lru_gen/enabled
root@titan[~]# cat /sys/kernel/mm/lru_gen/enabled    

Does that look right?

And I assume this should actually be done as a pre-init script?

EDIT: reading the docs,


It appears that “echo 0” (zero) would be more correct


I rebuilt a VM this morning with a fresh install of 24.04, restored my config, moved the HBA over with my datasets, and did this.

echo n >/sys/kernel/mm/lru_gen/enabled

All up and running fine at present. Before it would fail in a matter of hours, so should be able to tell pretty quickly. Turning off swap it worked fine the last few days, so interested to see how this goes. Will report back.
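For anyone following along: the echo above only lasts until reboot, so to persist it I'd assume the same approach as swapoff earlier, i.e. a post-init script (the UI path in the comment is from memory, so double-check it on your system):

```shell
#!/bin/sh
# Post-init script (System Settings > Advanced > Init/Shutdown Scripts):
# disable Multi-Gen LRU, restoring the Cobia-era behaviour.
echo n > /sys/kernel/mm/lru_gen/enabled

# Sanity check: per the kernel docs, 0x0000 means all MGLRU
# components are off.
cat /sys/kernel/mm/lru_gen/enabled
```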



It's survived 6 hours, and current performance is great. The GUI is responsive during large transfers, which it was not before, and htop is not showing any swap being used.

So far so good



Has anyone else noticed that, since trying these fixes, Dragonfish still won't use up to the ARC max? When I downgrade to Cobia it quite happily uses the whole 32GB (50%) and sticks to it, but when I either disable swap in Dragonfish or disable lru_gen, the ARC starts high and then the system becomes very reluctant to use it at all.

This is on a machine with 64GB of RAM
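For comparing runs, the raw numbers can be pulled from arcstats (the path below is the standard one for ZFS on Linux; this just reads the current size against the target and the max):

```shell
# Print current ARC size, target (c), and maximum (c_max) in GiB.
awk '$1 == "size" || $1 == "c" || $1 == "c_max" {
    printf "%-6s %.1f GiB\n", $1, $3 / (1024^3)
}' /proc/spl/kstat/zfs/arcstats
```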

Another example

The sections with the flat line are when the system is running Cobia; the sections with ARC all over the place are Dragonfish.


Have applied this to my SCALE system just now after a fresh reboot. Will see how it goes. In case it's of any relevance: after seeing the latest developments in this thread, I was in the process of posting back to the other thread and removing the previous ZFS RAM usage limit I'd put in place, but disabling swap seemingly resolved the issue entirely for my system; it has been stable for roughly 5 days now without issue.

I've since re-enabled swap, double-checked there are no ZFS RAM usage limits in place anymore, and will see how the system goes with this latest fix over the coming days.

I'm hoping that, from a retrospective or process-improvement standpoint, especially for major SCALE releases that see things like large Linux kernel version jumps, this issue results in more thorough testing/QA or more investigation being done prior to future releases from iX. I won't dwell on it since it looks like a fix is incoming, but I would like to reiterate that these issues were reported by people in multiple Jira tickets in the days (and in one case, weeks) prior to DF's release.

Some of these tickets were, IMO, ignorantly closed as being a TrueCharts issue when that wasn't the case. DF's release should've been pushed back by 1-2 weeks while investigations occurred. This, coupled with Arc GPU drivers ultimately not being included in the release (a large factor in people's desire to upgrade or migrate to DF), has marred what could've been the best SCALE release to date.

Here’s hoping ElectricEel is an electrifying one.


Very well said, @bitpushr. :clap:

Got any links to those Jira tickets where this was reported before we publicly released 24.04.0?

After more thinking, we've included both in the 24.04.1 release. They should be in the next Dragonfish nightly build.


Hindsight is always 20/20. This isn’t criticism of anyone, nor iXsystems in general.

I think the lesson is that there’s probably good reason to investigate further when performance issues arise when upgrading from one version to the next, especially when it’s a major kernel version upgrade.

It’s hard for iX staff to reproduce, since a handful of people cannot emulate the different workloads or “eyes” of the community users.

Just as an example, what might alert someone to a problem is they notice their drives are “noisier” than they remember, or their CPU is more “active” than normal. This can be brushed aside by someone else as “Oh, that’s normal. You only noticed it recently.”

Looking back at these now, it’s likely they were interrelated to the RAM/Swap/ARC/LRU issue that was introduced with the newer Linux kernel.

No one’s a psychic.

It's just that when someone is "familiar" with the behavior of their own system, it might be a canary in a coal mine for someone with more expertise to investigate further. Especially if there's a common denominator (such as Dragonfish RC1), and it's more than just one person reporting it, across different types of systems.


Twenty minutes to import your pools? Must be TrueCharts. At least they didn’t close my ticket on that.