Going insane, TrueNAS webUI dies, again

It’s odd that their systems are swapping 1 - 2 GiB at any time. Why would a 128 GiB system even swap at all? You almost never saw this with Core.

My 128GB Cobia system had 700MB of swap last night.

I was running a qemu-img convert on a 1 TB image. ARC was full at 50%, and services filled right up to the remaining 50% minus about 1 GB. Free memory was fluctuating at about 800-1200 MB.

Img → zvol
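For context, the conversion was roughly along these lines; the pool and zvol names here are only placeholders, not the actual paths:

    # create a sparse 1 TB zvol to receive the image (example names)
    zfs create -s -V 1T tank/vm/disk0

    # convert the source image straight onto the zvol's block device
    qemu-img convert -p -O raw /mnt/tank/images/source.qcow2 /dev/zvol/tank/vm/disk0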

I think it's a fairly reproducible workload which seems to peg a couple of cores (44 threads on this system). But it also seems to use up all the free RAM and hit ARC and I/O heavily.

A little bit of swap can be typical. I regularly saw a few hundred MB swapped out before I disabled swap altogether on my systems. There was no real memory pressure forcing it, and running swapoff -a on my system worked fine to bring those pages back into memory. I suspect Linux tends to be more aggressive in swapping out inactive things, but I only have anecdotal evidence to back that up :slight_smile:
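For anyone who wants to do the same sanity check, it was nothing fancier than the standard tools; roughly:

    # see how much swap is in use and which devices back it
    free -h
    swapon --show

    # pull swapped pages back into RAM (only sensible if enough memory is free)
    swapoff -a

    # turn swap back on afterwards if you're not disabling it for good
    swapon -a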

You guys use the default swappiness of 60. Found here:

/proc/sys/vm/swappiness

That is a somewhat aggressive swapout setting.
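If you want to see or temporarily change it on a running system, it's just the usual sysctl; the value below is only an example:

    # read the current value (60 is the default mentioned above)
    cat /proc/sys/vm/swappiness

    # lower it for the running system; this does not persist across reboots
    sysctl vm.swappiness=10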

Of course. It may not be the same problem at all, or it could be. The point is to bring every possible symptom together to see if they correlate, so we can find the root cause while preserving features like dynamic ARC.

Seems we are in the same boat: large amounts of data transfer and IO eventually bring the issue to the surface. For me it was about 80 TB read/write, and it happened 3~4 times.

Hi, I have been experimenting with my limit; so far, at around 75~80% in Dragonfish, no issue yet. The default behavior hit the issue 3~4 times in 5 days. So your 70% from 22, 23, 24 might not cause issues. It's possible that people who have it over 50% but stayed somewhat conservative, like 75~80% or below, are still free from this issue.

The percentage for any given system is based on its workload. For me, I know what memory the system uses other than ZFS ARC, and what the max is, over the course of a week or month. Based on that number, the rest can be ARC, and I subtracted a little to be conservative. And I kept my swap just in case. But another person with different usage might find mine too high; it all just depends. But yes, setting a hard limit appears to work better for some; hopefully they can figure out why, as the ideal state is how it works on Core. Basically, on all Debian/Ubuntu systems this is what you do: tune it yourself (unless the latest release changed that for them). AFAIK, only TrueNAS has decided to see if OpenZFS can manage itself without a limit (well, RAM minus 1 GB).
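For what it's worth, the hard limit itself is just the stock OpenZFS zfs_arc_max module parameter; something like this, where the 96 GiB figure is purely an example and the right number depends on your own workload:

    # check the current ARC ceiling (0 means ZFS chooses its own default)
    cat /sys/module/zfs/parameters/zfs_arc_max

    # cap ARC at 96 GiB (96 * 1024^3 bytes) for the running system
    echo 103079215104 > /sys/module/zfs/parameters/zfs_arc_max

It doesn't survive a reboot on its own, so you'd want it in whatever init script or tunable mechanism you already use.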

This guy shows that the issue is actually broader than poor performance and can be generalized to excessive swap use on the boot disk, and that's even with swappiness set to 1.

Along with that, once you start looking at excess boot disk load, this guy found that his M.2 boot disk was seeing so much of it (with the Dragonfish beta, as it turns out) that he had to put a heatsink on it!

And this guy found swap performance so bad under Dragonfish that he disabled it.

I understand. This was in response to the iX guy who wondered why stuff was on swap. Swap is actually a good thing (see the other thread). It's bad when things are swapping in and out constantly, and as you noted that can be really bad. Swap can save you from OOM errors with ZFS, for example; search for OOM on the OpenZFS GitHub and you'll see. If you follow the Reddit ZFS sub, you'll see examples come up there too. I do not want OOM; that's fatal. So I will always keep swap.

You can set it to 0 (which does not mean never swap, but close); I've been adjusting that for 10 years and it does make a difference on Debian/Ubuntu. My comments were not about whether that was a solution to this issue; they were to address the why.
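On plain Debian/Ubuntu, the persistent way I've always done it is a sysctl drop-in, roughly like this (the file name is arbitrary):

    # write a persistent override (survives reboots)
    echo 'vm.swappiness = 0' > /etc/sysctl.d/99-swappiness.conf

    # apply it immediately without rebooting
    sysctl --system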

Your example of Kris was not someone with performance "so bad". He actually said the opposite elsewhere, perhaps on Reddit; he indicated it wasn't an issue for him, but he wondered why.

I think this is a saner solution than outright disabling swap. The last thing someone wants their NAS server to do is crash or have applications fail due to OOM.

Perhaps my tongue was not firmly in my cheek enough.

The 2nd guy was me btw.

But setting it to zero does (according to what I found) disable swapping, causing OOM.

And setting it to 1 doesn’t work.

Why can’t we just have nice things? :cry:

That's what I was trying to say (eliminating all swap space is typically a bad idea), but as always I never communicate terribly well. I know people from FreeBSD think of swap differently, and for good reason! As a mitigation it might be OK until they find the cause of this webUI/TrueNAS issue; I probably would set swappiness to 0 (or maybe swapoff) (now that I know they changed it) if it solved my issues until it's fixed (but with a smaller ARC, as I don't like risk). I would re-enable it once fixed, though. I definitely wouldn't eliminate the partitions. But hey, it doesn't affect me, as I never load the x.0 releases anyway. I see they changed what 0 swappiness means now! I'm too old. Everything always ends up "in the old days…".

Setting it to 0 should not eliminate all OOM, since you can actually hit OOM with no swap at all. Maybe it solved some issues, though. The way I think of it, I just don't want a system crashing due to OOM. That's bad, and can even cause corruption.

Setting it to 1 SHOULD reduce any use of swap space (unless you are undersized on memory, of course), making it less likely to have any swap in use. But it's probably not terribly useful for this particular problem. I always set swappiness to less than 60 in the old days on Debian and Ubuntu.

But at least I learned something new today! Can't teach an old dog new tricks, huh? Here's to hoping iX can find the answer as to what is actually happening (whether it's 1 thing or several things) so everything can go back to normal for those who updated and have the problem. :crossed_fingers:

What does everyone think of this?

I use zram on my Raspberry Pi 5 desktop. I have 32 GB of zram and 8 GB of RAM, and it's fast (an advantage of not using an overpowered machine is that it's good for debugging, or for seeing how well something actually performs). No idea how it interacts with ZFS; I never tried it. I have been impressed with zram. I guess for low-memory machines it might be a loss, since it takes away some RAM. For larger-memory machines it's percentage-wise a tiny amount, so who cares. Not sure about in the middle, but it's an interesting idea. It's for sure FAST.
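For the curious, the Pi setup is just generic zram-as-swap, nothing TrueNAS-specific; roughly this, with sizes and the compression algorithm depending on your kernel and machine:

    # load the module and size a compressed swap device in RAM
    modprobe zram
    echo zstd > /sys/block/zram0/comp_algorithm   # fall back to lz4/lzo if zstd isn't available
    echo 32G > /sys/block/zram0/disksize          # uncompressed capacity; RAM used depends on compression

    # format it and enable it as high-priority swap
    mkswap /dev/zram0
    swapon -p 100 /dev/zram0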

But if nothing ever gets “swapped”, the zram remains at 0% size. It only dynamically grows/shrinks based on swapping in and out. (As far as I understand.)

Yes, I believe that is correct. I'm just saying that if you do need to use the swap, it might be worse for low-memory machines (as far as this problem goes). Not sure; I didn't ponder it much. But that is how zram works.

Wow, I learned 2 things in one day. I just looked, and the Arch wiki says one of the most common uses is as swap. I never used it as swap; I use it more for in-memory storage, like caches and the like. I have to quit learning today; I can only take so much anymore. :older_adult:

Great, we have a solution. Now, for newbies, how do you proceed to set swappiness to 1, exactly?

Can you link the source where setting swappiness to 0 causes OOM? From what I found, swappiness of 0 just means it won't swap unless absolutely necessary and the system is running out of physical RAM; it should still start swapping if the system deems physical memory exhausted.