TrueNAS Scale Resource consumption suddenly spiking for days until restart

Hello everybody,

I’m new to Scale and have been using it since the beginning of the year. I recently noticed high, spiky power consumption in my power meter app and thought I could easily fix it by restarting Scale. After the restart, consumption went back down and stayed minimal, as expected.

I’m now trying to get to the bottom of this, since it happened again 2 days ago and I only noticed because I decided to check my power consumption again. Otherwise it would probably have continued forever. This started on Thursday, May 16, 2024 at about 22:00 and I restarted on Sunday, May 19, 2024 at 01:00.

Since I recently updated to Dragonfish Stable (which might be part of the issue), I’ll attach a Netdata snapshot of the time period, as it might provide some useful insights.

I was able to narrow it down to what I believe is some internal process that spikes and never comes down until a system restart, along with my SMB share staying active. Since that would be kinda scary, I looked at network traffic, but it stayed minimal, so I suspect the cause is internal.

There also don’t seem to be any significant writes to storage, so this issue seems to be entirely in memory. The spike pattern only really appears in CPU processes, temperature, and the ZFS ARC, which suddenly gets a lot of hits. There were no active scrub, SMART or snapshot tasks; those run on Sundays at 00:00.
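For anyone who wants to watch the ARC hit counters outside of Netdata, here is a minimal sketch. It assumes the counters are exposed at /proc/spl/kstat/zfs/arcstats (the usual location for OpenZFS on Linux) and that the rows are named "hits" and "misses"; the function names are my own and not part of any TrueNAS tool.

```python
import time
from pathlib import Path

# Assumed location of the ARC counters for OpenZFS on Linux.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text):
    """Return {name: value} from kstat text, skipping the header lines."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows look like "<name> <type> <value>".
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def hit_delta(before, after):
    """Hits and misses accumulated between two snapshots."""
    return after["hits"] - before["hits"], after["misses"] - before["misses"]


def sample_arc(interval=10):
    """Take two snapshots `interval` seconds apart and return the deltas."""
    first = parse_arcstats(ARCSTATS.read_text())
    time.sleep(interval)
    second = parse_arcstats(ARCSTATS.read_text())
    return hit_delta(first, second)
```

Calling sample_arc() once while the system is idle and once during a spike should show whether the hit rate really jumps.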

While I can grasp some of what ZFS is doing, I really haven’t had the time to dig into how it actually works, and my issue is that it’s really hard to figure out what exactly is happening. You probably could if you knew all the commands available in the Scale Shell, but sadly I don’t, and finding documentation on what exactly I can do to troubleshoot isn’t as easy as you would hope.
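One way to see which process is burning CPU during a spike, sketched here under the assumption that a Python interpreter is available in the Shell: compare the CPU ticks in /proc/<pid>/stat between two snapshots. The helper names are made up for illustration; only the /proc layout (utime and stime as fields 14 and 15) comes from the Linux proc documentation.

```python
import time
from pathlib import Path


def parse_stat(line):
    """Return (comm, utime + stime in clock ticks) from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces.
    comm = line[line.index("(") + 1 : line.rindex(")")]
    rest = line[line.rindex(")") + 2 :].split()
    # rest[0] is field 3 (state), so utime/stime (fields 14/15) are rest[11]/rest[12].
    return comm, int(rest[11]) + int(rest[12])


def snapshot():
    """Map pid -> (comm, ticks) for every process and kernel thread."""
    ticks = {}
    for entry in Path("/proc").iterdir():
        if entry.name.isdigit():
            try:
                ticks[int(entry.name)] = parse_stat((entry / "stat").read_text())
            except (OSError, ValueError, IndexError):
                continue  # process exited while we were reading
    return ticks


def busiest(before, after, top=5):
    """Processes with the largest tick increase between two snapshots."""
    deltas = [
        (t - before[pid][1], pid, comm)
        for pid, (comm, t) in after.items()
        if pid in before
    ]
    return sorted(deltas, reverse=True)[:top]


def sample(interval=5, top=5):
    """Snapshot, wait `interval` seconds, snapshot again, return the top consumers."""
    first = snapshot()
    time.sleep(interval)
    return busiest(first, snapshot(), top)
```

Printing sample() during a spike lists (ticks, pid, name) tuples, kernel threads included, which should at least narrow down where the load comes from.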


32GB DDR4 3200MHz CL16
128GB OS Drive Mirrored
1TB Highspeed SSD Storage Mirrored
2TB SATA SSD Storage Mirrored

Thanks for the help and explanations in advance :slight_smile:

This issue might be related, as it fits my bias towards some weird stuff happening with processes, but might not be at all.

Disable lru_gen and see if your problems go away.
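In case it helps, a minimal sketch of how that can be checked and switched off, assuming the kernel exposes the multi-gen LRU switch at /sys/kernel/mm/lru_gen/enabled as described in the Linux kernel admin guide. The function names are my own; writing to the file needs root and does not survive a reboot.

```python
from pathlib import Path

# Sysfs kill switch for the multi-gen LRU (path from the kernel admin guide).
LRU_GEN = Path("/sys/kernel/mm/lru_gen/enabled")


def lru_gen_active(raw):
    """True if the main switch (bit 0x0001) is set in the value read from sysfs."""
    return bool(int(raw.strip(), 16) & 0x1)


def disable_lru_gen(path=LRU_GEN):
    """Write 'n' to turn all lru_gen components off (root only, lost on reboot)."""
    path.write_text("n")
```

Reading the file before and after, for example lru_gen_active(LRU_GEN.read_text()), confirms whether the change took effect. To keep it across reboots, the same write would have to be repeated at startup.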

Will try that, thanks.

Since it’s not really that easy to reproduce (I have a theory though), I’ll update in 1-2 days on whether anything has changed or the issue has returned.

So far I haven’t been able to observe the issue again, though that might just be because it needs more time to show up, as the trigger is hard to pinpoint. Will update again if anything changes.
