I set up my TrueNAS system about half a year ago, and for a long time it ran without issues. For the last 2 months, the system has become almost unusably slow a few hours after a restart. The UI gets really slow (the apps screen takes up to 3 minutes to load), the shares become quite unresponsive, and the apps don't perform decently. The issue also occurred on a 24.xx.x version, but I don't remember exactly which one.
I'm using the following apps: Home Assistant, Faster Whisper, Eclipse Mosquitto, Ollama, Piper and a Satisfactory server. Having them all running doesn't even come close to using all of the RAM/CPU.
Your most likely problem is your disks. Your Seagate Barracuda drives are SMR, which is known to cause slowness, system instability and, in more severe cases, data loss.
It’s also recommended to run apps from a dedicated SSD pool, not an HDD-based one.
As in my comment, setting vm.swappiness to 1 fixed it. I can write to the NAS at a sustained 1 Gbit/s. Data loss is quite a severe issue, though. I'm planning to upgrade to 3x 20 TB (probably Seagate Exos or similar), and that will probably fix the issue.
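For anyone wanting to check the same setting, here is a minimal Python sketch that reads the current value from /proc/sys/vm/swappiness, the standard Linux procfs location. The function names are just illustrative, and it assumes you have shell access on the NAS. It only reads the value and doesn't change anything.

```python
from pathlib import Path

# Standard Linux procfs location for the vm.swappiness sysctl.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def parse_swappiness(text: str) -> int:
    """Parse the file content, which is a single integer followed by a newline."""
    return int(text.strip())


def read_swappiness(path: Path = SWAPPINESS_PATH):
    """Return the current vm.swappiness, or None if it cannot be read."""
    try:
        return parse_swappiness(path.read_text())
    except (OSError, ValueError):
        return None


if __name__ == "__main__":
    current = read_swappiness()
    if current is None:
        print("vm.swappiness is not readable on this system")
    else:
        print(f"vm.swappiness = {current}")
```

Changing the value at runtime is typically done with `sysctl -w vm.swappiness=1` as root. Whether that survives a reboot depends on how your system persists sysctl settings.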
Generally, the OS runs on a 128 GB SSD. Would it be possible to move the apps to that drive, or is that generally a bad idea?
Depends on how many apps you want to run, and whether you also plan to run a small VM (which should also run from SSDs, not HDDs).
Since normal SATA SSDs suffice for basic app and VM use, I’d get a pair of 240-512 GB SSDs and use a mirror for redundancy.
That makes sense. But the main issues I had actually went away. Usually, after 12h of uptime, the apps screen would take at least 10 seconds to load; now it's instant. Do you have any idea where I can start looking for other problems?
You say that running your apps doesn’t use all your RAM, but I see some potentially memory hungry apps on your list.
What does the RAM situation look like when it’s been 12 hours and the system has become slow?
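One way to capture that is a small Python sketch that parses /proc/meminfo (the standard Linux memory summary) and reports available RAM and used swap. The function names are made up for illustration. Keep in mind that on a ZFS system the ARC shows up as used memory, so a low "available" figure alone doesn't prove the apps are the cause.

```python
from pathlib import Path

MEMINFO_PATH = Path("/proc/meminfo")  # standard Linux procfs file


def parse_meminfo(text: str) -> dict:
    """Map /proc/meminfo field names to their numeric values (KiB for sized fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not "Name: value"
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_report(info: dict) -> dict:
    """Summarise total RAM, available RAM and used swap in MiB."""
    swap_used_kib = info.get("SwapTotal", 0) - info.get("SwapFree", 0)
    return {
        "mem_total_mib": info.get("MemTotal", 0) // 1024,
        "mem_available_mib": info.get("MemAvailable", 0) // 1024,
        "swap_used_mib": swap_used_kib // 1024,
    }


if __name__ == "__main__":
    try:
        report = memory_report(parse_meminfo(MEMINFO_PATH.read_text()))
    except OSError:
        print("/proc/meminfo is not readable on this system")
    else:
        for key, value in report.items():
            print(f"{key}: {value}")
```

Running it once right after boot and again when the system has become slow gives two snapshots to compare, in particular whether used swap has grown.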
Btw, those Barracudas are likely SMR, so if anything ever attempts long writes on those you have a high risk of running into the SMR bottleneck, potentially faulting the drives/pool.
Typically it comes up when people are in a faulted state already and need to resilver. That extended activity definitely has the potential of running into the SMR limits. I don’t know enough about the issue to say if a 500GB write is likely to run into any problem.
I've looked into the components needed for a sensible upgrade. If it's okay with you, I'd like your opinion on them and on what's most important:
3x 20 TB Seagate Exos X20 or X24, 800-1000€
32 GB of additional ECC RAM, 2x Kingston 16GB DDR4-2666 CL19 (KTD-PE426E/16G). The mainboard doesn't support ECC, but I bought those modules some time ago and want to match them.
I know not having ECC is quite problematic, but retrofitting it would require a new motherboard and a new case, given that I can't really find any ECC motherboards that are micro ATX with 244x210mm.
L2ARC won’t help you with that use case (only if you streamed the same movies/series over and over again).
After using your NAS for some time, drop to a shell or an SSH session and check with arc_summary what your ARC hit ratio looks like. If it’s above 90%, you do not need an L2ARC.
Edit:
My use case is very similar to yours: mostly media streaming, and I run around 30 apps. My ARC hit rate is 99.99%. In the very beginning I tried adding an L2ARC, but it literally did nothing for my use case.
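If you'd rather compute the number yourself than read it off arc_summary, here is a rough Python sketch. It assumes OpenZFS on Linux, where the raw counters are exposed in /proc/spl/kstat/zfs/arcstats as "name type data" rows, and it uses the cumulative `hits` and `misses` counters since boot. The function names and the 90% constant (taken from the rule of thumb above) are illustrative, and the figure may differ slightly from what arc_summary prints.

```python
from pathlib import Path

ARCSTATS_PATH = Path("/proc/spl/kstat/zfs/arcstats")  # OpenZFS on Linux kstat file
L2ARC_RULE_OF_THUMB_PCT = 90.0  # threshold quoted in this thread, not an official limit


def parse_arcstats(text: str) -> dict:
    """Collect 'name type data' rows into a name -> integer value mapping."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Header lines have a different field count or a non-numeric third column.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_hit_ratio(stats: dict):
    """Return hits / (hits + misses) as a percentage, or None without any accesses."""
    hits = stats.get("hits", 0)
    misses = stats.get("misses", 0)
    total = hits + misses
    if total == 0:
        return None
    return 100.0 * hits / total


if __name__ == "__main__":
    try:
        ratio = arc_hit_ratio(parse_arcstats(ARCSTATS_PATH.read_text()))
    except OSError:
        ratio = None
    if ratio is None:
        print("ARC statistics are not available on this system")
    else:
        verdict = "above" if ratio > L2ARC_RULE_OF_THUMB_PCT else "at or below"
        print(f"ARC hit ratio: {ratio:.2f}% ({verdict} the 90% rule of thumb)")
```

Since the counters accumulate from boot, the result is more meaningful after the NAS has been in normal use for a while.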