I happened to look at the display output of my TrueNAS server and there was a message from the kernel that Tailscale had been OOM killed. In this case it is my fault because I had set a rather low memory limit in the apps menu, but if I hadn’t looked at the console I would not have noticed. I think OOM kills, or more generally any abnormal exit of a process in a container, are something TrueNAS should tell me about so I can evaluate whether it is a problem and whether I need to take any action.
Impact
I think providing more info about error conditions is almost always a benefit. However, it is possible that this will cause some spurious alerts. I don’t know of specific cases, but perhaps there is an app out there that can use excess memory to improve performance without requiring it, and relies on being OOM killed to learn that it should run in a lower-memory mode.
User Story
A user has installed an app with a memory limit. They get an alert that some process within the app was killed due to the limit being exceeded, and now they know that the limit should be raised or perhaps more memory installed in the server.
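As a rough illustration of how such an alert could be detected, here is a minimal Python sketch. It assumes the app's container runs under cgroup v2, where the kernel exposes an `oom_kill` counter in the cgroup's `memory.events` file. The function names and the `cgroup_dir` argument are hypothetical, and this is not how TrueNAS itself does (or would do) it; it only shows the idea of comparing two snapshots of the counter.

```python
from pathlib import Path


def parse_memory_events(text: str) -> dict[str, int]:
    """Parse the 'key value' lines of a cgroup v2 memory.events file."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            events[parts[0]] = int(parts[1])
    return events


def new_oom_kills(previous: dict[str, int], current: dict[str, int]) -> int:
    """Return how many OOM kills were recorded between two snapshots."""
    return max(0, current.get("oom_kill", 0) - previous.get("oom_kill", 0))


def read_memory_events(cgroup_dir: str) -> dict[str, int]:
    # cgroup_dir is a hypothetical argument: the real path depends on how
    # the container runtime lays out its cgroups on the host.
    return parse_memory_events((Path(cgroup_dir) / "memory.events").read_text())


if __name__ == "__main__":
    # Made-up snapshot contents, used only to exercise the functions.
    before = parse_memory_events("low 0\nhigh 0\nmax 12\noom 0\noom_kill 0\n")
    after = parse_memory_events("low 0\nhigh 0\nmax 40\noom 1\noom_kill 1\n")
    kills = new_oom_kills(before, after)
    if kills:
        print(f"ALERT: {kills} process(es) OOM killed since the last check")
```

A periodic check like this would only need to remember the previous counter value per app, so the alert fires once per new kill rather than on every poll.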
I would think that whether you watch Available memory getting low, ARC getting low, or Memory Used getting too close to the total RAM, working out whether you are running out of RAM is still a minor calculation. The harder part is deciding what the defined limit for triggering the alert should be.
I ran free -h on my system just now (I already had SSH open):
free -h
              total        used        free      shared  buff/cache   available
Mem:          125Gi        97Gi        19Gi        47Mi        10Gi        28Gi
Swap:            0B          0B          0B
The free column in the table is memory that is completely idle. Low free memory is not a problem, and on TrueNAS it is normal; available memory is usually the number to be concerned about. This system has no issues running anything, and TrueNAS’s memory allocator handles everything well, balancing as needed. On a different system with less total memory I have 12Gi available. If apps use more, it is pulled from the cache allotment.
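To make the free versus available distinction concrete, here is a small Python sketch that reads the `MemTotal` and `MemAvailable` fields in the format used by `/proc/meminfo` and computes the available fraction. The function names and the 5% default threshold are made up for illustration; picking a sensible threshold is exactly the open question.

```python
def parse_meminfo(text: str) -> dict[str, int]:
    """Parse /proc/meminfo-style text into {field: value as reported}.

    Most fields are reported in kB (really KiB); a few are plain counts.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


def available_fraction(info: dict[str, int]) -> float:
    """Fraction of total RAM the kernel estimates is available to new work."""
    return info["MemAvailable"] / info["MemTotal"]


def is_low_memory(info: dict[str, int], min_fraction: float = 0.05) -> bool:
    # min_fraction is a hypothetical threshold, not a TrueNAS default.
    return available_fraction(info) < min_fraction


if __name__ == "__main__":
    # Sample built from the rounded free -h numbers above:
    # 125Gi total, 19Gi free, 28Gi available, converted to KiB.
    sample = (
        "MemTotal:       131072000 kB\n"
        "MemFree:         19922944 kB\n"
        "MemAvailable:    29360128 kB\n"
    )
    info = parse_meminfo(sample)
    print(f"available: {available_fraction(info):.1%}, low: {is_low_memory(info)}")
```

On a real system the text would come from reading `/proc/meminfo`. Note that the check deliberately ignores `MemFree`, since low free memory alone is expected.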
I would think it is up to the user to give the app the memory it needs, or at least the minimum of 4096MB, and to go up from there if the particular app recommends it or experience shows it is needed.
Memory can be overcommitted, just as thin provisioning can overcommit storage space, but in that case it is the user’s responsibility to understand that they have overcommitted the memory. I don’t see how an alert threshold could be set. What would you use? Both of my systems above have quite different amounts of free memory, but neither has any issues. If the system is set up and functioning properly, it should be fine. If the app crashed, it is likely that not enough memory was allocated to the app to begin with and it hit the limit that was set.
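For what it is worth, checking for overcommitment is simple arithmetic. The sketch below, with a hypothetical function name and made-up app limits, sums the per-app memory limits and compares them with the total RAM. A ratio above 1.0 means the limits add up to more than the machine has; a ratio below 1.0 still does not guarantee headroom, because the OS and the ARC need memory as well.

```python
def overcommit_ratio(app_limits_mib: dict[str, int], total_ram_mib: int) -> float:
    """Sum of per-app memory limits divided by total RAM (both in MiB)."""
    if total_ram_mib <= 0:
        raise ValueError("total_ram_mib must be positive")
    return sum(app_limits_mib.values()) / total_ram_mib


if __name__ == "__main__":
    # Hypothetical app names and limits, for illustration only.
    limits = {"app-a": 4096, "app-b": 8192, "app-c": 4096}
    ratio = overcommit_ratio(limits, total_ram_mib=16384)
    print(f"limits add up to {ratio:.0%} of RAM")
```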