Every now and then, seemingly at random, my NFS shares lock up on both ends of the share: the TrueNAS box and whatever client is using it. On the client, the syslog is plastered with "RPC: Could not send backchannel reply error: -110", and on TrueNAS I am seeing hung task errors. This seems to coincide with moments of heavy bandwidth usage.
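For what it's worth, on Linux errno 110 is ETIMEDOUT, so the -110 suggests the reply timed out. To line the errors up with the bandwidth spikes, something like the rough sketch below could count the messages in a log. The regex is built only from the message quoted above, and the function name and the log path in the usage comment are my own assumptions, not anything TrueNAS ships.

```python
import re
from collections import Counter

# Assumed pattern, derived only from the error text quoted in this post.
BACKCHANNEL_RE = re.compile(
    r"RPC: Could not send backchannel reply error: (-?\d+)"
)

def count_backchannel_errors(lines):
    """Return a Counter mapping each reported error code to its line count."""
    counts = Counter()
    for line in lines:
        match = BACKCHANNEL_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts

# Usage (the path is an assumption; adjust for your client):
# with open("/var/log/syslog", errors="replace") as log:
#     print(count_backchannel_errors(log))
```

Running it over logs from a quiet day and a busy day should show whether the errors really cluster around heavy transfers.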
I have to reboot TrueNAS to get things going again, as it can’t even stop the NFS share service via the web UI. This is obviously not a pleasant solution, especially since the box then gets stuck waiting for processes to end for 5+ minutes.
Googling doesn’t reveal much, but it seems a kernel patch for this was merged recently, and kernel version 6.9.8 appears to include the fix.
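To check where a given box stands relative to 6.9.8, here is a minimal sketch that parses a kernel release string and compares it. The helper names are hypothetical, and the comparison is only a rough guide: a lower-numbered kernel could still carry the fix as a backport, which this check cannot detect.

```python
import platform
import re

def parse_kernel_version(release):
    """Extract (major, minor, patch) from a release string such as '6.9.8-foo'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    # A missing patch component (e.g. '6.12') is treated as 0.
    return tuple(int(part) if part else 0 for part in match.groups())

def is_at_least(release, minimum=(6, 9, 8)):
    """True if the release's numeric version is >= minimum (6.9.8 by default)."""
    return parse_kernel_version(release) >= minimum

# Usage: compare the running kernel against 6.9.8.
# print(platform.release(), is_at_least(platform.release()))
```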
My question is: do the TrueNAS devs ever backport these kinds of fixes to older kernels? Can I expect a new kernel, or is there anything else I can do to try to prevent this error from occurring? As it stands, this kind of defeats the whole point of using TrueNAS for my shares, since I obviously need them to be reliable.