Hi All,
Looking for a little help to understand where I am off base. The other day I found my TrueNAS server completely locked up and requiring a hard reboot. I am running Core 13U2. I have two Proxmox VMs that mount an NFS share for the user home directories. After reviewing the logs, it looks like any I/O activity in the mounted share (even something as small as an ls command) could cause an error in /var/log/messages on the NAS. The error was “nfsrv_cache_session: no session ipaddr=192.168.1.103”, followed immediately by a second line with the same message but the IP ending in 104. If I did anything with heavy I/O, like copying large files, I could make the error spew into the logs and eventually crash TrueNAS. I tested this a few times using dd, with the same results. If one of the VMs was off, there were no issues and no errors. Some details:
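In case it helps anyone check their own logs, here is a minimal Python sketch for tallying that message per client IP. The regex is based only on the message text quoted above, the function name is just something I made up, and the sample lines are the two messages from my log without their syslog prefix, so treat it as a rough starting point rather than a proper parser.

```python
import re
from collections import Counter

# Matches the message text quoted above; the surrounding syslog prefix
# (timestamp, host, etc.) is ignored because search() scans the whole line.
PATTERN = re.compile(
    r"nfsrv_cache_session: no session ipaddr=(\d{1,3}(?:\.\d{1,3}){3})"
)

def count_session_errors(lines):
    """Return a Counter mapping client IP -> number of matching log lines."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

# Illustrative input built from the two messages quoted in this post.
sample = [
    "nfsrv_cache_session: no session ipaddr=192.168.1.103",
    "nfsrv_cache_session: no session ipaddr=192.168.1.104",
    "some unrelated line",
]
print(count_session_errors(sample))

# Against a real log (path assumed to be /var/log/messages on the NAS):
# with open("/var/log/messages") as f:
#     print(count_session_errors(f))
```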
Both VMs were built exactly the same way, but with different IPs, FQDNs, MACs, and UUIDs. They were not Proxmox clones. After reading a bit, I see that even though the FQDN and IP are different, the hostid also has to be different. My current understanding of hostid is that, if there is no /etc/hostid file, it grabs the first IP in the /etc/hosts file and dumps that to hex, so in my case, since both VMs were provisioned via DHCP from my Foreman, they both had a hostid equal to the hex form of “127.0.0.1”. So I can understand that this would cause errors, based on the TrueNAS docs. But here is the question: why does just changing the hostname resolve the issue? Each one was:
VM1: bastian01.dev.local.io (ip = 192.168.1.103)
VM2: bastian01.home.local.io (ip = 192.168.1.104)
How does changing only the hostname to “bastian01-dev” resolve the errors and the crashing? I searched for the exact error a bit but did not get many hits.
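For reference, here is a small Python sketch of how I believe the hostid fallback works, so others can correct me if it is wrong. It assumes glibc on a little-endian Linux box with no /etc/hostid file, where (as far as I can tell) gethostid() takes the IPv4 address the hostname resolves to and swaps its upper and lower 16 bits before printing it as hex. The function name is my own, not a real API.

```python
import socket
import struct

def hostid_from_ipv4(addr: str) -> str:
    """Approximate the glibc hostid fallback for a given IPv4 address.

    Assumption: little-endian host, no /etc/hostid, and the hostname
    resolves to `addr`.
    """
    packed = socket.inet_aton(addr)           # 4 bytes in network order
    (s_addr,) = struct.unpack("<I", packed)   # how a little-endian host sees in_addr.s_addr
    swapped = ((s_addr << 16) | (s_addr >> 16)) & 0xFFFFFFFF  # swap 16-bit halves
    return f"{swapped:08x}"

# Both VMs resolving their hostname to loopback would give the same value.
print(hostid_from_ipv4("127.0.0.1"))
print(hostid_from_ipv4("192.168.1.103"))
print(hostid_from_ipv4("192.168.1.104"))
```

If that is right, two machines whose hostname resolves to 127.0.0.1 end up with an identical hostid, while resolving to their real addresses would give distinct values.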
Thank you in advance. While the change I made resolved the issue, I would really like to understand where I am off in my assumptions.
*Side note: fixing the issue above also fixed some I/O stalls I was seeing on my Proxmox nodes. So apparently my misconfiguration was causing performance issues not only on the VMs that used the NFS share for home directories, but also on the other VMs hosted in Proxmox that use a different NFS share for backend VM storage.