NFS session kernel hang

Hi, we have a production server that was updated to the latest version 24 release. We updated the server from Bluefin to Cobia and from Cobia to Dragonfish. All updates went through without issues.
After that, on 24, we ran into a big issue with the NFS server on TrueNAS.
The hardware is an HP Enterprise MicroServer G10 with 16GB RAM.
All VMs on the Proxmox system went offline because they lost the NFS connection to the server.
We have now restarted the server again, but it could not stop the nfsd process, so we did a hard reset. After this we booted TrueNAS with the latest kernel version from the 23 release. Maybe this will fix the issue temporarily.

Jul 15 06:26:03 truenas kernel: Call Trace:
Jul 15 06:26:03 truenas kernel:
Jul 15 06:26:03 truenas kernel: __schedule+0x349/0x950
Jul 15 06:26:03 truenas kernel: schedule+0x5b/0xa0
Jul 15 06:26:03 truenas kernel: schedule_timeout+0x151/0x160
Jul 15 06:26:03 truenas kernel: wait_for_completion+0x86/0x170
Jul 15 06:26:03 truenas kernel: __flush_workqueue+0x144/0x440
Jul 15 06:26:03 truenas kernel: ? __queue_work+0x1bd/0x410
Jul 15 06:26:03 truenas kernel: nfsd4_destroy_session+0x1ce/0x2b0 [nfsd]
Jul 15 06:26:03 truenas kernel: nfsd4_proc_compound+0x359/0x680 [nfsd]
Jul 15 06:26:03 truenas kernel: nfsd_dispatch+0xf1/0x200 [nfsd]
Jul 15 06:26:03 truenas kernel: ? __pfx_nfsd+0x10/0x10 [nfsd]
Jul 15 06:26:03 truenas kernel: svc_process_common+0x2f8/0x6f0 [sunrpc]
Jul 15 06:26:03 truenas kernel: ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
Jul 15 06:26:03 truenas kernel: ? __pfx_nfsd+0x10/0x10 [nfsd]
Jul 15 06:26:03 truenas kernel: svc_process+0x131/0x180 [sunrpc]
Jul 15 06:26:03 truenas kernel: nfsd+0x84/0xd0 [nfsd]
Jul 15 06:26:03 truenas kernel: kthread+0xe8/0x120
Jul 15 06:26:03 truenas kernel: ? __pfx_kthread+0x10/0x10
Jul 15 06:26:03 truenas kernel: ret_from_fork+0x34/0x50
Jul 15 06:26:03 truenas kernel: ? __pfx_kthread+0x10/0x10
Jul 15 06:26:04 truenas kernel: ret_from_fork_asm+0x1b/0x30
Jul 15 06:26:04 truenas kernel:
Jul 15 06:28:05 truenas kernel: task:nfsd state:D stack:0 pid:3138 ppid:2 flags:0x00004000
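
For anyone hitting the same trace: the last line shows an nfsd thread in state D (uninterruptible sleep), which is why the service could not be stopped. A minimal sketch, assuming a Linux host with /proc mounted, of how one might list nfsd threads stuck in that state before resorting to a reset. The function names `parse_stat` and `blocked_tasks` are made up for this illustration and are not part of TrueNAS.

```python
from pathlib import Path


def parse_stat(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split on the last ')' rather than on spaces.
    """
    left, _, right = stat_text.rpartition(")")
    pid_text, _, comm = left.partition("(")
    state = right.split()[0]
    return int(pid_text), comm, state


def blocked_tasks(name="nfsd", proc_root="/proc"):
    """List (pid, comm) for tasks called `name` in uninterruptible sleep (D)."""
    found = []
    for stat_path in Path(proc_root).glob("[0-9]*/stat"):
        try:
            pid, comm, state = parse_stat(stat_path.read_text())
        except (OSError, ValueError, IndexError):
            continue  # task exited while we were reading, or unexpected format
        if comm == name and state == "D":
            found.append((pid, comm))
    return sorted(found)


if __name__ == "__main__":
    for pid, comm in blocked_tasks():
        print(f"{comm} pid {pid} is in state D (uninterruptible sleep)")
```

A thread that is only briefly in state D is normal during I/O; it is the ones that stay there across repeated runs that match the hang described here.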

We could use more details on your server setup.
You mentioned Proxmox and VMs. Are TrueNAS and Proxmox two different physical servers?
Please describe your pool and disk setup on TrueNAS. You can expand the section in my signature for an example of what we would be looking for.

Is the problem TrueNAS version TrueNAS-SCALE-24.04.2?

TrueNAS-SCALE-24.04.2
HP Enterprise MicroServer G10 with 23GB RAM
/mnt/pool1 = 2TB raid1 non-SSD (WD) (boot)
/mnt/pool2 = 2TB raid2 non-SSD (WD) (nfs-prox)
2x 1 Gb Ethernet
There are no errors on the disks or in the ZFS volumes.
pool2 is for Proxmox.
The connection to Proxmox is NFSv4.

All Proxmox hosts are physical bare-metal servers.

The non-SSD disks are not the ideal solution; we are planning to build a new server with SSDs for Proxmox.

Has this problem occurred more than once on TrueNAS SCALE 24.04.2? You can try filing a bug report with iXsystems / TrueNAS.
https://ixsystems.atlassian.net/

From your description, it looks like there was one NFS error and you then switched back to the previous version of SCALE.

The issue occurred twice; after that I booted the server with the latest kernel from version 23.
So I have not switched back to another version. There was no downgrade, only a boot with a different kernel.

Let's see if any more experienced users post. I'm not sure what to recommend trying.

Do you have current backups of your TrueNAS configuration? I am not sure whether you need a configuration backup from both TrueNAS SCALE 24.04.2 and the previous SCALE version. A configuration backup can be used for a clean TrueNAS installation with all your current settings. That is only if necessary and recommended by someone more experienced.
Do you have current backups of your pool datasets (nfs-prox, as you listed it)?

I have config backups, and yes, I will also make a pool backup to an external hard disk.
For now the server is running with the kernel from version 23, so maybe NFS will not go down. Proxmox is connected again and all VMs are working.

https://ixsystems.atlassian.net/browse/NAS-130058?focusedCommentId=267888

A temporary workaround is to use NFSv3 (i.e. disable NFSv4 on the NFS configuration page). In the UI: System -> Services -> select the edit icon for NFS.
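
After changing the server side, it may be worth confirming on each client which protocol version its NFS mounts are actually using. The following is a rough sketch, assuming a Linux client (such as a Proxmox host) where /proc/mounts lists a `vers=` option for NFS mounts. The function name `nfs_mount_versions` and the sample address and paths in the usage note are hypothetical.

```python
def nfs_mount_versions(mounts_text):
    """Map mount point -> NFS version string for NFS entries in /proc/mounts text."""
    versions = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Fields: device, mount point, filesystem type, options, dump, pass
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        options = fields[3].split(",")
        vers = next(
            (opt.split("=", 1)[1] for opt in options if opt.startswith("vers=")),
            "unknown",  # no vers= option found on this mount
        )
        versions[fields[1]] = vers
    return versions


if __name__ == "__main__":
    with open("/proc/mounts") as handle:
        for mount_point, vers in nfs_mount_versions(handle.read()).items():
            print(f"{mount_point}: NFS version {vers}")
```

For a made-up entry such as `192.0.2.10:/mnt/pool2/nfs-prox /mnt/pve/nas nfs4 rw,relatime,vers=4.2 0 0`, the function reports version 4.2 for `/mnt/pve/nas`. Mounts that still show a 4.x version after the change would need to be remounted to pick up NFSv3.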