TrueNAS CORE suddenly becomes unresponsive with 100% CPU of middlewared

Hi,

I don't know the reason, but after some time my TrueNAS instance becomes unreachable (neither by console nor by SSH) with 100% CPU load. It ran fine for years. Recently I upgraded TrueNAS CORE to TrueNAS-13.0-U6.8 and moved it from ESXi 7.x to Proxmox 9.1.2.

Sorry, I am not allowed to post screenshots yet, so I have to describe them:

A screenshot of htop during the increasing load (until disconnect) shows that it's "python3.9: middlewared" causing 100% CPU load on all cores.
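For anyone hitting the same symptom, a quick way to see which middlewared threads and jobs are busy before the box locks up. This is a sketch; `midclt` is the TrueNAS middleware client, and the job list will obviously differ per system:

```shell
# FreeBSD top: -a shows full command lines, -H shows individual threads,
# -o cpu sorts by CPU usage; run this while the load is climbing.
top -aH -o cpu

# Ask middlewared itself which jobs are currently running or stuck.
midclt call core.get_jobs | python3 -m json.tool | less
```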

In the console I can see a failed login of a no longer existing domain user, but I don't think it is related to the problem.

In Proxmox it shows 102.25% on 4 CPUs, and CPU usage stays at 100% for hours (until reboot).

The guest agent is usually running, but the connection gets lost during high CPU load. I just increased from 4 to 6 cores.

The host is idling with 2x Xeon E5-2680 v4 (56 threads).

The only thing I can do is a hard reset of the VM.

Any hints on this?

Regards
Martin

So, now I can upload my screenshots:

upgraded to 6 cores:

host is idling:

Shortly after reboot, SSH still working, with unclear CPU usage:

And console already stuck:

A few minutes later it is no longer reachable via SSH either:

Perhaps it has to do with snapshots? They never were a problem with ESXi:

After disabling the snapshot jobs, it currently does SMART checks. High load on 3 cores, but I don't know where it comes from:


Seems a little like I/O problems, doesn't it?

Okay, and two minutes later it is stuck again, at 270% CPU:

Upgraded the hardware to 16 cores and changed to the "host" CPU type, as I will not do live migrations. I guess it could have to do with encrypted pools and my generic x86-64 QEMU CPU type.

Edit: Still getting stuck….

Changed to x86-64-v2-AES to see if this resolves the issue.
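For reference, the CPU type can also be switched from the Proxmox CLI instead of the GUI. A sketch, assuming a placeholder VM ID of 100:

```shell
# Set the vCPU model of VM 100 to x86-64-v2-AES; the VM has to be
# stopped and started again for the new model to take effect.
qm set 100 --cpu x86-64-v2-AES

# Verify the resulting VM configuration.
qm config 100 | grep '^cpu'
```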

You also appear to be using an older emulated LSI card rather than isolated disks, so be aware that you may be at risk of a pool double-mount with Proxmox, since Proxmox itself speaks ZFS. Please see Virtual TrueNAS, Proxmox, and Preventing Double Imports with "zpool multihost"
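The linked post boils down to giving each host a stable, unique hostid and enabling multihost (MMP) on the pool. A minimal sketch; "tank" is a placeholder pool name:

```shell
# On the Proxmox host: create a persistent /etc/hostid if one does not
# exist yet, so ZFS can tell the hypervisor and the VM apart.
zgenhostid

# From the TrueNAS shell: enable multihost on the pool. With this set,
# a second host attempting a plain "zpool import" will refuse while the
# pool looks active elsewhere.
zpool set multihost=on tank
zpool get multihost tank
```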


Just to add to HoneyBadger's post: there is also a v3 option, but I am not aware of the differences off the top of my head. IIRC it has less overhead. Also worth mentioning: Proxmox displays "cores" as host CPU threads. So if the host CPU is 4C/8T, you'll need to pass 8 "cores" to the VM to give it access to all 4 physical cores.

v3 gives you AVX2 and v4 gives you AVX-512, but your host obviously has to support those instructions.
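You can get a rough idea of which level your host qualifies for from the CPU flags. This is only a sketch based on the most telling flag per level; the full x86-64-v2/v3/v4 definitions require a few more flags each:

```shell
# Read the flags of the first CPU and map the headline instruction set
# extensions to the x86-64 microarchitecture levels.
flags=$(grep -m1 '^flags' /proc/cpuinfo)
case "$flags" in
  *avx512f*) echo "x86-64-v4 capable" ;;
  *avx2*)    echo "x86-64-v3 capable" ;;
  *sse4_2*)  echo "x86-64-v2 capable" ;;
  *)         echo "baseline x86-64" ;;
esac
```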

Hey there,

thank you for the replies:

It was the default, and it was during the first occurrence of the problem. Later I switched the CPU type to "host".

I passed them through as I did before with ESXi RDM disks. Many years ago I passed through the whole controller, but I thought this was no longer necessary. The LSI controller in Proxmox is just virtual and only defines how the disks are presented to the VM. Are there better possibilities for TrueNAS?

I am aware of the potential risk of damaging the file system, but I did not know the multihost parameter, thank you very much!

I am aware of that, and I think it's at least the Debian default (56 threads in /proc/cpuinfo).

According to cpuinfo my CPU supports AVX2 but not AVX-512, so I think v3 would be fine. This also matches the docs, as x86-64-v3 is compatible with Haswell and later, and the E5-2680 v4 (Broadwell) qualifies: QEMU/KVM Virtual Machines
But as I mentioned, I use the "host" CPU type now, so this should be best, shouldn't it? I won't use live migrations, as I have disks passed through.

But I have no clue what's going on. I noticed that there was a scrub running on TrueNAS, so yesterday I detached my main 14 TB mirror from TrueNAS, imported it within Proxmox, and started a scrub under Proxmox. After some time I started TrueNAS without that mirror, and after a while the load of the host rose to 65. I shut TrueNAS down again, but the load kept rising, while the scrub data rates were okay at 200 MB/s. In the morning the scrub finished without a single error byte, but the load still kept growing. Then I restarted the host, restarted the scrub, and it ran fine with a load of 3-4.


As you may notice, the scrub took about 9:30 hours both times.
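In case it happens again, watching the scrub from the shell may show whether I/O stalls line up with the load spikes. A sketch, again with "tank" as a placeholder pool name:

```shell
# Scrub progress, throughput and repaired bytes appear in the "scan:" line.
zpool status -v tank

# Per-vdev I/O rates, refreshed every 5 seconds, to spot a single
# stalling or slow disk behind the high load.
zpool iostat -v tank 5
```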

I don't know what to check next.

I changed to "VirtIO SCSI single", exported the pool from TrueNAS, and reimported it in TrueNAS:
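A cautious pattern when a pool moves between the hypervisor and the VM is to always export on one side before importing on the other, and to import read-only when only inspecting. A sketch with the placeholder pool name "tank":

```shell
# On the side that currently owns the pool:
zpool export tank

# On the other side; read-only if you only want to look at the data.
# Never force (-f) an import while the other host may still have the
# pool imported -- that is exactly the double-mount scenario.
zpool import -o readonly=on tank
```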

And now I will try one more scrub within TrueNAS.

What made you think that? Anything but PCIe passthrough of an entire controller is strongly discouraged and prone to lead to loss of data. This has never changed and probably never will.

Okay, I will accept that. Besides the possibility of data loss through human error, are there any other disadvantages?

Not human error. Not giving TrueNAS and ZFS direct hardware access to the drives has a high probability of shredding your pool. Controller passthrough is the only configuration known to work reliably for virtualised TrueNAS.

I just noticed that I do not have the hardware access to the passed-through devices that I thought I had. In ESXi with RDM I could see SMART values and so on, and with Proxmox I cannot…
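This is easy to verify: a QEMU-emulated disk does not forward SMART to the guest, while the host still sees the physical drive. A sketch; the device names are just examples:

```shell
# Inside the TrueNAS VM: an emulated LSI/virtio disk exposes no SMART
# data, so this will report that SMART is unavailable.
smartctl -a /dev/da0

# On the Proxmox host, against the physical disk behind it:
smartctl -a /dev/sda
```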

Well, then I will have to rethink where to attach those two Proxmox disks so that I can pass through the whole controller again. I have one more HBA for this server, but every additional controller draws more power :wink:

Thank you for pointing that out!

I see. Thanks! That explains why I couldn't do v4 on one of my nodes.