Dragonfish 24.04.1.1 in a VM, hangs on boot, can stall on shutdown

I am testing on TrueNAS 23.10.2 (128GB RAM) and TrueNAS 24.04.1.1 (32GB RAM)

Those are the hypervisors (i.e. Linux KVM).

The video was recorded on 23.10.2.

Never mind. I had to watch the video to get the important details.

Will test with raw files, as with all my other VMs.


A clean install of TrueNAS 24.04.1.1 will hang when running in a VM with 8GB of RAM. It works fine on all previous versions of TrueNAS (Scale, Core and FreeNAS), including 24.04.0.

In the video, on a TrueNAS 23.10.2 system with 128GB of RAM and a Xeon E5-2699A v4 CPU, I set up a VM with 8GB of RAM, 4 cores, a VirtIO NIC on a bridged interface, and a VirtIO 32GB boot zvol.

I then clean install 24.04.1.1 from a verified ISO and reboot it 6 times. It hangs on startup 3 out of 6 times.

I then clean install 24.04.0 and reboot it 5 times; it starts correctly 5 times in a row.

The ISOs are shasum OK. I have also confirmed this behaviour on a Dragonfish 24.04.1.1 system running on a Xeon D-1541 with 32GB of RAM.
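
For a rough sense of how convincing 5 clean boots in a row is, here is a small back-of-the-envelope sketch. It assumes each boot hangs independently with the rate observed on 24.04.1.1 (3 hangs in 6 boots); that independence is an assumption for illustration, not something the tests establish, and the function name is made up for this example.

```python
from fractions import Fraction


def prob_all_clean(hang_rate: Fraction, boots: int) -> Fraction:
    """Chance of `boots` clean startups in a row, assuming each boot
    hangs independently with probability `hang_rate`."""
    return (1 - hang_rate) ** boots


# Observed on 24.04.1.1: 3 hangs out of 6 boots.
observed_rate = Fraction(3, 6)

# 24.04.0 booted cleanly 5 times in a row. If it had the same hang
# rate, how likely would that streak be by luck alone?
p = prob_all_clean(observed_rate, 5)
print(f"{p} = {float(p):.1%}")  # prints "1/32 = 3.1%"
```

So under that assumption, a 5-boot clean streak would be fairly unlikely if 24.04.0 were affected to the same degree, though more reboots would give a stronger answer.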

Which details did I miss?

This is why I recorded the video, to catch any details I missed.

OK, I can confirm that TrueNAS-SCALE-24.04.1.1.iso installs as a VM under TrueNAS-SCALE-24.04.1.1 (8GB RAM, 4 cores, 10GB raw disk, 1 VirtIO NIC, 1 display), but in my quick testing it hangs during boot after 6-9 seconds, so I assume this can be an issue on another host version as well.

Boot never went past either:

middlewared: setting up plugins

or

Configure swap filesystem on boot pool.
Generating DH parameters, 2048 bit long safe prime


Wait!

I did get TrueNAS-SCALE-24.04.1.1 to boot under TrueNAS-SCALE-24.04.1.1 by
changing the RAM from 8 GB to 10 GB.

After that, changed it to 9 GB and then down to 8 GB.

It booted OK a few times with 8 GB RAM, but it did freeze while booting (as above)
around the 10th or 12th try.

Oh wait, it also froze after a few boots with 9 GB RAM.

Not trying any more with 10GB. There is definitely something going on with this very specific combination of hypervisor under hypervisor.


Meanwhile, 24.04.0 will boot fine.

Thanks for confirming that it’s not just my system.

New 24.04.2 nightly with an updated 6.6.32 kernel.

I just tested the new nightly and was able to boot 6x in a row without hanging.

:champagne:

Will now test upgrading my demo VMs (the ones I use for making videos…)

Nice, I may get to test it on my secondary server.

FYI: I did a reboot today under 24.04.1.1 and it took a looong time (8-9 min in some ZFS step that never took more than 1-2 min).

Sorry, I was not expecting it, so I did not document it. I am sure others have seen it and already reported it.

check how many snapshots you have…

Thanks, yes, I heard the platter HD thrashing, so I knew it was data related.

Funny thing is that I actually purged thousands of snapshots a few days before the reboot (i.e. the system has had a similar number of snapshots before, and never took that long to boot).

Hmmm, 10k gives us a warning. I have run several systems with 15-25k snapshots without the boot delay I mentioned.

(OK, the platter HD has 27k + 9k snapshots, more than I usually keep, but with the exact same “Data Protection” tasks over the last 6 months. So do boot times double when you have more than 25k snapshots?)

LOCAL_HDD_1_1a@LOCAL_NVME_1_1a-hourly-snap-task- = 27036
LOCAL_HDD_1_1a@LOCAL_NVME_2_1a-hourly-snap-task- = 9865
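
For anyone wanting to check their own numbers, here is a minimal sketch of counting snapshots per dataset. It assumes the `zfs` command is on the PATH and uses `zfs list -H -t snapshot -o name` to get one snapshot name per line; the helper names are made up for this example, and this is not how TrueNAS itself counts snapshots for its warning.

```python
import subprocess
from collections import Counter


def count_snapshots(names):
    """Count snapshots per dataset from names like 'pool/dataset@snap'."""
    counts = Counter()
    for name in names:
        name = name.strip()
        if "@" not in name:
            continue  # skip blank lines and anything that is not a snapshot
        dataset, _snap = name.split("@", 1)
        counts[dataset] += 1
    return counts


def list_snapshot_names():
    """Ask ZFS for all snapshot names (no header, one per line)."""
    result = subprocess.run(
        ["zfs", "list", "-H", "-t", "snapshot", "-o", "name"],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.splitlines()
```

On a system with ZFS, `count_snapshots(list_snapshot_names()).most_common()` gives the datasets sorted by snapshot count, largest first.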

I know this is not your dog (ZFS), but at some point we all need clear limits, besides the ‘unlimited’ promises, right?

It's a bug.

Ah! Glad I did more than “check how many snapshots you have…”
Guess I know my OS and systems ;-).

Not good here:

First boot with the nightly (did the update with the update file in the post).
(Dragonfish-24.04.2-MASTER-20240607-013916)

With 9 GB memory for TN VM, system hung at:
Starting systemd-modules-load.service - Load Kernel Modules

With 8 GB memory for TN VM, system hung at:
systemd[1]: memfd_create() called without MFD_EXEC or MFD_NOEXEC_SEAL set

Wait, was I supposed to use the nightly for the host or use the nightly for the VM?
(or both?).

I updated the host but used the 24.04.1.1 in the VM.

The issue is the guest freezing at a random point during startup (or not).

Updating the kernel in the guest fixes the issue.

Using the nightly to upgrade my 24.04.0 demo TrueNAS VM seems to work.


Made a video repro and filed a bug:

https://ixsystems.atlassian.net/browse/NAS-129481

I can confirm that the nightly above, as the guest VM, does boot properly.

Tested with both 24.04.1.1 and the nightly as the hypervisor host.


It is the built-in Intel I219LM in the HP ProDesk 600 G3 Desktop Mini.

(Sorry for the late response, I was away and could not open this site with my old iPad.)

TrueNAS-24.04.1.1 (kernel 6.6.29) fails to start as a VM on Proxmox 8.2 on an E5-2696 v3 CPU. VM options: CPU type host, 8 cores, 32 GB RAM.
I guess it is related to the kernel.
TrueNAS-24.04-RC.1 (kernel 6.6.19) works.

And nightly build works - Kernel 6.6.32
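
To check a guest against the data points in this thread, here is a small sketch that reads the running kernel release and looks it up. The table only contains what was reported above (6.6.19 and 6.6.32 working, 6.6.29 hanging); it is not an official list of affected kernels, and the function names are made up for this example.

```python
import platform
import re

# Data points reported in this thread only; anything else is unknown.
REPORTED = {
    (6, 6, 19): "reported working (24.04-RC.1)",
    (6, 6, 29): "reported hanging (24.04.1.1)",
    (6, 6, 32): "reported working (24.04.2 nightly)",
}


def kernel_tuple(release):
    """Parse the leading 'X.Y.Z' of a kernel release string, or None."""
    m = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    return tuple(int(part) for part in m.groups()) if m else None


def classify(release):
    """Look up a kernel release in the thread's reports."""
    return REPORTED.get(kernel_tuple(release), "no report in this thread")


# Run inside the guest VM to see where its kernel falls.
print(platform.release(), "->", classify(platform.release()))
```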
