Things that you cannot do with VMs under TrueNAS SCALE

I feel like that meme guy with the coffee mug where I have a little sign that says “Things you can’t do with VMs. Change my mind.”

  • Off-site Backup. Regular snapshots of the zvols backing my VMs happen automatically, age out, and so on. Great. But I cannot find a way to replicate those zvols off-host/off-site. Cloud sync tasks and rsync tasks want me to specify a directory, and the zvols for VMs don’t appear in the list.
  • Restore. Recovering a VM back to a specific snapshot of its zvol is an exercise left to the reader. Dog help you if you ask the documentation’s stupid AI assistant how to restore a VM from a zvol snapshot.
  • Export. You cannot make an exportable version of a VM (e.g., VMDK or similar format) so that you can move a backup of it off-host (potentially off-site) for safe keeping.
  • Import. Since you can’t export, you can’t import.
  • Migrate. Since you can’t export/import, you can’t migrate a VM from one TrueNAS SCALE host to another. You definitely can’t migrate it to any other virtualization platform. You can move the data, perhaps, but not the VM’s configuration.

I’m running VMs on ElectricEel-24.10.2.1. It looks like support for VMs on SCALE will get worse before it gets better. The release notes for 25.04 (Fangtooth) say:

Users with production VMs on TrueNAS 24.10 should not upgrade to TrueNAS 25.04 until after this experimental feature stabilizes in a future TrueNAS release.

I’m a bit of a rookie on TrueNAS SCALE. But one of the things I’ve learned about the Internet is that the trick to getting the right answer is not asking the right question. It’s posting the wrong answer. So I post my understanding publicly and then find out what I got wrong.

1 Like

Yes, you got it all correct, you are the knowledge master. :joy: Just poking a bit of fun.

I do not use VMs inside SCALE so I could not tell you if you are right, wrong, in the ballpark, or anything.

I believe this is strictly for LXC which is experimental, but I could be wrong.

Maybe what they really mean is “Users with production VMs should not convert them to LXC containers” or something like that? But it sure reads to me like “people with VMs should not upgrade at all.”

2 Likes

I don’t use the built-in replication system, so I can’t comment there, but I am using zfs-autobackup, and for point 1 (off-site backup) it works fine. No issues. No issue restoring either.

For exporting, that’s what qemu-img is for. I have used it to import and export disk images. For example, I converted a Mac to a VM and also took in a Win 10 VM from another system.
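To make that concrete, here’s a hedged sketch of what a qemu-img round trip could look like. All the pool, zvol, and path names here (“tank”, “vms/mybox”, the backups directory) are made up for illustration; check your own layout under /dev/zvol first.

```shell
# Export: read the zvol's block device and write a compressed qcow2 image.
qemu-img convert -O qcow2 -c /dev/zvol/tank/vms/mybox /mnt/tank/backups/mybox.qcow2

# Inspect an image's format and virtual size before importing it.
qemu-img info /mnt/tank/backups/mybox.qcow2

# Import: write a qcow2 (or VMDK, etc.) image back onto a zvol of sufficient size.
qemu-img convert -O raw /mnt/tank/backups/mybox.qcow2 /dev/zvol/tank/vms/mybox-restored
```

The qcow2 intermediate is handy because it’s sparse and compressible, so the file you move off-site is usually much smaller than the raw zvol.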

You can export the VM’s config as well. That’s what virsh is for.
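Roughly like this (hedged sketch; “mybox” is an example name, and on SCALE the libvirt domain name may carry an ID prefix, so list the actual names first):

```shell
# See what libvirt actually calls your VMs before doing anything else.
virsh list --all

# Export the full VM definition (CPUs, RAM, devices, MAC address) as XML.
virsh dumpxml mybox > /mnt/tank/backups/mybox.xml

# Recreate the VM elsewhere from the saved definition.
virsh define /mnt/tank/backups/mybox.xml
```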

So, I don’t know if any of those can be done from the UI, but they can from the command line. TrueNAS uses QEMU in Eel, so the documentation you need is found in the QEMU-related manuals and docs.

For Truenas 25.04 VMs/Instances, the emphasis is on “experimental.”

It’s not about “converting”. Fangtooth now uses Incus as the backend for both system containers (formerly systemd-nspawn/jailmaker) and full VMs (formerly direct KVM). Zvols will carry over, but VM configuration will need to be redone.
The advice really is that people who rely on VMs for production should be very careful about upgrading until the Incus subsystem is better integrated and the GUI has stabilised.

2 Likes

Not much of a challenge.

Place all zvols into a common top level dataset, recursively snapshot and replicate that. You should hierarchically organise your data in some way, anyway.

If you need e.g. different retention policies or schedules, use a dataset per policy and put all zvols sharing that policy there.
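For instance, assuming all VM zvols live under one dataset (“tank/vms” here is an example name, as is the “backuphost” remote):

```shell
# One recursive snapshot covers every zvol under tank/vms at the same instant.
zfs snapshot -r tank/vms@nightly-2024-01-01

# Replicate the whole hierarchy, snapshots included, to another pool or host.
zfs send -R tank/vms@nightly-2024-01-01 | ssh backuphost zfs receive -d backup/vms

# Later runs only send the changes since the previous snapshot.
zfs send -R -i tank/vms@nightly-2024-01-01 tank/vms@nightly-2024-01-02 \
  | ssh backuphost zfs receive -d backup/vms
```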

zfs rollback pool/path/to/zvol@snapshot

Also available from the UI.

zfs send pool/path/to/zvol@snapshot | gzip -c >/mnt/path/to/some/dataset/vm-snapshot-date.gz
gzip -dc /mnt/path/to/some/dataset/vm-snapshot-date.gz | zfs receive pool/path/to/some/zvol

Export like outlined, copy to second machine, import.

Or without an intermediate copy:

zfs send pool/path/to/zvol@snapshot | gzip -c | ssh secondmachine "gzip -dc | zfs receive pool/path/to/some/zvol"

Every single task that “cannot be done” according to you is a simple command line invocation or (snapshots, replication, rollback) even available in the UI.

The combination of command line tools like zfs send, gzip, ssh for a single task is Unix 101.

2 Likes

There is a lot to unpack in the excellent answers already here, but here is my take.

  1. OP was correct in expecting his assumptions to be debunked.

  2. OP’s use of terminology (import/export) may be confusing i.e. I believe he was really talking about backing up and restoring or moving zVols from one location to another and not ZFS pool importing and exporting (which might be assumed from the terminology).

  3. OP is really saying that these capabilities (replicating zVols, backing up and restoring zVols) should be available in the UI - and if they aren’t (I don’t use VMs so I have no idea) then they probably should be. (But to be fair to iX, there is always more functionality that can be added, and this is one example.)

  4. However, if you are doing technical stuff like VMs and especially if you are doing Linux VMs, then you probably need to be technical enough to do command line / simple scripting to achieve these sorts of things.

1 Like

Full success! :trophy:

1 Like

Will Incus be ready for primetime (release) in a few weeks?

It’s already there, but the exposed functionality and GUI around it is not expected to stabilise before 25.10 Goldeye.

1 Like

This is why I post stuff like this. This was super helpful.

This set of solutions conflates the contents of a virtual hard disk (the zvol) with the entirety of the virtual machine. Or do I misunderstand? The VM configuration itself is not backed up or stored in the zvol, is it? Doing zfs send and zfs receive only gets the virtual disk data from point A to point B.

The VM itself is more than just the contents of its virtual hard disk. If I have a VM that has 2 virtual hard disks (one boot, one data), it’s up to me to know which zvol is which, and—during a restore—to rebuild a replacement VM the right way. A VM has a lot of metadata: the number of CPUs assigned (and choices on cores, threads, etc.), the amount of RAM assigned, USB device assignments, etc. All these are important to back up, and necessary to recreate if I do a restore. That’s an exercise left to me. My VMs get a virtual network interface with a virtual MAC address. DHCP assigns an IP address based on that MAC address. If I want the restored VM to get the same IP as the original, I need to reuse that MAC address or go change my DHCP server. Is any of this captured in the zvol that is being sent around? I suppose it could be, but that would surprise me.

I’m not asking how to back up one virtual hard disk. I’m asking how to back up a virtual machine—the whole thing.

At the very least, capturing the state of a VM for backup is a set of unrelated commands that have to be orchestrated and run at the same time. Like, I need to write some qemu commands that capture the VM configuration somehow, and I need to store that somewhere safe. And I need to manage the zvol snapshots/replication, etc. And these are disconnected. It’s up to me to make sure my VM’s metadata and structure are captured at the same time the volume is captured and keep track of which zvols and which VM configs belong to each other. One could argue that this VM metadata changes very rarely. That’s true. But I either make a manual task for myself to back it up during those rare times that it does change, or I just automate backing it up so I never have to think about it. Normally I’d just automate the backups.

In that case you are right. TrueNAS does not do that. At all.

You need to run Proxmox or VMware or similar if you need support for that.

We run VMs in production on TN CORE and manual creation of a VM from a disk image is good enough for us. We are happy with the cost and performance and most of all the reliability.

Kind regards,
Patrick

This is a thread about virtual machines, not zvols.

I’m not asking for improved zvol management. I’m saying that zvol management is not the same thing as virtual machine management. Zvol backup is not the same thing as VM backup. To restore a virtual machine or to migrate a virtual machine from one host to another, I need more than just the contents of the virtual hard drive. Consider that the GUI for VM management in ElectricEel has a ‘clone’ button. Whatever that clone button does is close to what I’m looking for. It does “grab all the attributes and use them right now to create an identical VM.” I want: “grab all the attributes and store them in standardized format somewhere that I can use later.” And schedule that capture regularly (or capture it every time the VM’s attributes change). And somehow let me know which captured VM attributes correspond to a particular zvol snapshot, so I can restore or migrate them together as a unit.

When I make a VM based on a backup, I need to be able to handle things like the MAC address that was captured in the backup. Do I reuse it or not when I restore? Depending on the situation, the original VM might still be online when I restore, so I don’t want the restored VM to launch with the same MAC address. But maybe the original VM is gone, so I do want to create a VM with the same MAC address. A restore/migration process would need options and support for that.

It’s not that I can’t work at the command line or do some scripting. It’s that the commands for capturing the VM attributes are undocumented. There’s no support for scheduling regular snapshots of VM attributes beyond writing my own script, determining my own format, and putting it in crontab(5). There’s no packaging of VM attributes (e.g., a JSON, YAML, or XML file) that can be correlated with the corresponding zvol snapshot. There’s no documented or automated method to create a new VM with the same attributes that were stored in a backup.
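The closest I’ve found to a structured attribute dump is the middleware client—I believe a `vm.query` method exists on 24.10, but treat that as an assumption on my part, not documented fact:

```shell
# Dump every VM's middleware-level configuration as pretty-printed JSON.
midclt call vm.query | python3 -m json.tool > /mnt/tank/backups/vms-$(date +%Y%m%d).json
```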

Focusing on VMs, not zvols, did I miss anything?

1 Like

No you didn’t. I missed that in your first post because I have become so used to only dealing with the underlying storage, it did not occur to me what you actually meant.

1 Like

Considering the amount of work that is being done to improve and stabilize the new Incus-backed Instances feature in 25.04 and looking toward 25.10, now would be a good time to create a Feature Request for the VM config backup features you want to have available (or add your support if one already exists).

4 Likes

So, I actually read your request right for once! To say the commands are undocumented is unfair; they are documented. TrueNAS uses lots of underlying software, and there is no way it could possibly document all of it. It is all documented on the underlying software’s own sites. I agree it could be made easier—make a feature request as noted—but these things CAN be done. Since I have used QEMU for ages, for me it would be trivial to recover any lost VM. And the config is never really lost if you download a copy of your SCALE config file every so often. But yes, it’s not trivial for many others.

The VM config does not change as the machine operates during the day. It’s pretty static.

So, I gave you a head start, and, I was merely responding to the title of the post. If it had said “Things you cannot do in the UI…”, I would agree.

Anyway, submit said feature request, it’s a good one I think.

This all changes with 25.04 Fangtooth.

New VMs are “instances” and can be imported, exported, backed up, restored, copied, cloned, snapshotted, migrated etc.

EXCEPT almost none of that is currently exposed in the GUI, but it can be done with the Incus CLI.
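As a hedged sketch of what that CLI work looks like (instance and remote names here are examples; check `incus list` and `incus remote list` for your real ones):

```shell
# Snapshot and restore an instance.
incus snapshot create mybox snap0
incus snapshot restore mybox snap0

# Export to a portable tarball and import it elsewhere.
incus export mybox /mnt/tank/backups/mybox.tar.gz
incus import /mnt/tank/backups/mybox.tar.gz

# Copy or move an instance to another Incus host configured as a remote.
incus copy mybox otherhost:mybox
incus move mybox otherhost:mybox
```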

I expect much of that GUI functionality will come in future versions of TN.

1 Like

I so wish I had thought of this a long time ago. To this day I get frustrated by search results almost always returning results unrelated to my questions.

I’m not trying to be quarrelsome, but I can make a pretty strong case for these commands being undocumented.

If you look at the online documentation, especially the page on Virtualization, you literally won’t find the string “qemu”. I went to GitHub, checked out the 24.10 branch of the truenas/documentation repo, and recursively grepped for the string “qemu”. It shows up exactly once in the repo, and it’s just a substring. I checked out the release/24.10.1 branch of the truenas/webui repo and grepped for the string “qemu”. It doesn’t appear even once in the web UI (if I’m looking at the right source code). Not as a tooltip, not in the content of a page, nowhere. As a user of TrueNAS SCALE, I have been running VMs for years without knowing that QEMU was the name of the subsystem it was using under the hood for virtual machines, and without any pointer to the CLI commands I could use to interact with VMs outside of the GUI.

Even the big warning on the 25.04 documentation that says “TrueNAS 25.04 replaces the previous KVM hypervisor…” does not name the KVM hypervisor subsystem. It does not say, for example, “replaces the previous KVM hypervisor (qemu)…” Contrast this with the upcoming change to “instances.” The documentation and the 25.04 branch of the web UI source code make it clear that if I want to understand “instances”, I will need to reference LXC and Incus.

I learned a ton in this thread where people who know how it works actually said how it works. But as Dr Evil would say, “throw me a frickin bone, people.” I don’t think it’s too much to ask for something in the documentation or the web UI like “Virtual machines are provided by qemu, so see [this site] for more information on managing virtual machines.” You could reply “it’s open source, why don’t you contribute that to the docs repo?” and I would say “because I obviously don’t know how any of this stuff works, I couldn’t possibly contribute documentation on it.”

I’m not scared of the command line and I’m not scared of reading other documentation. I suspect some of these details are so well known to folks that have been using TrueNAS for a long time, that it isn’t obvious what the documentation lacks.