(SCALE) Major issues restoring a VM ZVOL (HOW?) from one TrueNAS to another

Hey guys,
I use TrueNAS to host a VM that runs HAOS for Home Assistant, since the container did not provide full functionality (such as add-ons).

It is running off a separate RAID1 SSD volume, with the ZVOL for its disk named HAOS.
This gets a snapshot every night, followed by a replication onto another TrueNAS in my house. This second TrueNAS has just a single large RAID1 of HDDs, receiving all my files, including docs, pictures, movies (all SMB shares) AND the HAOS ZVOL.

Yesterday I found the HAOS VM having major problems after some integration update.
So I was looking to take the replicated version on my second TrueNAS and somehow swap it into the main one. Then I could have kept going from the moment of the backup.

But I actually don't have the slightest idea how to do that. It is not a file I can copy, and it offers no options other than taking a snapshot from the web interface… :man_shrugging:

So, how can I restore the backed-up version onto my main TrueNAS and restart my VM from there? I used VMware years ago, where disks were files that could be copied easily. But here I am absolutely helpless.

I managed to restore another backup from within Home Assistant, but I still want to know how to do it.

If you use replication, you have snapshots… do you still have the snapshot?

I demonstrate using clones to test a snapshot rollback, and then a rollback on a VM ZVOL, in this video.
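
For reference, a rough CLI sketch of the same idea, using your zvol name SSD-Pool/HAOS and a made-up snapshot name (substitute your own):

# List the snapshots available for the zvol
zfs list -t snapshot -o name,creation SSD-Pool/HAOS

# Clone a snapshot to a test zvol and attach that to a throwaway VM
# to verify the state non-destructively
zfs clone SSD-Pool/HAOS@auto-2024-10-01_00-00 SSD-Pool/HAOS-test

# Once verified, remove the test clone...
zfs destroy SSD-Pool/HAOS-test

# ...and roll the real zvol back; -r destroys any snapshots newer
# than the target, so double-check before running it
zfs rollback -r SSD-Pool/HAOS@auto-2024-10-01_00-00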


That video is really great! I appreciate it!
Yes, but I currently take them only every night at 0:00.
I will probably do it more or less the way you show!
Great, thanks so much.

One other stupid question:
How do I get the data back from the other machine? Let's assume the SSDs that run HAOS fail (possible even though it is a RAID1 SSD array, since the drives have different wear levels).

Especially the ZVOL: files are simple, I can just copy them over SMB or whatever, but how do I get the ZVOL back to my main machine?

And another important question:
The ZVOL I'm pushing to my replication machine is basically a fully functional system.
How can I set the other system up so that my nightly replication task enables me, in case my primary system fails, to just "switch on" the VM on the backup?

I mean, it's just HAOS, nothing powerful, but mission-critical for me.

You perform a reverse replication.

Then you remove the readonly flag. If necessary, set up a VM to use the zvol.

To run the VM locally on the backup server: again, remove the readonly flag and set up a VM to use the zvol.

The VM configuration itself is not backed up, but the disk is. That is the zvol.
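
A rough sketch of the reverse replication at the CLI, with made-up pool, host, and snapshot names (in practice you can also just define a replication task in the opposite direction in the GUI):

# On the backup box: send the latest replicated snapshot of the zvol
# back to the primary, into a fresh name so nothing existing is overwritten
zfs send backup-pool/HAOS@auto-2024-10-01_00-00 | \
  ssh root@primary zfs recv SSD-Pool/HAOS-restored

# On the primary: clear the readonly flag if one came across
zfs set readonly=off SSD-Pool/HAOS-restored

# Then recreate the VM in the GUI and attach SSD-Pool/HAOS-restored
# as its disk device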


I know I have to recreate the VM, but that is fine: on the one hand the hardware is a bit different, and on the other hand the ZVOL does not sit in exactly the same spot anyway.

But I actually failed with the read-only flag. Can you help me a little with this?
Also, I think I read that it needs to be read-only for the replication to work.
But is there a way to have it ready when needed (i.e. the main machine fails, I log into the configured backup and click "start VM")?

The issue is that the replication would overwrite the zvol if it were run again. Hence the readonly flag.

Btw, if this happens while a VM is running, it's bad :wink:

You set the readonly flag in the dataset properties, under the advanced view.

That was pretty easy. I thought I had searched for that setting before and somehow got an error message, but it seems to work now. I'll give it a try: add the ZVOL to the VM, set it back to read-only, and only disable that in case of a failure.

Then I have to ask you guys, because you seem to know perfectly well what you are doing, about another point concerning docker apps (one in particular):

I've got a similar problem there: I'm currently running NGINX Proxy Manager in docker on an SBC. But since that device is more or less outdated, I'd like to move all of that stuff onto my TrueNAS.

But NPM is also mission-critical for me, and I have a (little less powerful) backup TrueNAS.
How can I get this docker app going / working from there?
Let's say I had a massive hardware failure on the main system; then I could easily redirect all traffic to the backup TrueNAS from my UDM Pro.

But the data within the docker container is "lost", at least as far as I understand.
How can I preserve it, and get the app going again?

Depends on exactly how you set up the docker container. Bind mounts or volumes? Docker compose or the docker CLI?

Anyway, you can regularly rsync the data to your TrueNAS.
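
For example (paths and hostnames are placeholders):

# Mirror the container's data directory to the NAS over SSH;
# -a preserves permissions/ownership, --delete propagates deletions
rsync -a --delete /opt/npm/data/ root@truenas:/mnt/SSD-Pool/npm-backup/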

I'd suggest reconfiguring your docker setup to use docker compose and host mounts, if you aren't already.

Then the same docker compose file can easily be run on TrueNAS… more or less.
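
For NPM in particular, a minimal sketch, assuming the stock jc21/nginx-proxy-manager image and the ports from the NPM docs (paths are placeholders you'd adjust):

# Create a compose file that uses bind mounts, so the app data lives
# in plain directories you can snapshot, rsync, or replicate
cat > docker-compose.yml <<'EOF'
services:
  npm:
    image: 'jc21/nginx-proxy-manager:latest'
    restart: unless-stopped
    ports:
      - '80:80'    # HTTP
      - '443:443'  # HTTPS
      - '81:81'    # admin UI
    volumes:
      - ./data:/data
      - ./letsencrypt:/etc/letsencrypt
EOF

# Bring it up; the same file works anywhere docker compose runs
docker compose up -d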


Will check into that tomorrow!

Since I'm around some obviously real TrueNAS experts:
I want to get rid of a 90 GB snapshot taken in October, when the system performed an update.

It is especially annoying since it is on my SSD volume, which is only 500 GB.

But when I try this I get an error:

sudo zfs promote SSD-Pool/ix-apps/app_mounts/nextcloud/data@ix-applications-backup-system-update-
cannot open 'SSD-Pool/ix-apps/app_mounts/nextcloud/data@ix-applications-backup-system-update-': snapshot delimiter '@' is not expected here

Can you guys help one final time?
Otherwise I'll just open another topic; I probably shouldn't fill up this forum with just my questions over and over again.

probably for the best :wink:

I don’t have a problem with that

zfs promote works on datasets, not snapshots.

But maybe you should try doing the promoting/cloning/snapping with the GUI.
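
Roughly, the CLI sequence would be (all names below are placeholders):

# If nothing depends on the snapshot, just destroy it
# (note the @ belongs in destroy/get, never in promote)
zfs destroy pool/dataset@snapname

# If that fails because a clone depends on it, find the clone...
zfs get clones pool/dataset@snapname

# ...promote the clone (a dataset name, no @); this transfers the
# shared snapshots from the origin to the clone...
zfs promote pool/path/to/clone

# ...after which the snapshot can be destroyed from its new parent
zfs destroy pool/path/to/clone@snapname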

One more thing…(maybe I’m annoying - so sorry ;-))

I'm using some of the containers with partial ix-volumes.
Is there any way to recover the data from them and move it to a sensible path that I could also reference from my backup machine?

Okay - two more things:

Is there any good reason to keep the read-only flag on for the replicated data?
It can be switched off (ignored) in the advanced settings for the replication.

You generally want to keep replication in a functional state (assuming it’s a scheduled task). ZFS isn’t clustered. Replication goes one way.


It is scheduled, at 0:00 every night.
But since I configured the snapshots as per the video by Capt. Stux, I basically only need this backup if there is a notable hardware failure on my main machine.

But then, until the main machine is fixed / repaired / replaced, I would just like to use the backup normally. All the time apart from that, it is just sitting there with nothing touching its non-read-only data.

The readonly flag helps prevent you from accidentally using the dataset and then having a periodic replication blow away your changes.

Also, if you turn off the readonly flag (and modify the data), a properly configured replication task will fail rather than blow away your changes.

So, yes, if your primary fails, you can put the backup into service. The question then becomes: is it now the primary, and the new/fixed machine the backup?
