BTRFS in Linux has something over ZFS

transactional updates.

[Quoted: a short overview of "transactional updates".]

This can be used on Linux desktops and servers.

My question is: it sounds like this could theoretically be done in ZFS as well? I’m just trying to wrap my head around how you can apply updates to a “snapshot” without affecting the live filesystem and the currently loaded modules. (Refer to the section in the above quote about temporarily making a snapshot “read-write”.)

To reiterate: the updates are applied to a “writable” snapshot in the background. Unlike the familiar workflow (updating the live filesystem after taking a snapshot of it so you can “rollback”), they are not applied to the live filesystem at all.

This is fundamentally different from using BTRFS snapshots pre-update as a safety net to roll back after the fact.

“Transactional updates” don’t even touch the live filesystem. You can use your system indefinitely, regardless of what modules may or may not be currently loaded, and regardless of whether they are out-of-tree modules.

Sounds like taking a snapshot, cloning it, updating on the clone, and then switching over, promoting the clone once the update is validated to be working.
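Something like this, as a minimal sketch (dataset names made up; rpool/ROOT/current stands in for the live root):

zfs snapshot rpool/ROOT/current@pre-update
zfs clone rpool/ROOT/current@pre-update rpool/ROOT/updated
# apply the updates inside the clone, e.g. from a chroot, then:
zfs promote rpool/ROOT/updated
# point the bootloader at rpool/ROOT/updated and reboot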

4 Likes

Which makes me think it’s theoretically possible to do with a Linux desktop/server that uses a ZFS root.

But I’m not sure if we’ll ever see such an implementation, considering that even running ZFS as the root filesystem (on a Linux distro) is not something a casual user can rely on when dealing with OS updates.


EDIT:

Interestingly, “clones” are not brought up with “transactional updates”. Which makes me wonder about the different terminologies between BTRFS and ZFS.

Uh, Solaris 11 has been doing that since 2011 using alternate boot environments. Basically patches are applied to a ZFS snapshot & clone, and if the update looks good, you reboot to the new boot environment.
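Roughly, in beadm terms (the BE name is made up, and pkg can also create the new BE for you during an update):

beadm create patch-be            # snapshot + clone of the active BE
beadm mount patch-be /mnt        # mount the inactive clone
pkg -R /mnt update               # patch the non-live image
beadm unmount patch-be
beadm activate patch-be          # next boot lands in the patched BE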

Before I switched to OpenZFS for my Linux computer’s root FS, I had BTRFS on some. They too used snapshot, clone and patch. Though I did not bother with the “chroot” part. Just rebooted after cloning. Then I patched, (while doing something else). When done patching / updating, I reboot once more.

Today, I use OpenZFS on all 4 of my Linux computers, with alternate boot environments. Basically, (a rough command sketch follows the list);

  1. Snapshot
  2. Clone
  3. Grub update
  4. Reboot to new ABE
  5. Patch OS
  6. Reboot to make patches effective.
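
In rough ZFS commands, (the old BE name here is made up; the new one matches my Grub example further down);

zfs snapshot rpool/root/20240707@migrate
zfs clone rpool/root/20240707@migrate rpool/root/20240714
# edit the Grub entry to boot root=ZFS=rpool/root/20240714,
# reboot into the new BE, patch, then reboot once more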

I could do the “chroot” part, but it’s not worth it for me.

Now for an Enterprise server, doing a “chroot” & patch on a non-live file system, like what Solaris 11 does, would be very useful. Plus, limiting the reboot to a single one is also good.

2 Likes

Oh, to be clear, both the old BTRFS method I used, and the newer, (well, since 2014 but after I quit using BTRFS), OpenZFS method, allow rolling back an update by a simple reboot and using Grub to select the prior boot environment.

In fact, I just had to do that last week. It’s rare, because I have the Gentoo Linux updates working quite well. But, I did something that caused the fonts to look really ugly. Could not figure it out. So I backed out my update. Then I destroyed the clone & snapshot, making that new update go away.
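Backing out amounts to, (names made up, clone first because the snapshot is its origin);

zfs destroy rpool/root/20240714          # the clone holding the bad update
zfs destroy rpool/root/20240707@migrate  # then the snapshot it was cloned from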

But the “selling point” of these “transactional updates” is not that you can rollback (which is what many Linux distros offer already, either via BTRFS snapshots or rsync’d roots), but that they do not even touch the live filesystem. This is especially useful for out-of-tree modules, and to assure an uninterrupted workflow.

Case in point: Even with BTRFS “snapshots”, using Arch Linux causes issues with Nvidia, SMB, ZFS, the Plasma desktop, etc., when you’re using an “updated” system without rebooting.

Another selling point is that you can apply an indefinite number of updates before the next reboot, and your live system will still be unaffected, while your new “updated root” is ready to boot into the fully up-to-date system.
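On openSUSE, this is driven by the transactional-update tool; a few illustrative invocations (not exhaustive, going by their docs):

transactional-update dup              # full distribution upgrade into a new snapshot
transactional-update pkg install vim  # install a package into a new snapshot
transactional-update rollback         # make an earlier snapshot the next boot target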

This also removes the need for clever “hooks”, such as the kernel-modules hook used on Arch-based systems; without it, the running kernel’s modules are outright removed from the filesystem during an update.

Part of the problem with this is that newer log entries and other changes may get lost. At my current Unix SysAdmin job we make changes to the OS every week to comply with rules and security updates, independent of OS patching. Even things like SUDOERS updates.

Unless the “new” OS is kept up to date, you really can’t wait for even a single day to reboot. Basically, you need to reboot immediately after the non-live file system update is complete. I am talking Enterprise usage here, home or small business could / would be different.

Solaris 11 does copy over known log files and such during activation of the ABE. (That occurs after patching, but before reboot.) But almost certainly does not copy over the SUDOERS file and other changes that occurred after the ZFS snapshot.

Yes, that would be preferable. It would be nice if there were a standardized OpenZFS ABE for Linux, with non-live OS updates.

Hmmm, does FreeBSD include such?

They do have OpenZFS alternate boot environments included as part of the OS.
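Via bectl, roughly (BE name made up):

bectl create 14.1-p2      # new BE cloned from the active one
bectl activate 14.1-p2    # boot into it on next reboot
bectl list                # show all boot environments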

IIRC there’s no “clone” command in BTRFS. You’ll often see BTRFS snapshots described as “a subvolume that is a clone (reflink) of another subvolume”, which can be created as read-write or read-only. File modifications in a snapshot do not affect the files in the original subvolume. So rather different to ZFS read-only snapshots, which you can clone and promote to reverse the parent-child relationship between the snapshot and its clone.
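Side by side, with made-up names:

btrfs subvolume snapshot / /.snapshots/root-copy   # writable by default; add -r for read-only
zfs snapshot rpool/root@now                        # always read-only
zfs clone rpool/root@now rpool/root-new            # writability comes from a clone
zfs promote rpool/root-new                         # reverses the origin dependency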

If a modern COW filesystem is fundamental to the “transactional update” concept, I suppose it could be done in ZFS, assuming the many tools SUSE had to develop to make this work contain no technical barriers.

Skimming the ref., two statements struck me.

1. A transactional system needs strict separation of applications, configuration and user data. Data in /var must not be available during the update.

Using BTRFS and snapper in a more conventional way for its BE-type function has always meant making sensible choices about subvolume layout and which parts are included in snapshots, continuity of logs being one thing to think about (an illustrative layout is sketched after point 2). No different really to FreeBSD when choosing the system ZFS dataset layout.

2. Only if the update was successful as a whole will the system boot into the new snapshot.

Define successful. Applying a set of updates which in themselves appear to work doesn’t guarantee a 100% working system after reboot.
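Going back to point 1, a layout along these lines (illustrative, loosely in the spirit of openSUSE’s defaults) keeps rollbacks away from data that must survive them:

/       subvolume @          snapshotted and rolled back as a unit
/var    subvolume @/var      logs, databases; excluded from snapshots
/home   subvolume @/home     user data; excluded from snapshots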

In the world of OpenZFS on Linux, zfsbootmenu is one tool that gives you a way to work with “boot environments” with a degree of ease. The boot menu of zfsbootmenu provides a “snapshot” section where you can snapshot an existing BE, clone, clone & promote, etc. It’s a very comprehensive tool.

Still trying to wrap my head around this. Reading more about “transactional updates”, it seems to differ from the paradigms we’re familiar with on ZFS (and snapshots / rollbacks in general). See @Krisbee’s reply above.


In its defense, “transactional updates” is debuting with openSUSE Aeon, which specifically caters to ease-of-use. Their target base is home users and developers who “just need my PC to work”. It doesn’t sound like they are marketing to the enterprise.

Just look at their website:


But that still modifies the live filesystem, as far as I understand. So, too, does the current implementation in several Linux distros that use BTRFS: they have menu entries in GRUB that allow you to boot into a previous “snapshotted root”. (Both implementations still modify the live filesystem if you run an update while using the running system.)


I’m pretty familiar with the basics of ZFS at this point, but reading that type of description for BTRFS just makes my head spin. :face_with_spiral_eyes: Do you know of a good primer for BTRFS? (Easy to understand, sort of like Ars Technica’s primer for ZFS?)

I’m still annoyed that you need to “mount” a subvolume in order to change its properties. (Such as changing the inline compression property.)
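i.e. something like this, with a made-up device and subvolume:

mount -o subvol=@data /dev/sdb1 /mnt      # the subvolume must be reachable via a path first
btrfs property set /mnt compression zstd  # only then can its property be changed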


Sounds like it’s possible in the future:


That’s not a unique problem for “transactional updates”. It can occur on any system, CoW or not, snapshots or not.

Their idea of success is that the update process completed 100%. (Not partially, no broken packages, no dependency issues.) That’s good to have as a bare minimum before rebooting your system, regardless of boot environments or snapshots. :sunglasses:

A decent BTRFS primer: Btrfs filesystem wiki with guides, facts and best practices. | Forza's Ramblings

1 Like

To give a bit of history to BTRFS, one thing that has improved is the ability to select which snapshot / sub-volume to boot.

Specifically, the original method for selecting root BTRFS snapshot / sub-volume was by numeric ID passed on the kernel command line in Grub. This allowed you to select which ABE to use, (Alternate Boot Environment). Different Grub entries could boot older BEs, or different Linux distros.

However, you could not get the numeric ID until you created the BTRFS snapshot or sub-volume. Then you basically had to edit both any existing BEs and the new BE to include that ID in Grub’s /etc/grub.d/X entry. Otherwise, a simple mistake could make a BE appear to vanish in Grub. YES, THAT HAS HAPPENED TO ME! (And potentially edit /etc/fstab on all BEs too.)
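An entry from that era looked something like this, (ID made up);

linux /vmlinuz-linux root=UUID=... ro rootflags=subvolid=267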

Now it should be said that BTRFS supported, (and still supports), a “default” BTRFS sub-volume. That can be the “normal” boot environment. However, using such means you have to change your “default” BTRFS sub-volume before you reboot to another boot environment. So, no Grub-selected alternate boot environments were possible, (when using only the “default” BTRFS sub-volume).
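That mechanism is driven by commands like these, (ID made up);

btrfs subvolume list /              # find the numeric ID of the wanted sub-volume
btrfs subvolume set-default 267 /   # make it the default on the next mount / boot
btrfs subvolume get-default /       # verify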

Later, (still before I migrated to OpenZFS for my Linux computers), BTRFS added the ability to use the NAME of the snapshot / sub-volume in the kernel command line in Grub. This meant you could edit /etc/grub.d/X before you performed the snapshot. (And potentially /etc/fstab too.) That meant the change automatically applied to the new BE. (But, still not any older BEs.) Much more user friendly.

Note that Grub 2.0’s automatic generation of entries has been worthless for me. First, it did not accommodate A/B hard partition alternate boot environments. (Which I used before BTRFS & OpenZFS.) Then not BTRFS alternate boot environments. Later, not OpenZFS alternate boot environments.

As an example, this is what I use in Grub for OpenZFS;
linux /linux-6.1.31-gentoo.2 root=ZFS=rpool/root/20240714 rootfstype=zfs ...
A change in the date to match the ZFS clone’s name, and I am good for another BE.

Something similar, showing the BTRFS pool by UUID, but the sub-volume by NAME;
linux /btrfs-root/boot/vmlinuz-linux root=UUID=9c92ed7c-e128-4b60-a9de-9c8419fe083d ro rootflags=subvol=btrfs-root quiet add_efi_memmap

The one thing that unites ZFS and btrfs users. A profound distaste for GRUB’s abysmal usability.

2 Likes

BTRFS users have grub-btrfs; I don’t know if @Arwen has used that on Gentoo.

systemd-boot is an alternative, which is used by openSUSE Aeon.

The only correct solution, in my view, is one that has finally been gaining traction:

Skip the stupid bootloaders and have UEFI load the kernel and other required bits directly.

This is what ZFS Boot Menu does, and the result is effectively still a bootloader, but one that’s much more user- and admin-friendly, and which sidesteps all the crap around GRUB.

3 Likes

Netgate implemented this with their pfSense+ (24.03) in their ZFS implementation.
Redmine Ticket

2 Likes

That’s the standard FreeBSD bootloader and ZFS boot environments at work under the hood. FreeBSD can do this out of the box.

The booting part, yes… I was more thinking about the “installing updates to a mounted snapshot” part :slight_smile:

Miraculous what can be done when an OS is developed as an OS rather than as a kernel. Userland? Who cares, that’s somebody else’s problem. /s

FreeBSD does that out of the box, too. Netgate provided a UI for it.

3 Likes

Looks like some recent happenings. :+1: (But I’m not complaining!)