A discussion on SLOG/ZIL device

Great discussion here. I think the results are contrary to most of what people read, so I hope we help at least some people understand that a mirrored SLOG is really overkill for most use cases. I don’t even think it’s necessary in an enterprise environment, to be honest.

Edit: to be clearer about why: you need two rare events to occur at nearly the exact same time. I feel you’re more likely to lose a write to a catastrophic system failure in the middle of a RAM write, or to a catastrophic RAM error that causes a kernel panic, than to a non-mirrored SLOG drive failing at the exact moment you are trying to recover from another rare failure.

Fixing that for you:

SLOG is only for sync writes. Most home use cases do not need sync to begin with. (And sync writes in Time Machine backups may be left to the in-pool ZIL: there’s no user-facing performance to gain from speeding up a background task.)

I don’t agree at all.

Sync writes are forced on almost every machine I build. As has been discussed above, it is cheap, easily accessible insurance against the second most likely cause of data loss on ANY system, which is sudden power loss while data is in RAM.

Sync writes are not some anomaly and are not some ‘background task’ with no effect. They take forever on many pools and are necessary for many workloads, even those of a home user.

I just completely disagree with what you’ve said. You’re not ‘fixing’ anything. Sudden power loss occurs with enough frequency in the real world to be defended against if you care about your data. Sync writes do this and become bearable with a SLOG. Mirroring the SLOG is what we are talking about here, and the fact that it is mostly unnecessary.

Seconding this. I’ve disabled sync for the Time Machine share and have seen no performance gains, at least no perceivable ones.

Name a few, please. Like 3-5 or so. It would be even nicer if these workloads were not ZFS-specific, but TrueNAS-specific.

I told you the biggest one: any workload where you want to avoid data loss after power loss. ANY. You will almost assuredly lose data on power loss without sync writes. But more specifically: any iSCSI workloads, VM datastores for Proxmox or other hypervisors, NFS share mounts used for anything, Proxmox Backup Server backups, Veeam workstation backups. I’ll add more if you really want. You have pretty tame homelab needs if you see no benefit to sync writes. You also take large risks on power loss, but you do you. (And this is all ZFS-specific; I’m talking about ZFS only. TrueNAS is ZFS.)

NFS is async by default.

Proxmox Backup Server is also one place where I could not care less. Even if a power loss results in data loss (I doubt this is even possible, since it issues a sync at the end of a backup job), it does not matter much thanks to verification.

It is?

Somehow a homeuser became a homelabber in transition…


It actually should be counted as one.

You mean PBS that uses TrueNAS as its datastore’s backend? Or just PBS running on top of ZFS?

I don’t have experience with it. However, if we are talking about power loss, I assume the result would be a failed backup. From my POV, a failed backup is not data loss.

Yeah, I’m just really curious whether I really need SLOG without realising it.

No, it’s not. It’s a “true” network-attached storage (that uses ZFS under the hood). And accessing storage via network differs from accessing it locally.

I think we’re off track here. I couldn’t care less if you want to take risks with power loss or don’t run workloads requiring sync writes like I and many others do. This thread is about and for people who have decided they want to, or must, enforce sync writes and so need a SLOG for better than snail’s-pace performance. When this decision is made, do you mirror that SLOG or not? That is what we were discussing before a bunch of you decided to try and defend your use of async in your pools. I don’t really care, and many of us don’t. Please start another thread to defend that position; I’m sure it’s great for you. I’ve seen enough data loss and corruption from sudden power loss to not think async is a great idea. NFS defaults are tuned for light and casual use, but sync is absolutely still recommended for things like VM storage.

I know English isn’t your native language, but this is somewhat pedantic. TrueNAS uses ZFS exclusively to store files. It is the entirety of what we are talking about. It is the underlying system in which your files are stored. Whether it uses sync writes is set per dataset, which is one of the best features of ZFS.
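To make the per-dataset point concrete, here is a minimal sketch in Python. It assumes hypothetical dataset names (`tank/vms`, `tank/timemachine`), and the helper `zfs_set_sync_command` is made up for illustration. It only builds the `zfs set sync=...` command line and does not run anything against a pool.

```python
from typing import List

# The three values the ZFS `sync` dataset property accepts.
VALID_SYNC_MODES = ("standard", "always", "disabled")


def zfs_set_sync_command(dataset: str, mode: str) -> List[str]:
    """Build (but do not execute) the command that sets `sync` on one dataset.

    `dataset` is a hypothetical name such as "tank/vms".
    """
    if not dataset:
        raise ValueError("dataset name must not be empty")
    if mode not in VALID_SYNC_MODES:
        raise ValueError(f"sync mode must be one of {VALID_SYNC_MODES}")
    return ["zfs", "set", f"sync={mode}", dataset]


if __name__ == "__main__":
    # Force sync writes for a VM datastore, leave a backup share at the default.
    print(" ".join(zfs_set_sync_command("tank/vms", "always")))
    print(" ".join(zfs_set_sync_command("tank/timemachine", "standard")))
```

So one pool can force sync on the dataset holding VMs while leaving a backup share at `standard`, which is why the sync-or-not question can be answered per workload rather than per system.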

I would love to have another thread to talk about whether sync writes are right for your use case or not. I think it is a separate thread, though. Many will find this thread when searching for whether a mirrored SLOG is necessary, and I’d love for them to get the details of the discussion we had before we got sidetracked.

I would not sign off on this very generic statement ;)

It totally depends on risk appetite and error probabilities.

If you have a wonky consumer NVMe as SLOG because it’s still faster at sync writes than your old HDDs, then sure, a mirrored SLOG is a good idea.

If you have an Optane in a chassis full of enterprise SAS drives, then it can still be a good idea if your data (VMs or whatever) is really mission critical.

Yes, a lot of things must go wrong at the same time for you to need the mirrored SLOG, but if your job depends on keeping the data consistent … (yes, unrealistic [I hope], just to make the point).

So - (what’s the chance of your server crashing mid-write) x (what’s the chance your SLOG is failing [right at that moment, silently, without you realizing it]) x (how important is this data) tells you whether you need a mirror or not.
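The rule of thumb above can be sketched as a tiny calculation. This is only an illustration: it assumes the two failures are independent, the function names are made up, and every number in the usage example is a placeholder, not a measured failure rate.

```python
def expected_loss(p_crash_mid_write: float,
                  p_slog_fail_then: float,
                  data_value: float) -> float:
    """Chance of a crash mid-write x chance the SLOG fails at that moment
    x how much the data is worth (in whatever unit you care about).

    Assumes the two events are independent, which is a simplification.
    """
    for p in (p_crash_mid_write, p_slog_fail_then):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must be between 0 and 1")
    return p_crash_mid_write * p_slog_fail_then * data_value


def mirror_worthwhile(p_crash_mid_write: float,
                      p_slog_fail_then: float,
                      data_value: float,
                      mirror_cost: float) -> bool:
    """True when the expected loss exceeds the cost of a second SLOG device."""
    loss = expected_loss(p_crash_mid_write, p_slog_fail_then, data_value)
    return loss > mirror_cost


if __name__ == "__main__":
    # Placeholder numbers only: plug in your own estimates.
    print(mirror_worthwhile(0.5, 0.5, 100.0, mirror_cost=20.0))
    print(mirror_worthwhile(0.5, 0.5, 100.0, mirror_cost=30.0))
```

With realistic (very small) probabilities, the product only exceeds the cost of a second device when the data value is very large, which matches the "mission critical" case above.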
