Feature Request: SLOG sharing boot pool disks

Disclaimer
First off, I know this is not currently supported by TrueNAS. I have read a number of articles in the forum on how this is possible (but unsupported). I am not interested in pursuing an unsupported possibility. The purpose of this new thread is to explore the reason why this can or cannot be supported by TrueNAS in the future (i.e., if this is a bad idea, why is it a bad idea?).

Use Case
I have edge servers that are relatively small where SATA ports are in limited supply (total of 9). The servers have 128GB of (ECC) memory to run a few VMs at the edge in addition to serving all the data needed locally at the site. Currently, there are 6x 8TB HDDs (WD Red) making up the main pool in a 3x2 mirrored VDEV configuration. And, there are 2x 2TB SSDs (CMR) making up the mirrored boot pool. (They are 2TB because they were cheap, not because I needed 2TB.) There is a 16GB swap partition.

I am serving iSCSI zvols from the main pool to the VMs. Everything else is NFS datasets. The TrueNAS server is also used to back up devices that are local to the site.

Every disk is mirrored so that a single device failure does not create an emergency but rather a planned repair.

Context
In the documentation,

A separate high-speed SLOG device provides the performance improvements so ZIL-based writes are not limited by pool input/outputs per second (IOPS) or penalized by the RAID configuration. Using a SLOG for ZIL is recommended for database applications, NFS environments, virtualization, and data backups. In general, a storage environment with heavy synchronous writes benefits from using a SLOG for the pool ZIL.

Question
In the same way that TrueNAS will configure a 16GB swap partition on the mirrored disks that are serving the boot pool, is there a downside to having TrueNAS also create a similar, additional 16GB partition as an option to serve as a mirrored VDEV to host a SLOG for the main pool?

Well, splitting a slog so you can have a single slog device be used by multiple pools is one thing.

Splitting it so you can then mirror it is a different thing, and creates a false sense of redundancy while probably causing I/O bottlenecks, as the same device is asked to mirror everything it does.

As I understand it, the mirror part of the suggestion is orthogonal to the issue of sharing the disk.

The obvious downside is that an SLOG requires a fairly specialized SSD, whereas the boot pool needs something that is not a complete dumpster fire of a disk. That’s not conducive to a good user experience if you then start randomly suggesting to clueless user #1573 that they can add an SLOG partition (which they probably don’t need) to their boot media (which is probably not suitable in the first place).

I don’t understand what you mean by “splitting a slog”.

I only have a boot pool and the “main” pool on these edge servers. Currently, I have the ZIL on the data disks serving the main pool. I want to move this to SSDs but don’t have two available SATA ports to do this.

The boot pool on the SSDs has A LOT of available space. I want to use 16GB on each of the 2x SSDs to be the SLOG for the main pool.

I missed that you have two boot disks that you want to partition and then have a mirror.

I thought you wanted to partition another SSD and then add both partitions as a mirrored slog.

You can do what you want. It’s unsupported, requires command-line hacking (and in fact installer hacking), and I don’t think doing this will become supported anytime soon.

AND it’s unlikely that the boot disk is a suitable device for use as a slog.

You do realize you need power loss protection on the SLOG, if you actually want the benefits of sync writes, right? “Free space” is not even on the list of relevant criteria when choosing a suitable SLOG device, given the relatively tiny needs in terms of space.
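To put a rough number on "relatively tiny", here is a back-of-the-envelope sketch in Python. It assumes the common rule of thumb that a SLOG only has to hold the sync writes arriving between transaction group commits. The 5-second interval matches the OpenZFS default for `zfs_txg_timeout`; the factor of 3 transaction groups in flight and the link speeds are assumptions for illustration, not TrueNAS guidance.

```python
def slog_size_gib(ingest_gbit_per_s: float,
                  txg_timeout_s: float = 5.0,
                  txgs_in_flight: int = 3) -> float:
    """Rough upper bound on the SLOG space needed, in GiB.

    Assumes the network link is the bottleneck and that every incoming
    byte is a sync write, which overstates most real workloads.
    """
    bytes_per_s = ingest_gbit_per_s * 1e9 / 8
    return bytes_per_s * txg_timeout_s * txgs_in_flight / 2**30


if __name__ == "__main__":
    for link in (1, 10):  # assumed link speeds in Gbit/s
        print(f"{link:>2} Gbit/s link: about {slog_size_gib(link):.1f} GiB")
```

Under these assumptions a 1 Gbit/s link needs under 2 GiB, which is why capacity ranks far below latency, endurance, and power loss protection when picking a SLOG device.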

If you don’t care, then just disable sync writes and be done with it. No performance penalty, and no illusions.

@Stux
I specifically indicated that I do not want to follow an unsupported path and that I was asking about why this can or cannot be a feature in TrueNAS (for a small edge server with one data pool).

The boot devices are not very busy and have lots (100s of GBs) of available space. They are SSDs while the data disks are HDDs. The documentation says that

You can further improve ZIL performance by using a dedicated vdev called a separate intent log (SLOG). A SLOG moves the ZIL to a dedicated SSD(s) instead of a section of the data disks to function. This can provide a large benefit due to lower latency of a SLOG on SSD(s) vs data disks.

Why do you think “it’s unlikely that the boot disk is a suitable device as a slog”?

:point_down:

@ericloewe
I do, but I am missing your point about power loss protection. The SSDs are non-volatile storage. And, the server as a whole is protected by a UPS giving it time to shut down in the event of a prolonged outage.

And, yes, I do want to get the benefit for synchronous writes. The documentation says that

Disk latency and write endurance capability are the primary concern for slog devices. You might need multiple striped slog devices to reach write endurance and the synchronous write throughput needs of each slog device. Combined SLOG write throughput should be higher than the planned synchronous write throughput of the pool.

So, using a CMR SSD would satisfy my requirement.

Because you said they were “cheap” 2TB devices.

A suitable SLOG needs reliable high speed small block sequential sync write performance and very high endurance. Every sync byte received by the server is written to the SLOG using a sync write before being acknowledged to the client.

That means either enterprise-style power loss protection (PLP) support (flash, power-backed RAM) or something else magic (Optane).

Do your boot disks match that?

That’s an abstraction. SSDs have caches and they’re unpredictable (to us they are, anyway).

Power outages are not the main concern, because the other end of the connection is probably also going down in the middle of doing something. Literally anything else that kills your server is: panics, hardware failures, etc.

Conventional Magnetic Recording Solid State Disk.

Well. Yes. HDs do work. As long as they implement sync write support correctly. But they are slow.

This is why SLOG was invented. To make sync writes faster than HDs.

The original Sun implementation used battery backed RAM disks iirc

By “cheap”, I meant relative to retail price - i.e., I got a good deal. Perhaps, “inexpensive” would have been a better word.

I am using Samsung Enterprise SSDs. If these are not sufficient, in your opinion, I am open to recommendations for better quality SSDs.

Do all enterprise SSDs have this caching issue? If true, why does the documentation suggest that they would be suitable when other failure concerns are strongly called out (e.g., be sure to use ECC memory)?

How does one check from the SSD specification that sync write support is implemented correctly (i.e., what does one look for)?

Enterprise SSDs usually have the PLP support required to have good sync performance (it’s one of the differentiators between enterprise and consumer)

You can benchmark one of the drives to see.

(Or you might even find a benchmark)
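For a quick first impression before reaching for a proper benchmark tool, a minimal Python sketch like the one below times small writes that are each followed by `fsync`, which is roughly the access pattern a SLOG sees. The file name `slog_probe.bin`, the 4 KiB block size, and the write count are assumptions for illustration. The scratch directory must sit on the SSD under test, and the numbers are only a coarse indication, not a substitute for a dedicated benchmark.

```python
import os
import statistics
import tempfile
import time


def sync_write_latencies(path: str, block_size: int = 4096,
                         count: int = 200) -> list[float]:
    """Write `count` blocks, fsync after each, return latencies in seconds."""
    payload = os.urandom(block_size)
    latencies = []
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        for _ in range(count):
            start = time.perf_counter()
            os.write(fd, payload)
            os.fsync(fd)  # wait until the OS reports the data as flushed
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


if __name__ == "__main__":
    # Assumption: the default temp directory lives on the SSD under test.
    # If it is a RAM-backed filesystem, fsync is nearly free and the
    # result says nothing about the drive; pass dir=... to override.
    with tempfile.TemporaryDirectory() as scratch:
        lat = sync_write_latencies(os.path.join(scratch, "slog_probe.bin"))
    print(f"median {statistics.median(lat) * 1e6:.0f} us, "
          f"max {max(lat) * 1e6:.0f} us, "
          f"about {1 / statistics.mean(lat):.0f} sync writes/s")
```

Drives with power loss protection can usually acknowledge a flush from their protected cache, so they tend to show much lower and steadier latencies here than consumer drives.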

Thank you for the reference. But, back to my original question.

Assuming that I have a pair of enterprise SSDs with the PLP support required to have good sync performance, and given that enterprise SSDs are always significantly larger than are needed for a boot pool:

Question
In the same way that TrueNAS will configure a 16GB swap partition on the mirrored disks that are serving the boot pool, is there a downside to having TrueNAS also create a similar, additional 16GB partition as an option to serve as a mirrored VDEV to host a SLOG for the main pool?

Relative to the PLP support, I found this thread on the old forum.

And, reviewing this whitepaper from Samsung, I should have considered the SM863 for write-intensive applications. The 870 Evo does not have (anywhere documented) PLP support.

Looking at the latest in the Samsung Enterprise SSDs, the PM1733 NVMe SSD is the one to consider.

LOL, in what crazy world is the EVO series considered enterprise? Not even Samsung themselves market them as enterprise. I can’t even find one mention of the word enterprise on that page. Unlike this page here where they clearly mention it right in the heading.

EVO is their budget line. I thought you’d at least choose the PRO line (which is also not enterprise), but at least a step up. You need to go to at least the (S/P)M-863 series to get entry level enterprise.

Please no.

  • SLOG duty is a write-intensive application that gnaws at a drive’s endurance;
  • You require said drive to perform to its best in order to not further slow down your pool.
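The endurance point can be made concrete with some simple arithmetic, sketched here in Python. Every number in the example is hypothetical: the TBW rating, the daily sync write volume, and the write amplification factor are placeholders, to be replaced with the drive's datasheet value and measured workload.

```python
def endurance_years(tbw_rating_tb: float,
                    sync_writes_gb_per_day: float,
                    write_amplification: float = 1.0) -> float:
    """Years until the rated write endurance (TBW) is used up.

    Ignores the boot pool's own writes, which come on top of SLOG traffic.
    """
    flash_gb_per_day = sync_writes_gb_per_day * write_amplification
    return tbw_rating_tb * 1000 / flash_gb_per_day / 365


if __name__ == "__main__":
    # Hypothetical drive rated for 600 TBW taking 500 GB of sync writes a day.
    print(f"{endurance_years(600, 500):.1f} years at write amplification 1.0")
    print(f"{endurance_years(600, 500, 3.0):.1f} years at write amplification 3.0")
```

Small sync writes tend to push write amplification up, so the pessimistic figure is the one worth planning around when the same drive also has to keep the system bootable.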

I get the point, though. I have also been wondering if there was anything I could do to make use of the free space on my mirrored 240GB boot SSDs, but so far I haven’t really found anything worth the hassle. I have considered moving my jails to the SSDs, but that would get me into trouble if my boot pool crashes and I need to reinstall.

In the end, I do not want to get creative and endanger the system, as it is running very well and the potential performance gain would not justify putting the system at risk.

@Whattteva
Yes. Big error on my part.

I thought CMR was all that mattered. I understand now that PLP support is also required. The drives have already been replaced.

Now that the type of drives for my boot pool has been addressed, are you able to answer my original question?