We’re getting some Dell R630 servers on which we plan to run 2 pools. Each pool is 4x 4TB drives in striped mirrors (“RAID10”), so 8TB usable per pool and 2x 8TB in total. These are used as VM storage over NFS (NFS is the only thing we know, but we have the option in XO to use iSCSI as well). All storage is SSDs.
The servers are set up with dual power supplies (separate power sources), dual CPUs / memory, a redundant 10Gb network between the devices and backup generator electricity, but more importantly we run a replication task from the above servers to secondary “failover” TrueNAS servers on an hourly basis.
We don’t have dedicated SLOGs yet, but are considering getting Radian or Optane devices from eBay.
So my understanding here is with how data is written:
- ASYNC writes are acknowledged as soon as the data is in memory, before it has been persisted to disk. This is the fastest possible speed we will get, but the risk is that a sudden power loss, kernel panic etc. will lead to data loss (and possibly corruption of VHD files), as what the other end expected to be written to disk has not been written.
- SYNC writes wait until the storage acknowledges the write. This is the slowest path, but it gives the strongest guarantee that there will be no data corruption during the unexpected.
- The middle ground is to have a SLOG (which I believe is the default in TrueNAS), which allows the data to be written to a SLOG that by default is part of the pool. In this default scenario performance is somewhat degraded, as the disks are used for both the SLOG and permanent storage. The advantage however is that after an unexpected power loss, the ZIL can be read back from the SLOG to complete the last transactions that were previously considered written, and it’s high fives all around.
- The purpose of a dedicated SLOG (not part of the pool) is thus to speed up this process. We want a SLOG that is as close as possible to RAM speed, so that the SLOG can record transactions and the confirmation of a completed write is sent back over NFS faster. Although the data has not been written to the pool disks yet (it’s in memory and written to the SLOG), the confirmation is sent to the other end because the SLOG provides a “persisted guarantee”, and this may improve performance (provided the SLOG is faster than the SSDs). Lastly, the SLOG is never read from in order to write to the disks (data is still written from memory); the SLOG is only a “backup” of the data to be written in the event of unexpected power loss. So the SLOG is simply a temporary space that offloads the intent log for in-memory transactions to “persisted storage”, allowing an “earlier” guarantee of storage written for SYNC writes.
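To make the ASYNC vs SYNC distinction above concrete from the writer's side, here is a minimal Python sketch. It is only an illustration of the general idea (a write that returns once the data is buffered versus one that blocks until the OS reports it flushed), not how XO or the NFS client actually issues writes. The helper name `write_block` and the file names are made up for the example.

```python
import os
import tempfile


def write_block(path: str, payload: bytes, sync: bool) -> int:
    """Append payload to path and return the number of bytes written.

    sync=False: return as soon as the OS has buffered the data (async-style).
    sync=True:  additionally block on fsync() until the OS reports the data
                as flushed to stable storage (sync-style).
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)
    try:
        written = os.write(fd, payload)
        if sync:
            # This is the call whose latency a fast log device is meant to reduce.
            os.fsync(fd)
        return written
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "example.bin")  # hypothetical file name
        write_block(target, b"\0" * 4096, sync=False)
        write_block(target, b"\0" * 4096, sync=True)
        print(os.path.getsize(target), "bytes on disk")
```

On a ZFS dataset, it is the `sync=True` path that lands in the intent log before it is acknowledged, so that is the path a dedicated log device would be expected to help.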
So here are the questions:
- Is my understanding above correct?
- It seems that over the years the opinion on having a mirrored SLOG has changed. I understand the logic behind mirroring it, as this gives us somewhat of a guarantee that even in the unexpected (with a further unexpected SLOG failure during the same period), the SLOG will still be available. Without the SLOG (so a SLOG failure during an unexpected failure), TrueNAS will not be able to mount the pool and we’ll need to boot into an emergency mode and run some commands to bring the pool back up, and then start assessing the damage. It does however seem that the opinion has shifted with better storage options such as the Radian or Optane drives. It seems that the Radian devices offer power loss protection, as do the Optane devices. With these devices in place, is it REALLY necessary to have mirrored SLOGs? My biggest concern is that the R630 only has 3 PCIe slots, and if we have 2 pools / vdevs we will need 2 of them for SLOGs, and we want to add an additional 10Gb SFP+ card for network redundancy. I understand it’s a measurement of risk, but I’m curious to know whether SLOGs are still mirrored in 2024 with better hardware in place.
- Assume I know nothing about Optane or Radian devices (as that would be the safest assumption). From what I can gather, the Radian devices offer almost unlimited TBW / endurance, whereas Optanes have much higher endurance than standard enterprise SSDs but still have a “limit”. What is currently considered the better option between, say, a Radian RMS-200 and an Optane 900P/905P? I cannot really seem to find RMS-300s on eBay, but there are a couple of RMS-200s. As the R630 is DDR4, my logical opinion is that by this simple math the RMS-200s will by default be slower, simply because of the memory speed difference. I assume that the Optane 900P/905Ps will be faster in this regard, but obviously have the pitfall of “eventually” reaching their TBW limits. What is considered the industry standard with regards to these?
- If we go the Radian route (most seem to be 8GB units), will this suffice on a 10Gb network with 4x 4TB drives in a pool? If my math is correct, the maximum traffic we will reach over a 10Gbit network will be around 1.25GB per second. If the persisted storage has a hard limit of around 500MB/s (SSD), then without a SLOG in place we’re looking at around 2.5 seconds to have an entire 1.25GB written (best case scenario). With a Radian RMS-200 and its 8GB storage cap, we should still be within the limit (considering best case scenarios). I know my math here is probably not a real world scenario, so I’m hoping someone can give me a better explanation.
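For anyone who wants to check the arithmetic in the last question, here is a rough Python sketch of it. The function names are made up for this post. The 5 second figure is the OpenZFS default for `zfs_txg_timeout`; the "two transaction groups in flight" multiplier is only a rule-of-thumb assumption on my part, and the whole thing is a worst-case upper bound at full line rate, not a measurement.

```python
def line_rate_bytes_per_s(link_gbit: float) -> float:
    """Theoretical link throughput in bytes/s (ignores protocol overhead)."""
    return link_gbit * 1e9 / 8


def seconds_to_flush(data_bytes: float, pool_write_bytes_per_s: float) -> float:
    """Time for the pool to absorb data_bytes at a given sustained write rate."""
    return data_bytes / pool_write_bytes_per_s


def slog_bytes_needed(ingest_bytes_per_s: float,
                      txg_timeout_s: float = 5.0,   # OpenZFS default zfs_txg_timeout
                      txgs_in_flight: int = 2) -> float:  # assumption: rule of thumb
    """Upper bound on sync data the log must hold before it has been committed."""
    return ingest_bytes_per_s * txg_timeout_s * txgs_in_flight


if __name__ == "__main__":
    ingest = line_rate_bytes_per_s(10)            # 10Gbit link
    flush = seconds_to_flush(ingest, 500e6)       # 500MB/s figure from the post
    needed = slog_bytes_needed(ingest)
    print(f"line rate:       {ingest / 1e9:.2f} GB/s")
    print(f"flush 1 second:  {flush:.1f} s at 500 MB/s")
    print(f"SLOG worst case: {needed / 1e9:.1f} GB vs 8 GB device")
```

Under these assumptions the worst case comes out larger than 8GB, but that requires sustained line-rate SYNC writes for several seconds, and ZFS throttles incoming writes when the pool cannot keep up, so real usage should sit well below this bound.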