M.2 NVMe Recommendation

Well, I don’t think PLP is mandatory for data vdevs if you have a proper UPS that can handle the load; it’s only needed for SLOG/L2ARC devices.

I’m not sure whether PLP is what gives sustained read/write speeds, though. Do you have any recommendations? What NVMe are you using?

Yes, I’m aware of this fact!

But does that mean the enterprise drives are DRAM-less (cacheless) NVMes, since they have better sustained read/write speeds?

I think you’re confusing PLP for sync writes with the power-loss protection that a UPS offers. Sync writes offer you safety, and that safety is a guarantee that in the event of some catastrophic failure (not necessarily loss of your main power supply), whatever the drive tells the controller it has received is actually safe (read: sync write) and won’t be lost. Drives that do not have PLP cannot make this guarantee, which is why their sync write performance is far lower.

Read above. The DRAM needs PLP to actually guarantee that data in flight is safe in the event of a catastrophic failure and that any remaining data in flight will be flushed. Again, drives without this feature cannot make that guarantee, hence sync write performance suffers. Once again, this does not apply to other operations (e.g. reads, async writes).
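
To make that concrete, here is a minimal sketch of how you could see the gap yourself with dd; the path and size are just examples, not anything from this thread, and oflag=sync forces every write to be a sync write while the first run is a normal async write.

# Async (default) write to a test file on the pool:
dd if=/dev/zero of=/mnt/pool/dataset/testfile bs=1M count=4096

# The same write, but opened with O_SYNC so every block is a sync write:
dd if=/dev/zero of=/mnt/pool/dataset/testfile bs=1M count=4096 oflag=sync

On a drive without PLP (and with no SLOG), the second run is typically far slower; that gap is exactly what PLP narrows.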

As for drives, I am currently not using any NVMe drives. I am just using cheap second-hand enterprise SATA SSDs, mostly Intel SSD DC series and Samsung SM series. You do not need this if your workload does not require a lot of sync writes, or you can disable sync altogether (which typically results in an orders-of-magnitude performance boost) if the safety tradeoff is worth it for you. It really depends on your risk tolerance.
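
If you do go the disable route, the switch is the dataset’s sync property; a minimal sketch with a hypothetical pool/dataset name:

# Check the current setting (standard = honour application sync requests):
zfs get sync tank/mydataset

# Disable sync writes entirely (fast, but sync data in flight can be lost on a crash):
zfs set sync=disabled tank/mydataset

# Revert to the default behaviour:
zfs set sync=standard tank/mydataset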

No, enterprise drives have those too, but they’re protected, which is why they can make this guarantee, which in turn improves sync write speeds.
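
If you’re curious what a particular drive claims about its cache, nvme-cli’s identify-controller output includes a volatile-write-cache (vwc) field; treat it only as a hint, since the vendor datasheet is the authoritative place to confirm PLP, and the device name below is just an example.

# Show the volatile write cache field reported by the drive:
nvme id-ctrl /dev/nvme0 | grep -i vwc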

I’ll throw the Kingston KC3000 series into the mix: rock solid, doesn’t have the firmware issues the Samsung 980 Pros had, or the performance tanking people saw with the 990 Pros… Agree Samsung quality seems to be going downhill from their earlier SSD/NVMe days.

So, from what I understand, only drives with PLP offer higher sync write performance? Is that what you mean?

No doubt

@etorix @ericloewe @joeschmuck What do you think about this card?

As per the specs, it supports 256 Gb/s. If I’m not wrong, this means each drive on the card could individually do roughly 6.4 GB/s? Provided that the AIC is installed in a CPU-attached x16 slot.

PLP drives are the ones that can offer higher sync write performance safely.

See this thread: Does a pool of mirrored VDEVs benefit from an SLOG? - #15 by NickF1227

It’s fine, but I won’t make any claims about it working well at PCIe 4.0 speeds - lots of variance across motherboards has led to some passive adapters having some trouble at the higher speeds.

Works without any issues at PCIe 3.0 speeds in my experience.

PCIe 4.0 is roughly 2 GB/s per lane, so each SSD would be 8 GB/s, for a total of 32 GB/s.

Oh, I see. I was not aware of this fact. Thanks for letting me know!

Can you explain this please?

2 GB/s per lane. 4 lanes per M.2. 4 M.2 slots in the card, 16 lanes in total.
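
As a back-of-the-envelope check, assuming a round ~2 GB/s of usable bandwidth per PCIe 4.0 lane (real-world numbers are a bit lower):

# ~GB/s per x4 NVMe drive:
echo $(( 2 * 4 ))       # -> 8

# ~GB/s for four x4 drives behind an x16 slot:
echo $(( 2 * 4 * 4 ))   # -> 32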

Oh, well. That’s roughly what I guessed 🙂

BTW, how many lanes does a Gen3 NVMe use?

I ran the same tests that @NickF1227 displayed above, performed on a RAIDZ1 of four Gen4 NVMe drives in a non-compressed dataset, and these are my results.

root@truenas[/mnt/farm/nocompress]# time dd if=/dev/zero of=./zeros bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 11.8932 s, 4.4 GB/s
dd if=/dev/zero of=./zeros bs=1M count=50000  0.02s user 6.49s system 54% cpu 11.895 total

root@truenas[/mnt/farm/nocompress]# time dd if=/dev/zero of=./zeros bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 11.9149 s, 4.4 GB/s
dd if=/dev/zero of=./zeros bs=1M count=50000  0.01s user 6.68s system 55% cpu 11.984 total

root@truenas[/mnt/farm/nocompress]# time dd if=./zeros of=/dev/null bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 11.5992 s, 4.5 GB/s
dd if=./zeros of=/dev/null bs=1M count=50000  0.02s user 11.46s system 99% cpu 11.600 total

root@truenas[/mnt/farm/nocompress]# time dd if=./zeros of=/dev/null bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 10.8099 s, 4.9 GB/s
dd if=./zeros of=/dev/null bs=1M count=50000  0.02s user 10.60s system 98% cpu 10.811 total

root@truenas[/mnt/farm/nocompress]# 

And to add a bit to these tests, a little random action to simulate real data.

root@truenas[/mnt/farm/nocompress]# time dd if=/dev/random of=./zeros bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 88.5059 s, 592 MB/s
dd if=/dev/random of=./zeros bs=1M count=50000  0.02s user 85.89s system 97% cpu 1:28.55 total

root@truenas[/mnt/farm/nocompress]# time dd if=./zeros of=/dev/null bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 10.1702 s, 5.2 GB/s
dd if=./zeros of=/dev/null bs=1M count=50000  0.02s user 10.08s system 99% cpu 10.171 total

root@truenas[/mnt/farm/nocompress]# time dd if=/dev/random of=./zeros bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 89.7006 s, 584 MB/s
dd if=/dev/random of=./zeros bs=1M count=50000  0.00s user 85.53s system 95% cpu 1:29.78 total

root@truenas[/mnt/farm/nocompress]# time dd if=./zeros of=/dev/null bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 10.8433 s, 4.8 GB/s
dd if=./zeros of=/dev/null bs=1M count=50000  0.01s user 10.62s system 98% cpu 10.844 total

root@truenas[/mnt/farm/nocompress]# time dd if=./zeros of=/dev/null bs=1M count=50000
50000+0 records in
50000+0 records out
52428800000 bytes (52 GB, 49 GiB) copied, 10.7881 s, 4.9 GB/s
dd if=./zeros of=/dev/null bs=1M count=50000  0.02s user 10.55s system 97% cpu 10.789 total
root@truenas[/mnt/farm/nocompress]#

The purpose of using if=/dev/random is that the data isn’t a long run of the same character that can be compressed; however, it does not demonstrate the fastest possible throughput, just something closer to real world. I’m not talking about the data being compressed when it is written to the drive, but rather that, internally, the system can tell the drive to write a zero ten bazillion times. The instruction is sent quickly, versus having to send each value individually. I’m keeping it simplistic and taking liberties here just to make it relatable.
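
For what it’s worth, the /dev/random runs above also sat at ~97% CPU, so some of that time may be the random-number generation itself rather than the pool. A minimal sketch of one way to keep the RNG out of the timed run, by staging the random file in tmpfs first (hypothetical size and paths, and it assumes enough free RAM for the staging file):

# Stage ~4 GiB of random data in RAM so the RNG isn't part of the timed write:
dd if=/dev/urandom of=/dev/shm/random.bin bs=1M count=4096

# Time only the write to the pool:
time dd if=/dev/shm/random.bin of=./randomfile bs=1M

# Clean up the staging file:
rm /dev/shm/random.bin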

Summary: My RAIDZ1 (four NVMe drives) gets 4.4 to 4.5 GB/s write and 4.9 GB/s read using zeros. Using random data it ranges from roughly 580 MB/s write to 4.8 GB/s read. So the data makes a difference. I would love to test a MIRROR configuration, but I’m not ready to destroy my pool for that.

4 lanes. The per-lane speed is just slower.
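
With round numbers, about 1 GB/s per PCIe 3.0 lane versus about 2 GB/s per PCIe 4.0 lane:

# ~GB/s for a Gen3 x4 NVMe drive:
echo $(( 1 * 4 ))   # -> 4

# ~GB/s for a Gen4 x4 NVMe drive:
echo $(( 2 * 4 ))   # -> 8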

Thank you for the results, really appreciate that. Finally I have some clue. What speeds are you getting over SMB, though? Are you on a 10GbE network or higher?

LOL, I max out my 1 Gbit connection. It’s just a small home setup, nothing fancy. I used NVMe for the lack of spinning-drive noise and the lower power draw, but mostly because I just wanted it.

Not sure I will ever jump up to a faster LAN unless it is wireless, which I’m not a huge fan of. I like a combination of wired and wireless.

And Optane.
