I am running 2 x Dragonfish-24.04.0 installs connected via 100GbE through Cisco Nexus 9K switches. I am seeing zero errors on either interface and both systems are linked at 100Gb/s, yet when I replicate snapshots from one to the other I see a cap at just over 8GB/s.
I know standard Linux needs some tuning to really get the full bandwidth, but I didn’t want to go messing with tunables in Dragonfish until I knew for sure this was the issue.
Are there tunables specific to Dragonfish and 100Gb Mellanox cards that would let me get this resolved?
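One thing I can try first, to take the disks out of the picture entirely, is pushing raw TCP between the two boxes; iperf3 with parallel streams is the usual tool for this, but here is the bare idea as a sketch (the peer address and port are placeholders, and a single Python stream won’t come anywhere near 100Gb on its own):

```python
# Minimal raw-TCP throughput check between the two systems (illustration
# only: the peer address and port below are made up, and one Python stream
# will not saturate a 100Gb link -- iperf3 with parallel streams is the
# real test).
import socket
import sys
import time

PEER = "192.0.2.10"          # assumed address of the receiving system
PORT = 5201
CHUNK = 4 * 1024 * 1024      # 4 MiB per send/recv
SECONDS = 10                 # how long the sender pushes data

def receiver():
    # Run this side on the receiving box: accept one connection and
    # report how fast the bytes arrived.
    with socket.create_server(("", PORT)) as srv:
        conn, _ = srv.accept()
        total, start = 0, time.time()
        while (data := conn.recv(CHUNK)):
            total += len(data)
        dur = time.time() - start
        print(f"{total / dur / 1e9:.2f} GByte/s "
              f"({total * 8 / dur / 1e9:.1f} Gbit/s)")

def sender():
    # Run this side on the sending box: blast zeroes for SECONDS seconds.
    buf = bytes(CHUNK)
    with socket.create_connection((PEER, PORT)) as s:
        deadline = time.time() + SECONDS
        while time.time() < deadline:
            s.sendall(buf)

if __name__ == "__main__":
    receiver() if "recv" in sys.argv else sender()
```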
I was hoping for much closer to 12GB/s, which would be near the max of the 100Gb (100,000,000 Kbit) interfaces. Am I asking too much? 8GB/s is only about two-thirds of line rate, unless I miscalculated.
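For reference, my math (assuming an ideal link with no protocol overhead):

```python
# Quick sanity check on the line-rate math (ideal link, no overhead assumed).
line_rate_gbit = 100                    # 100Gb interface
line_rate_gbyte = line_rate_gbit / 8    # = 12.5 GByte/s theoretical ceiling
observed_gbyte = 8                      # what the replication tops out at
print(f"{observed_gbyte / line_rate_gbyte:.0%} of line rate")  # -> 64%
```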
So I am running the LSI 3008 in a Gen3 x8 slot, which should give me the full 12Gb/s SAS speed according to the Broadcom website.
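Running rough numbers on that path (the per-lane figures in the comments are my assumptions about the encoding overhead):

```python
# Back-of-the-envelope ceilings for the HBA path (assumed figures:
# PCIe 3.0 ~= 985 MB/s per lane after 128b/130b encoding,
# SAS-3 = 12 Gb/s per lane -> ~1.2 GByte/s after 8b/10b encoding).
pcie3_x8_gbyte = 8 * 0.985           # ~7.9 GByte/s for a Gen3 x8 slot
sas3_x8_gbyte = 8 * (12 / 10)        # ~9.6 GByte/s across the 3008's 8 lanes
print(f"PCIe Gen3 x8: ~{pcie3_x8_gbyte:.1f} GByte/s, "
      f"SAS-3 x8: ~{sas3_x8_gbyte:.1f} GByte/s")
```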
I did a smaller replication this morning and monitored drive utilization with nmon; all of the drives across the array hit over 80% utilization, so I am guessing you were spot on that the limitation is the drives and not the network.
I am going to check with a much larger replication later today.
Thanks for getting the units right: Bytes with a capital ‘B’ vs. bits with a small ‘b’.
I have a flash-only NAS with two 8-wide raidz2 vdevs (a mix of enterprise 3.84 TB drives), all attached to a 9305-16i. While scrubbing, TrueCommand reports a read speed of 8-9 GByte/s. You are NOT going to even approach this speed with two raidz2 vdevs of spinning drives; 8 Gbit/s looks like a reasonable mark.
I think you can mark the thread as “solved”, with a negative answer: there’s no hard software limit (at least not this one), but your pool is very far from being able to saturate a 100 GbE link.
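As a rough illustration of why (the per-drive figure is just an assumption about typical spinning disks, not a measurement of your pool): a raidz2 vdev streams at roughly the sum of its data disks, and even that is the best case for sequential reads; replication traffic typically lands well below it.

```python
# Ballpark best-case streaming throughput for the pool (assumed numbers).
vdevs = 2
drives_per_vdev = 8
parity_per_vdev = 2                 # raidz2
hdd_stream_mb = 200                 # assumed sequential MB/s per spinning drive
data_drives = vdevs * (drives_per_vdev - parity_per_vdev)
pool_gbyte = data_drives * hdd_stream_mb / 1000
print(f"~{pool_gbyte:.1f} GByte/s best case, i.e. ~{pool_gbyte * 8:.0f} Gbit/s")
```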
Thank you and @Stux both for helping me figure this out as an array issue and not a network issue. As I expand the unit and add more vdevs, I suspect the speed will increase.