SOLVED - Dragonfish 10GB/s limit on 100GB/s Interfaces

I am running 2 x Dragonfish-24.04.0 installs connected via 100GB through Cisco Nexus 9K switches. I am seeing zero errors on either interface and both systems are connected at 100GB/s, yet when I am replicating snapshots from one to the other I am seeing a cap at just over 8GB/s.

I know standard Linux needs some tuning to really get the full bandwidth, but I didn’t want to go messing with tunables in Dragonfish until I knew for sure this was the issue.

Are there tunables specific to Dragonfish and 100GB Mellanox cards so I can get this resolved?

Many Thanks

100gbps links (because I doubt you have a tbps link) and you’re seeing 8GB/s?

Which seems like a pretty good result to me.

Or do you mean 8 gigabits per second?

Meanwhile, even getting to 10gbps takes some doing. A pair of HDs is not going to do it.

So what is your storage?

Or are you just showing iperf results?


I was hoping for much closer to 12GB/s, which would be near the max of the 100GB (100,000,000 Kbit) interfaces. Am I asking too much? 8GB/s is only about 2/3rds of line rate unless I miscalculated.
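
A quick check of that math, assuming the dashboard figure really is gigabytes (protocol overhead is ignored here, and would shave a few percent off the usable rate):

```python
# Back-of-envelope line-rate check (assumes 1 GB = 8 Gb; protocol overhead ignored).
link_gbit = 100                 # 100 GbE link speed, gigabits per second
link_gbyte = link_gbit / 8      # theoretical maximum in gigabytes per second = 12.5

print(f"Line rate: {link_gbyte} GB/s")
print(f"8 GB/s would be {8 / link_gbyte:.0%} of line rate")   # ~64%, i.e. about 2/3
print(f"8 Gb/s would be {8 / link_gbit:.0%} of line rate")    # ~8%, a very different story
```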

No, this is what Dragonfish is showing during the actual transfer. I’m not sure how to upload an image or I would post the image from the dashboard.

How are you measuring this?

I will admit, I have zero experience with 100gbit networking.

But I do believe it’s non-trivial to saturate it.

This is just what the SCALE dashboard shows, which is also what my Cisco N9Ks show.

Actually, Dragonfish is showing 8Gb/s, not 8GB/s, so I am WAY off, but maybe it’s a drive saturation issue?

Probably.

What drives do you have? What is your pool layout?

Both systems:
18 x 18TB WD HC550 7200RPM 12Gb/s SAS hard drives
2 x 9 Drive RAIDZ2 VDEVs

In order to hit 1GB/s on my 10gbit system I needed to switch my 18 hard drives to 9 mirrors.

Interesting. I think the next time I have a replication running I will run nmon and watch the drive utilization; maybe that is the bottleneck!

Thanks for the insight.
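
For anyone who wants to watch the same thing without nmon: Linux tracks per-disk busy time in /proc/diskstats (the io_ticks field), and sampling it twice gives an approximate %util, which is essentially what nmon and iostat report. A minimal sketch, assuming the pool disks show up as whole sd* devices:

```python
#!/usr/bin/env python3
# Rough per-disk %util sampler based on /proc/diskstats.
# Illustrative only; run on the sending system while a replication is active.
import time

def io_ticks():
    """Return {device: milliseconds spent doing I/O} for whole sd* disks."""
    ticks = {}
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            name = fields[2]
            if not name.startswith("sd") or name[-1].isdigit():
                continue  # skip partitions (sda1) and non-sd devices
            ticks[name] = int(fields[12])  # 13th column: io_ticks, ms spent busy
    return ticks

INTERVAL = 5  # seconds between samples
before = io_ticks()
time.sleep(INTERVAL)
after = io_ticks()

for dev in sorted(after):
    busy_ms = after[dev] - before.get(dev, after[dev])
    print(f"{dev:>6}: {100.0 * busy_ms / (INTERVAL * 1000):5.1f}% busy")
```

If the disks sit near 100% busy while the interface is well under line rate, the pool is the bottleneck, not the network.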

SAS3 is up to 12gbps per port.

But HDs typically perform at about 100-280MB/s best case, depending on whether they’re reading from the inner or outer edge.
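
Putting rough numbers on that for two 9-wide RAIDZ2 vdevs. This is only a best-case sequential ceiling; the 7-data-disks-per-vdev rule of thumb and the per-drive rates are assumptions, and real replication reads on a fragmented pool land well below it:

```python
# Best-case sequential read ceiling for 2 x 9-wide RAIDZ2 of 7200 RPM HDs.
# Assumptions: ~7 data disks per vdev contribute to streaming bandwidth
# (9 drives minus 2 parity), per-drive rates of 100-280 MB/s as above.
vdevs = 2
data_disks = 9 - 2

for per_drive_mb in (100, 280):
    pool_mb = per_drive_mb * data_disks * vdevs
    print(f"{per_drive_mb} MB/s per drive -> ~{pool_mb} MB/s "
          f"(~{pool_mb * 8 / 1000:.1f} Gb/s) pool ceiling")
```

Even the low end of that range assumes perfectly sequential reads, which snapshot replication rarely is, so an observed 8 Gb/s (about 1 GB/s) is well within what these spindles can plausibly deliver.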

Meanwhile, how is your HBA connected to your system?

Mine is PCIe Gen 2 8x, which is good for a MAX of 4GB/s.

Maybe yours is Gen 3 8x and good for 8GB/s.

8GB/s is not going to saturate 100gbit. You’d need PCIe Gen 4 or 16x Gen 3 to do that.

And more or faster storage than you have.
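
For reference, the approximate per-direction PCIe numbers behind those figures (usable throughput after link encoding overhead; treat them as ballpark values):

```python
# Approximate usable PCIe bandwidth per lane, per direction (GB/s),
# after 8b/10b (Gen2) and 128b/130b (Gen3/Gen4) encoding overhead.
per_lane_gb = {"Gen2": 0.50, "Gen3": 0.985, "Gen4": 1.969}
target_gb = 100 / 8  # 100 GbE line rate = 12.5 GB/s

for gen, lane_gb in per_lane_gb.items():
    for width in (8, 16):
        total = lane_gb * width
        verdict = "enough" if total >= target_gb else "not enough"
        print(f"{gen} x{width}: ~{total:4.1f} GB/s -> {verdict} for 100 GbE")
```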


So I am running the LSI 3008 in a Gen3 8x slot, which should give me the full 12Gb/s SAS speed according to the Broadcom website.

I did a smaller replication this morning and was monitoring my drive utilization with nmon, and all of the drives across the array hit over 80% utilization, so I am guessing you were spot on: the limitation is the drives, not the network.
I am going to check with a much larger replication later today.

Thanks for getting the units right: Bytes with a capital ‘B’ vs. bits with a small ‘b’.

I have a flash-only NAS with two 8-wide raidz2 vdevs (a mix of enterprise 3.84 TB drives), all attached to a 9305-16i. While scrubbing, TrueCommand reports a read speed of 8-9 GByte/s. You are NOT going to even approach this speed with two raidz2 vdevs of spinning drives; 8 Gbit/s looks like a reasonable mark.
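
As a rough cross-check of that scrub figure (the per-drive rate is an assumption, roughly 500-600 MB/s for SATA/SAS enterprise SSDs, and a scrub reads every drive, parity included):

```python
# Rough cross-check of an all-flash scrub rate: 16 SSDs read in parallel.
drives = 16
for per_drive_mb in (500, 600):   # assumed sequential read rate per SSD, MB/s
    print(f"{per_drive_mb} MB/s x {drives} drives ≈ "
          f"{drives * per_drive_mb / 1000:.1f} GB/s aggregate read rate")
```

That lands right around the 8-9 GByte/s that TrueCommand reports.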

I think you can mark the thread as “solved”, with a negative answer: There’s no hard limit (or not this one) in software, but your pool is very far from being capable of saturating a 100 GbE link.


Thank you and @Stux both for helping me figure out that this is an array issue and not a network issue. As I expand these units with more vdevs, I suspect the speed will increase.