Dell 20NJD Mellanox Card Killing Other Network Ports?

Recently acquired some Dell 20NJD Mellanox cards, and once installed they render the other existing (and working) cards useless. I also can’t reach the UI from the web interface, but can see everything up and running fine in the shell from IPMI. Once the card is removed from the system, everything goes back to being fine.

System Info:

SuperMicro X11DPH-T
2x Xeon Gold 6126 CPU

Installed Network (working) card:
Mellanox ConnectX-4 MCX415A-CCAT CX415A - Connected via 25gbe breakout cable to USW-EnterpriseXG-24

New Cards:
Dell 20NJD Mellanox - currently flashed with generic ConnectX4 14.32.1010-MCX4121A-ACA firmware

The 100GbE card has been working fine as is. I acquired some of the Dell 20NJD cards and installed them as-is (with Dell firmware). They showed up in the system, but broke the 100GbE card’s connection (the switch showed it as a disconnected cable, and all GUIs were unreachable). I also have an onboard 10GbE NIC plugged into the switch (on a different subnet) and that GUI is not reachable either, so it looks like the new card is knocking all the other NICs out. I figured it might be a weird IP issue, so I removed the IP config on those interfaces and set static routes to new subnets on the Dell card, but the 100GbE card still wasn’t connecting.

Steps I’ve Tried:
-Flashed the Dell card with corresponding Mellanox firmware based on some data from other users/websites - still the same issue.

-The card works fine in a Windows system and Linux system - presenting green lights on both ports.

-Checked the BIOS when the new card was installed and didn’t see anything abnormal in there…also not 100% sure what I would be looking for there.

-Tried different PCI slots (different slots go to different CPUs, so I tested the few I had open for each)

Any other thoughts or ideas what I can tinker with?

Had the same with my card. Did you look at the card settings when booting? You can get into the card settings with Ctrl+B.

I had to allow ARP to get my card working.
That was on an HP ML360 Gen 10.

Maybe that works for you too.

What do you mean allow ARP? What kind of insane scenario would have ARP disabled? I hate that that is somehow an option, oh my god! How the hell does IP even work without ARP!? Are you supposed to manually feed the driver a MAC table!? And how is the driver or firmware even disabling ARP, are they silently inspecting all traffic and dropping ARP packets instead of passing them to the kernel?

God, the more I read about Mellanox hardware, the better I understand why some of it is so damn cheap on the used market.

Well, in that case the good news is that the card is good. The bad news is that the Mellanox driver in FreeBSD is sucky. It’s possible that it’s improved in the upcoming TrueNAS 13.3, you can give the beta a shot and see what happens.
Alternatively, you can switch to Scale, where the card is expected to work well.

Spent a few hours scratching my head and trying a few other things, then called a real support person. Turns out the new card was taking over my old card’s NIC info at the same time as creating new interfaces. We only noticed it after looking at the MAC addresses and realizing the new card had taken over the old interface names, which messed up all the routing. It somehow wiped out the static routes as well, so the GUI wasn’t accessible. So wacky, but thankful that was all it was! Sometimes we forget how bad the routing is…then other times I’m reminded of it (and my lack of understanding of it!).
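
For anyone hitting the same thing, here is a rough sketch of the MAC-address check that exposed the problem: save the `ifconfig` output before installing the new card, grab it again afterwards, and see which MAC addresses ended up under a different interface name. The interface names (`mce0` etc.) and MAC addresses in the sample text are made up for illustration and are not taken from the system above. The parser only assumes the usual `name: flags=` and `ether xx:xx:xx:xx:xx:xx` line layout.

```python
import re

# Interface header line, e.g. "mce0: flags=8843<UP,...> metric 0 mtu 1500"
IFACE_RE = re.compile(r"^(\S+?):\s+flags=")
# MAC address line, e.g. "\tether 0c:42:a1:aa:aa:01"
ETHER_RE = re.compile(r"^\s+ether\s+([0-9a-fA-F:]{17})")


def parse_ifconfig(text):
    """Return {interface_name: mac} parsed from ifconfig-style output."""
    macs = {}
    current = None
    for line in text.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
            continue
        ether = ETHER_RE.match(line)
        if ether and current:
            macs[current] = ether.group(1).lower()
    return macs


def renamed_interfaces(before, after):
    """Return {mac: (old_name, new_name)} for MACs whose interface name changed."""
    old_name_by_mac = {mac: name for name, mac in before.items()}
    changes = {}
    for name, mac in after.items():
        old_name = old_name_by_mac.get(mac)
        if old_name is not None and old_name != name:
            changes[mac] = (old_name, name)
    return changes


# Hypothetical sample output (names and MACs invented for this example).
BEFORE = (
    "mce0: flags=8843<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
    "\tether 0c:42:a1:aa:aa:01\n"
)
AFTER = (
    "mce0: flags=8843<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
    "\tether 0c:42:a1:bb:bb:01\n"
    "mce1: flags=8843<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
    "\tether 0c:42:a1:bb:bb:02\n"
    "mce2: flags=8843<UP,BROADCAST,RUNNING> metric 0 mtu 1500\n"
    "\tether 0c:42:a1:aa:aa:01\n"
)

if __name__ == "__main__":
    changes = renamed_interfaces(parse_ifconfig(BEFORE), parse_ifconfig(AFTER))
    for mac, (old, new) in changes.items():
        print(f"{mac}: {old} -> {new}")
```

In the made-up sample, the original card's MAC moves from `mce0` to `mce2`, so any IP config or static route tied to `mce0` would now be pointing at the new card.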

But yes - cross-grading to Scale is happening really soon. I run one vital app on my server that I need to move to either a VM or Docker for Scale, then I’ll be all good and will just be mad at myself for not doing it sooner :rofl::joy: