TrueNAS SCALE Dragonfish - NVIDIA Quadro P600 not detected by nvidia-smi

I’m struggling to get the NVIDIA Quadro P600 graphics card detected by the NVIDIA driver. The device shows up in lspci, but nvidia-smi doesn’t list it.

The card is supported by the driver (see Linux x64 (AMD64/EM64T) Display Driver | 550.76 | Linux 64-bit | NVIDIA).

The machine is a ProLiant MicroServer Gen10:

OS Version: TrueNAS-SCALE-24.04.0
Product: ProLiant MicroServer Gen10
Model: AMD Opteron(tm) X3418 APU
Memory: 15 GiB
root@truenas[~]# nvidia-smi
No devices were found
03:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P600] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: NVIDIA Corporation GP107GL [Quadro P600]
        Physical Slot: 1
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 44
        IOMMU group: 2
        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at fe0000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at ff0000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at c000 [size=128]
        Expansion ROM at fe000000 [virtual] [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000  Data: 0000
        Capabilities: [78] Express (v2) Legacy Endpoint, MSI 00
                DevCap: MaxPayload 256 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: CorrErr+ NonFatalErr+ FatalErr+ UnsupReq+
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop-
                        MaxPayload 256 bytes, MaxReadReq 512 bytes
                DevSta: CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 8GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <512ns, L1 <4us
                        ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp+
                LnkCtl: ASPM L0s L1 Enabled; RCB 64 bytes, Disabled- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s (downgraded), Width x8 (downgraded)
                        TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Range AB, TimeoutDis+ NROPrPrP- LTR+
                         10BitTagComp- 10BitTagReq- OBFF Via message, ExtFmt- EETLPPrefix-
                         EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
                         FRS-
                         AtomicOpsCap: 32bit- 64bit- 128bitCAS-
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- 10BitTagReq- OBFF Disabled,
                         AtomicOpsCtl: ReqEn-
                LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
                LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
                LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete+ EqualizationPhase1+
                         EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
                         Retimer- 2Retimers- CrosslinkRes: unsupported
        Capabilities: [100 v1] Virtual Channel
                Caps:   LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb:    Fixed- WRR32- WRR64- WRR128-
                Ctrl:   ArbSelect=Fixed
                Status: InProgress-
                VC0:    Caps:   PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb:    Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl:   Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [250 v1] Latency Tolerance Reporting
                Max snoop latency: 0ns
                Max no snoop latency: 0ns
        Capabilities: [128 v1] Power Budgeting <?>
        Capabilities: [420 v2] Advanced Error Reporting
                UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
                UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
                CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
                AERCap: First Error Pointer: 00, ECRCGenCap- ECRCGenEn- ECRCChkCap- ECRCChkEn-
                        MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
                HeaderLog: 00000000 00000000 00000000 00000000
        Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Capabilities: [900 v1] Secondary PCI Express
                LnkCtl3: LnkEquIntrruptEn- PerformEqu-
                LaneErrStat: 0
        Kernel driver in use: nvidia
        Kernel modules: nouveau, nvidia_current_drm, nvidia_current
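
One detail in the lspci output above is that LnkSta reports a lower speed and width than LnkCap. A minimal sketch, assuming lspci -vv style text on stdin, that pulls out just those two values for comparison. The function name pcie_link_summary is made up for this example. A reduced speed may simply reflect link power management while the card is idle, and the width may be limited by the slot, so treat the result as a hint rather than a diagnosis.

```shell
# Print the advertised (LnkCap) and negotiated (LnkSta) PCIe link
# speed and width from "lspci -vv" style text read on stdin.
# pcie_link_summary is a hypothetical helper name, not an existing tool.
pcie_link_summary() {
    sed -nE \
        -e 's/.*LnkCap:.*Speed ([0-9.]+GT\/s), Width (x[0-9]+).*/capable: \1 \2/p' \
        -e 's/.*LnkSta:.*Speed ([0-9.]+GT\/s)[^,]*, Width (x[0-9]+).*/current: \1 \2/p'
}

# Usage on a live system (full capability output usually needs root):
#   lspci -vv -s 03:00.0 | pcie_link_summary
```
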
root@truenas[~]# midclt call system.advanced.config | jq
{
  "id": 1,
  "consolemenu": true,
  "serialconsole": false,
  "serialport": "ttyS0",
  "serialspeed": "9600",
  "powerdaemon": true,
  "swapondrive": 8,
  "overprovision": 0,
  "traceback": true,
  "advancedmode": false,
  "autotune": true,
  "debugkernel": false,
  "uploadcrash": true,
  "anonstats": true,
  "anonstats_token": "",
  "motd": "Welcome to FreeNAS",
  "boot_scrub": 7,
  "fqdn_syslog": false,
  "sed_user": "USER",
  "sysloglevel": "F_WARNING",
  "syslogserver": "",
  "syslog_transport": "UDP",
  "syslog_audit": false,
  "kdump_enabled": false,
  "isolated_gpu_pci_ids": [],
  "kernel_extra_options": "pci=realloc=off",
  "syslog_tls_certificate": null,
  "syslog_tls_certificate_authority": null,
  "consolemsg": false
}

I found some more useful info. It looks like the device fails to initialize properly:

[Apr24 15:35] kube-bridge: port 9(veth459fd26b) entered blocking state
[  +0.000976] kube-bridge: port 9(veth459fd26b) entered forwarding state
[  +8.473456] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.007556] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +0.274182] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.006436] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +8.687074] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.005804] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +0.252903] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.006317] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +0.504299] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.005503] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +0.334823] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.005979] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +0.392519] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.007475] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
[  +0.232436] NVRM: GPU 0000:03:00.0: RmInitAdapter failed! (0x24:0x72:1436)
[  +0.005512] NVRM: GPU 0000:03:00.0: rm_init_adapter failed, device minor number 0
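
The same RmInitAdapter line repeats many times in the log above. A small sketch, assuming kernel log text on stdin in the format shown, that collapses the repeats into one line per GPU address and error code with a count. The function name summarize_nvrm_failures is hypothetical and only for illustration.

```shell
# Summarize NVRM init failures from kernel log text read on stdin.
# Prints "<pci-address> <error-code> <count>" for each distinct pair.
# summarize_nvrm_failures is a hypothetical helper name.
summarize_nvrm_failures() {
    grep 'RmInitAdapter failed!' \
        | sed -E 's/.*NVRM: GPU ([0-9a-fA-F:.]+): RmInitAdapter failed! \(([^)]*)\).*/\1 \2/' \
        | sort | uniq -c | awk '{print $2, $3, $1}'
}

# Usage on a live system (reading the kernel log may need root):
#   dmesg | summarize_nvrm_failures
```

This makes it easier to tell whether the failure code stays the same after changing a kernel option or driver version.
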
root@truenas[~]# cat /proc/driver/nvidia/gpus/0000:03:00.0/information
Model:           Quadro P600
IRQ:             45
GPU UUID:        GPU-????????-????-????-????-????????????
Video BIOS:      ??.??.??.??.??
Bus Type:        PCIe
DMA Size:        47 bits
DMA Mask:        0x7fffffffffff
Bus Location:    0000:03:00.0
Device Minor:    0
GPU Excluded:    No
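
In the output above, the GPU UUID and Video BIOS fields contain only question marks. My assumption is that these are placeholders the driver prints when it could not read the values from the card, which would be consistent with the init failure. A rough sketch that checks for those placeholders in text read on stdin; gpu_info_state is a made-up name for this example.

```shell
# Report whether the "GPU UUID" or "Video BIOS" fields in
# /proc/driver/nvidia/gpus/<addr>/information style text (on stdin)
# still contain "??" placeholders.
# gpu_info_state is a hypothetical helper name.
gpu_info_state() {
    if grep -Eq '^(GPU UUID|Video BIOS):.*\?\?'; then
        echo "placeholders present"
    else
        echo "fields populated"
    fi
}

# Usage on a live system:
#   gpu_info_state < /proc/driver/nvidia/gpus/0000:03:00.0/information
```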

Have you checked whether the P600 is supported by the upgraded NVIDIA driver?

Yeah, it’s in the list

Quadro Series:

Quadro GV100, Quadro GP100, Quadro P6000, Quadro P5200, Quadro P5000, Quadro P4000, Quadro P2200, Quadro P2000, Quadro P1000, Quadro P620, Quadro P600, Quadro P400, Quadro M6000 24GB, Quadro M6000, Quadro M5000, Quadro M4000, Quadro M2000, Quadro K2200, Quadro K1200, Quadro K620
