I am trying to get NVIDIA acceleration working with Frigate. When I enable NVIDIA GPU support, the app fails to start with this scheduling error:
0/1 nodes are available: 1 Insufficient nvidia.com/gpu. preemption: 0/1 nodes are available: 1 No preemption victims found for incoming pod..
Digging further, the logs from the nvidia-device-plugin-daemonset-gbdcp pod show that it starts, registers with the kubelet, and is then terminated about a minute later:
2024/05/09 01:12:13 Starting FS watcher.
2024/05/09 01:12:13 Starting OS watcher.
2024/05/09 01:12:13 Starting Plugins.
2024/05/09 01:12:13 Loading configuration.
2024/05/09 01:12:13 Updating config with default resource matching patterns.
2024/05/09 01:12:13
Running with config:
{
  "version": "v1",
  "flags": {
    "migStrategy": "none",
    "failOnInitError": true,
    "nvidiaDriverRoot": "/",
    "gdsEnabled": false,
    "mofedEnabled": false,
    "plugin": {
      "passDeviceSpecs": false,
      "deviceListStrategy": "envvar",
      "deviceIDStrategy": "uuid"
    }
  },
  "resources": {
    "gpus": [
      {
        "pattern": "*",
        "name": "nvidia.com/gpu"
      }
    ]
  },
  "sharing": {
    "timeSlicing": {
      "resources": [
        {
          "name": "nvidia.com/gpu",
          "devices": "all",
          "replicas": 5
        }
      ]
    }
  }
}
2024/05/09 01:12:13 Retreiving plugins.
2024/05/09 01:12:13 Detected NVML platform: found NVML library
2024/05/09 01:12:13 Detected non-Tegra platform: /sys/devices/soc0/family file not found
2024/05/09 01:12:14 Starting GRPC server for 'nvidia.com/gpu'
2024/05/09 01:12:14 Starting to serve 'nvidia.com/gpu' on /var/lib/kubelet/device-plugins/nvidia-gpu.sock
2024/05/09 01:12:14 Registered device plugin for 'nvidia.com/gpu' with Kubelet
2024/05/09 01:13:22 Received signal "terminated", shutting down.
2024/05/09 01:13:22 Stopping plugins.
2024/05/09 01:13:22 Stopping to serve 'nvidia.com/gpu' on /var/lib/kubelet/device-plugins/nvidia-gpu.sock
I am new to TrueNAS. NVIDIA support works with Plex, but I can’t get two different apps to use the GPU at the same time. I know my card can handle 4 streams at once because I had that working in Unraid, so I am a bit stumped.
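
For reference, here is a minimal sketch of how I understand the time-slicing numbers in the config above. It assumes the device plugin advertises each physical GPU as `replicas` schedulable `nvidia.com/gpu` units while it is running, and that there is a single physical card. The function names and the per-app request counts are hypothetical and only illustrate the arithmetic; they do not come from TrueNAS or the plugin.

```python
import json

# Time-slicing section copied from the plugin log above.
PLUGIN_CONFIG = json.loads("""
{
  "sharing": {
    "timeSlicing": {
      "resources": [
        {"name": "nvidia.com/gpu", "devices": "all", "replicas": 5}
      ]
    }
  }
}
""")


def advertised_units(config, physical_gpus, resource="nvidia.com/gpu"):
    """Return how many schedulable units the node should advertise.

    Assumption: each physical GPU is exposed `replicas` times when a
    time-slicing entry exists for the resource, and once otherwise.
    """
    for entry in config["sharing"]["timeSlicing"]["resources"]:
        if entry["name"] == resource:
            return physical_gpus * entry["replicas"]
    return physical_gpus


def fits(config, physical_gpus, requests):
    """Check whether the summed per-app GPU requests fit on the node."""
    return sum(requests.values()) <= advertised_units(config, physical_gpus)


# Hypothetical requests: one GPU unit each for Plex and Frigate.
requests = {"plex": 1, "frigate": 1}
print(advertised_units(PLUGIN_CONFIG, 1))
print(fits(PLUGIN_CONFIG, 1, requests))
```

By this arithmetic two apps should fit with room to spare, so the "Insufficient nvidia.com/gpu" message may simply mean the node is advertising no GPU units at all once the plugin pod has been terminated.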