Nvidia drivers for LLM computation, not video encoding

I’m trying to run Ollama locally, but am frustrated. I’ve selected the option to load Nvidia drivers in the TrueNAS GUI, but I fear the driver that gets loaded only applies to video encoding and rendering. I need to pass through the GPU capacity of the Nvidia Tesla P4 installed in my machine to handle Ollama and Open WebUI workloads (both are running in Docker containers). When I check the status of the installed GPU with `sudo watch -n 0.5 nvidia-smi`, the resource is idle/off:

```
|===============================+======================+======================|
|   0  Tesla P4             Off | 00000000:03:00.0 Off |                    0 |
| N/A   41C    P0    22W /  75W |      0MiB /  7680MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
```
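For context, this is roughly how I’m starting the Ollama container. As I understand Nvidia’s Docker documentation, once the container runtime is working the GPU is requested with `--gpus all`; the volume, port, and image here are just the defaults from Ollama’s own docs, not necessarily what TrueNAS generates:

```bash
# Sketch: how the container should request the GPU once the NVIDIA
# container runtime is configured. Volume/port/name are Ollama defaults.
docker run -d --gpus all \
  -v ollama:/root/.ollama \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama
```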

I thought perhaps the card was not supported, but it reports a compute capability of 6.1, just over what is required to run Ollama.
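In case it helps anyone reproduce the check, this is roughly how I confirmed it (if I remember right, the `compute_cap` query field needs a fairly recent `nvidia-smi`, so that part is an assumption on my end):

```bash
# Query the card's name and compute capability; the Tesla P4 should
# report 6.1. (compute_cap may be missing on older driver builds.)
nvidia-smi --query-gpu=name,compute_cap --format=csv
```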

Thinking I just needed to update the drivers, I tried to execute this command from Nvidia’s web site, only for it to be denied:

```bash
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
  sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
```
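For reference, on a stock Debian/Ubuntu box I’d expect the repo setup above to be followed by something like the steps below, per Nvidia’s container toolkit install docs. Whether TrueNAS permits `apt` on the base OS at all may be exactly the problem I’m hitting:

```bash
# Standard follow-up from NVIDIA's container-toolkit install guide;
# assumes apt is usable on the host, which TrueNAS may not allow.
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
```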

But generally, perhaps the better questions are: does (or will) TrueNAS support the Nvidia container toolkit for running LLMs like Ollama and the like? If this support is already there, why doesn’t it work? If it’s supposed to work, how does one fix it, or will it be fixed in later releases of EE?

Thanks!
