Installing NVIDIA grid driver in RC

I’m running TrueNAS as a Proxmox VM with an NVIDIA vGPU passed in. I can see the T4 vGPU in lspci, but the drivers installed by the RC do not support vGPU/GRID and error out.

I tried installing the grid driver manually but I get this error:
ERROR: Temporary directory /tmp is not executable - use the --tmpdir option to specify a different one.

Specifying a different tmp folder gives the same error. Developer mode had the same issue.
It worked in the BETA version. The driver I’m trying to install is: NVIDIA-Linux-x86_64-550.54.14-grid.run.

Any pointers appreciated.

Tried the same thing with the same result. I have to use 24.4 and forget about vGPU. :sleepy:

I think at this point the only recourse is to submit a ticket / feature request to add the grid version as an option, plus one legacy driver option.

I was hoping to run a patched grid version; I suspect that’s out too.

I have tried all sorts of things to work around the tmp issue.

  1. I am not convinced that the --tmpdir= option is doing anything in the installer
  2. Exporting TMPDIR to some path that is 100% executable isn’t working
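For anyone who wants to rule out the obvious case first: a minimal sketch for checking whether the filesystem behind a candidate temp directory is mounted noexec. It assumes findmnt (from util-linux) is available, and /mnt/tank/scratch is a made-up path, so substitute your own. It only looks at mount options, so it will not catch ACL or other restrictions.

```shell
#!/bin/sh
# Return 0 if a comma-separated mount-option string contains "noexec".
has_noexec() {
    case ",$1," in
        *,noexec,*) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the mount options of the filesystem that holds the given path.
# findmnt --target resolves a path to the mount it lives on.
mount_opts_for() {
    findmnt -no OPTIONS --target "$1"
}

# Report on /tmp and on a hypothetical dataset path (adjust to taste).
for dir in /tmp /mnt/tank/scratch; do
    [ -d "$dir" ] || continue
    opts=$(mount_opts_for "$dir") || continue
    if has_noexec "$opts"; then
        echo "$dir: noexec ($opts)"
    else
        echo "$dir: exec allowed ($opts)"
    fi
done
```

If a directory reports exec allowed and the installer still complains, the mount flag is probably not the culprit.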

I even tried making the root FS writable via grub; that didn’t help.

It seems to be some restriction on the root user (I suspect not related to /tmp at all) and the error message isn’t accurate, but I am out of my depth here and don’t know where to look. I don’t know if this is part of the FS still being RO (but I suspect it is).

tl;dr the two commands for dev mode r/w do not give full access - they seem to be great for userland development but don’t seem to let us do kernel-related stuff…

I understand the decision. This may push me to running TrueNAS in Proxmox instead, but I don’t think iXsystems mind; they don’t seem to really want a software business, which is an OK call.

I’m also trying to install an older NVIDIA driver. I’m not an experienced Linux admin.

The closest I’ve been able to get so far is to use sudo su before running the script. I also tried --tmpdir pointing to a folder under a dataset mount.

Here is the command line I’m running:
/bin/sh ./NVIDIA-Linux-x86_64-304.137.run --tmpdir /mnt/star-nas/star-nas-share/nvidia

And here is the result:

chmod: changing permissions of '/mnt/star-nas/star-nas-share/nvidia/makeself.VlqEMhNH': Operation not permitted
sudo: process 15077 unexpected status 0x57f
ERROR: Temporary directory /mnt/star-nas/star-nas-share/nvidia is not executable - use the  --tmpdir option to specify a different one.

It looks like the script is able to start copying files but chmod is failing.

Temp directory permissions look like this:
drwxrwx--- 2 root truenas_admin 2 Feb 26 11:06 nvidia

I also used the TrueNAS UI to give the POSIX_OPEN ACL to the dataset directory.
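Since the output above shows the chmod step failing rather than the exec itself, it may help to reproduce the three steps separately: create a file, mark it executable, run it. This is a rough sketch of that kind of probe, not the installer's actual logic; probe_exec_dir is a name I made up.

```shell
#!/bin/sh
# Probe a directory roughly the way a self-extracting installer would:
# create a file, mark it executable, then run it.
# Prints which step failed. Returns 0 only if all three succeed,
# 1 if the file cannot be created, 2 if chmod fails, 3 if exec fails.
probe_exec_dir() {
    dir=$1
    probe=$(mktemp "$dir/probe.XXXXXX" 2>/dev/null) || {
        echo "create failed in $dir"
        return 1
    }
    printf '#!/bin/sh\nexit 0\n' > "$probe"
    if ! chmod 700 "$probe" 2>/dev/null; then
        echo "chmod failed in $dir"
        rm -f "$probe"
        return 2
    fi
    if ! "$probe" 2>/dev/null; then
        echo "exec failed in $dir"
        rm -f "$probe"
        return 3
    fi
    rm -f "$probe"
    echo "ok: $dir"
}

# Usage: probe_exec_dir /path/to/candidate/tmpdir
```

Running it both as your normal user and under sudo against the same directory should show whether the failure follows the directory or the way the command is launched.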

Version: ElectricEel-24.10.2

I would look into systemd-sysext and create an image to mount that way. There are examples out there on how to do this; one is below.

https://blogs.igalia.com/berto/tag/systemd-sysext/

This is what iX is currently doing with their NVIDIA driver install, which they switched to in 24.10.1. That is why it worked in the BETA and in 24.10.0. You could/should also put a feature request in for the grid drivers as well.

systemd-sysext 
HIERARCHY EXTENSIONS SINCE                      
/opt      none       -                          
/usr      nvidia     Mon 2025-02-17 11:38:54 EST
systemd-sysext list
NAME   TYPE PATH                       TIME                       
nvidia raw  /run/extensions/nvidia.raw Wed 2025-02-12 15:52:35 EST
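To give a feel for what such an image needs, here is a minimal sketch of staging the directory tree for a sysext. It only creates the metadata file that systemd-sysext checks before merging; the driver files themselves would have to be placed under usr/ in the same tree. The function name, the extension name "demo" and the use of ID=_any are my assumptions for illustration. As far as I know, _any skips the OS/version match, which is convenient for testing but removes a safety check.

```shell
#!/bin/sh
# Stage the directory tree for a systemd-sysext image.
# Arguments: staging directory, extension name, OS ID to match.
stage_sysext() {
    stage=$1 name=$2 os_id=$3
    mkdir -p "$stage/usr/lib/extension-release.d" || return 1
    # systemd-sysext will not merge an image unless this file exists and
    # its ID= matches the host's os-release ID (or is the wildcard _any).
    printf 'ID=%s\n' "$os_id" \
        > "$stage/usr/lib/extension-release.d/extension-release.$name"
}

# Typical follow-up steps, run as root on the target (not executed here):
#   mksquashfs "$stage" /run/extensions/demo.raw
#   systemd-sysext refresh
#   systemd-sysext list
```

For example, `stage_sysext ./stage demo _any` leaves a tree that mksquashfs (from squashfs-tools) can pack into a .raw file like the nvidia.raw shown in the listing above.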

@richc did you ever try this approach? This is interesting to me, as I have been trying to install the NVIDIA drivers I choose in dev mode and have been wholly unable to - I will give it a crack if you haven’t tried.

I gave up on this. I spun up a Jellyfin docker container inside an LXC and passed in one of the Proxmox NVIDIA vGPUs for transcoding.

That example does nothing with system drivers.

I had a go today at doing a full systemd-sysext in a cloned boot partition where I enabled dev mode. I got quite far, but it seems there may be kernel protections in place to stop me running the .run as root (it’s possible to run it as truenas_admin without sudo, but not with).

Supposedly (if I believe ChatGPT) this is being blocked at kernel level…

truenas_admin@truenas[/mnt/fast/nvidia]$ sudo strace ./NVIDIA-Linux-x86_64-535.230.02-grid.run
strace: test_ptrace_get_syscall_info: PTRACE_TRACEME: Operation not permitted
strace: ptrace(PTRACE_TRACEME, ...): Operation not permitted
+++ exited with 1 +++
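One thing that might be worth checking before blaming the kernel: a process can only have one tracer at a time, so PTRACE_TRACEME would also fail like this if something else is already tracing the command. The "unexpected status 0x57f" from sudo in the earlier post decodes as a stop with signal 5 (SIGTRAP), which would be consistent with that, though I can't confirm it is what happens on TrueNAS. A small sketch for looking at the TracerPid field and the Yama setting (tracer_pid is a helper name I made up):

```shell
#!/bin/sh
# Print the PID of the process currently tracing the given PID (0 = none).
# Linux exposes this as the TracerPid field in /proc/<pid>/status.
tracer_pid() {
    awk '/^TracerPid:/ { print $2 }' "/proc/$1/status"
}

echo "tracer of this shell: $(tracer_pid $$)"

# Yama's ptrace_scope: 0-2 restrict who may attach, 3 disables ptrace.
# The file is absent when Yama is not enabled in the kernel.
if [ -r /proc/sys/kernel/yama/ptrace_scope ]; then
    echo "yama ptrace_scope: $(cat /proc/sys/kernel/yama/ptrace_scope)"
fi

# To see whether commands started through sudo are traced, compare:
#   awk '/^TracerPid:/ { print $2 }' /proc/self/status
#   sudo awk '/^TracerPid:/ { print $2 }' /proc/self/status
```

A non-zero value only in the sudo case would point at sudo's configuration rather than a kernel-level block.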

Running the same on, say, a Proxmox system works.

I don’t know if it’s worth me trying to create a sysext layer on another Debian 12 based machine: without the kernel and headers from TrueNAS, I am unclear whether that layer would ever work on TrueNAS.

tl;dr even in dev mode there seem to be a lot of other items left behind for immutability :frowning: and protections from doing certain things as root / sudo
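On that question: kernel modules carry a vermagic string, and the kernel normally refuses to load a module whose vermagic does not match the running kernel, so a layer built elsewhere would need modules compiled against the exact TrueNAS kernel release. A small sketch of that comparison follows; vermagic_matches is a hypothetical helper and the module path in the comment is only an example.

```shell
#!/bin/sh
# Return 0 if a module's vermagic string was built for the given kernel
# release. The release is the first space-separated field of vermagic.
vermagic_matches() {
    vermagic=$1 release=$2
    [ "${vermagic%% *}" = "$release" ]
}

# On the target machine, compare a built module with the running kernel:
#   vermagic_matches "$(modinfo -F vermagic ./nvidia.ko)" "$(uname -r)" \
#       && echo "module matches running kernel"

# Kernel headers for the running kernel are usually reachable here:
#   ls -d "/lib/modules/$(uname -r)/build"
```

If the build directory is missing on the TrueNAS side, the module would have to be built against matching headers obtained some other way.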