Fresh install of 24.10 with an Nvidia T600 card; the Nvidia driver is installed.
Everything looks fine at first glance: I can run “nvidia-smi” (both in the Docker container and in the TrueNAS shell), and Tdarr itself reports “Transcode GPU” (instead of “Transcode CPU”).
But transcoding is extremely slow, and Netdata (with the nvidia-smi plugin) shows 0% Nvidia GPU utilization, so the GPU is clearly not being used.
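The next thing I had in mind is a short NVENC encode run directly inside the container, bypassing Tdarr entirely (a rough sketch; the container name, the ffmpeg path inside the image, and the sample file are just placeholders from my setup):

# Short hardware encode inside the Tdarr container to see whether NVENC works at all.
# The bundled ffmpeg may live under a different name/path in the Tdarr image.
docker exec tdarr ffmpeg -y -i /media/sample.mkv -t 30 \
  -c:v h264_nvenc -c:a copy /temp/nvenc_test.mkv

# Watch GPU utilization from the TrueNAS shell while the encode runs
watch -n 2 nvidia-smi

If that encode also leaves the GPU at 0%, the problem is presumably in the container/driver setup rather than in Tdarr itself.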
Any idea how I can debug this?
For reference, here is my working Tdarr stack:
networks:
  main:
    name: main
    external: true

services:
  tdarr:
    hostname: tdarr
    image: ghcr.io/haveagitgat/tdarr:${VERSION}
    container_name: tdarr
    user: 0:5000
    expose:
      - 8265
      - 8266
    restart: unless-stopped
    environment:
      - TZ=${TZ}
      - PUID=5000
      - PGID=5000
      - UMASK_SET=002
      - serverIP=0.0.0.0
      - serverPort=8266
      - webUIPort=8265
      - internalNode=false
      - inContainer=true
      - ffmpegVersion=6
      - nodeName=MyInternalNode
    networks:
      - main
    labels:
      - traefik.enable=true
      - traefik.http.routers.tdarr.entrypoints=websecure
      - traefik.http.routers.tdarr.rule=Host(`${TDARR_DN}`)
      - traefik.http.routers.tdarr.tls=true
      - traefik.http.services.tdarr.loadbalancer.server.port=8265
      - traefik.http.routers.tdarr.middlewares=authelia@docker
    deploy:
      resources:
        limits:
          cpus: "8"
          memory: 8G
    volumes:
      - ${TDARR_SERVER}:/app/server
      - ${TDARR_CONFIG}:/app/configs
      - ${TDARR_LOGS}:/app/logs
      - ${MEDIA}:/media
      - ${TDARR_CACHE}:/temp
      - ${TDARR_ALTERNATE_LIBRARY}:/alternate_library

  tdarr-cpu-node:
    image: ghcr.io/haveagitgat/tdarr_node:${VERSION}
    container_name: tdarr-cpu-node
    restart: unless-stopped
    network_mode: service:tdarr
    environment:
      - TZ=${TZ}
      - PUID=5000
      - PGID=5000
      - UMASK_SET=002
      - nodeName=CPU
      - serverIP=0.0.0.0
      - serverPort=8266
      - inContainer=true
      - ffmpegVersion=6
    volumes:
      - ${NODE_CPU_CONFIG}:/app/configs
      - ${NODE_CPU_LOGS}:/app/logs
      - ${MEDIA}:/media
      - ${TDARR_CACHE}:/temp
      - ${TDARR_ALTERNATE_LIBRARY}:/alternate_library
    deploy:
      resources:
        limits:
          cpus: "8"
          memory: 8G

  tdarr-gpu-node:
    image: ghcr.io/haveagitgat/tdarr_node:${VERSION}
    container_name: tdarr-gpu-node
    runtime: nvidia
    restart: unless-stopped
    network_mode: service:tdarr
    environment:
      - TZ=${TZ}
      - PUID=5000
      - PGID=5000
      - UMASK_SET=002
      - nodeName=GPU
      - serverIP=0.0.0.0
      - serverPort=8266
      - inContainer=true
      - ffmpegVersion=6
      - NVIDIA_VISIBLE_DEVICES=${NVIDIA_2060}
    volumes:
      - ${NODE_GPU_CONFIG}:/app/configs
      - ${NODE_GPU_LOGS}:/app/logs
      - ${MEDIA}:/media
      - ${TDARR_CACHE}:/temp
      - ${TDARR_ALTERNATE_LIBRARY}:/alternate_library
    deploy:
      resources:
        limits:
          cpus: "8"
          memory: 8G
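For clarity, ${NVIDIA_2060} comes from my .env file and holds the identifier of the card that node should use; NVIDIA_VISIBLE_DEVICES accepts a device index or a GPU UUID, which you can look up on the host roughly like this (the UUID below is a placeholder):

# List the cards and their UUIDs on the host
nvidia-smi -L
# e.g. GPU 0: NVIDIA GeForce RTX 2060 (UUID: GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)

# then in the .env file:
# NVIDIA_2060=GPU-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx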
I have found Tdarr needs
  runtime: nvidia
and
  environment:
    - NVIDIA_VISIBLE_DEVICES=all
If you are just bringing up the base Tdarr program with no separate nodes, something like this works:
networks:
  main:
    name: main
    external: true

services:
  tdarr:
    hostname: tdarr
    image: ghcr.io/haveagitgat/tdarr:${VERSION}
    container_name: tdarr
    runtime: nvidia
    user: 0:5000
    expose:
      - 8265
      - 8266
    restart: unless-stopped
    environment:
      - TZ=${TZ}
      - PUID=5000
      - PGID=5000
      - UMASK_SET=002
      - serverIP=0.0.0.0
      - serverPort=8266
      - webUIPort=8265
      - internalNode=true
      - inContainer=true
      - ffmpegVersion=6
      - nodeName=MyInternalNode
      - NVIDIA_VISIBLE_DEVICES=all
    networks:
      - main
    volumes:
      - ${TDARR_SERVER}:/app/server
      - ${TDARR_CONFIG}:/app/configs
      - ${TDARR_LOGS}:/app/logs
      - ${MEDIA}:/media
      - ${TDARR_CACHE}:/temp
Bear in mind you would need to clean up the network, user, and paths for your use case.
Also check inside the Tdarr container that your GPU shows up with nvidia-smi; it should be there. That is how I juggle a multi-Nvidia-card system, by assigning each app a specific card. Also use a Tdarr plugin that is Nvidia-only, so it will never fall back to CPU transcoding.
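To be concrete, this is the kind of check I mean (swap in your own container name):

# Confirm the container sees (only) the card you assigned to it
docker exec tdarr nvidia-smi -L

# Confirm the bundled ffmpeg actually exposes the NVENC encoders
# (the ffmpeg name/path inside the Tdarr image may differ)
docker exec tdarr ffmpeg -hide_banner -encoders | grep nvenc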
Will check my yaml files, thanks!