Deactivated SMB service

Hi all.

I need your help, because I’m at my wits’ end.
I currently have version 24.10.2.2 installed, but I already had the problem with 24.10.2.1. As far as I remember, it never happened with 24.04.2.5.
TrueNAS is joined in a Windows AD domain.
For no apparent reason, and at ever changing and seemingly random times, the SMB service turns itself off, and users can no longer access the network share.
Three DNS servers are configured, and all of them are online when the problem happens.
I would like some advice on troubleshooting, finding the cause, and resolving. It happens at least once/week now, and users are furious. If I don’t fix it quickly, I’m afraid I’ll have to find an alternative solution.
In the meantime, I will leave you a copy/paste of /var/log/syslog

Jun 21 09:30:33 xxxxxxxx systemd[1]: Stopping smbd.service - Samba SMB Daemon…
Jun 21 09:30:33 xxxxxxxx systemd[1]: smbd.service: Deactivated successfully.
Jun 21 09:30:33 xxxxxxxx systemd[1]: Stopped smbd.service - Samba SMB Daemon.
Jun 21 09:30:33 xxxxxxxx systemd[1]: smbd.service: Consumed 1h 41min 24.167s CPU time.
Jun 21 09:30:33 xxxxxxxx systemd[1]: Stopping winbind.service - Samba Winbind Daemon…
Jun 21 09:30:33 xxxxxxxx systemd[1]: winbind.service: Deactivated successfully.
Jun 21 09:30:33 xxxxxxxx systemd[1]: Stopped winbind.service - Samba Winbind Daemon.
Jun 21 09:30:33 xxxxxxxx systemd[1]: winbind.service: Consumed 1min 19.800s CPU time.
Jun 21 09:30:33 xxxxxxxx systemd[1]: Starting winbind.service - Samba Winbind Daemon…
Jun 21 09:30:33 xxxxxxxx systemd[1]: Started winbind.service - Samba Winbind Daemon.

Thank you very much!

Can you provide your system specs, please?

Of course:
Dell PowerEdge R730
CPU: Intel(R) Xeon(R) CPU E5-2603 v4
RAM: 32GB ECC
Pool: 8x SAS 1TB HDD RAIDZ2
Network: Quad port 1Gb ethernet, connected via LACP bond (2 ports used) to a Cisco 9200L (port channel active ports)

What functionality are you using? SMB, NFS, VMs, Apps etc?

Can you think of any non-defaults you’ve applied at any point?

PS: do you have a large AD, and if so, have you disabled the cache when joining?

I use SMB and NFS, no Apps. AD cache is not disabled.
I would just like some suggestions for finding the logs, which would allow me to troubleshoot the reason for SMB deactivation.
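
Not a fix, but as a starting point for the log hunt, here is a minimal sketch that pulls out the exact moments systemd began stopping smbd, so the surrounding journal entries can be checked for whatever asked for the stop. It assumes the syslog line format pasted above; smbd_stop_times is a made-up helper name, and the timestamps in the usage comment are only an example.

```shell
#!/bin/bash
# Sketch: list the timestamps at which systemd started stopping smbd.
# Assumes classic syslog lines such as:
#   Jun 21 09:30:33 host systemd[1]: Stopping smbd.service - Samba SMB Daemon...
# smbd_stop_times is a hypothetical helper name, not a TrueNAS tool.
smbd_stop_times() {
    # $1 is the log file to scan, e.g. /var/log/syslog
    grep -E 'systemd\[[0-9]+\]: Stopping smbd\.service' "$1" |
        awk '{print $1, $2, $3}'
}

# Usage on the NAS:
#   smbd_stop_times /var/log/syslog
# Then read everything the journal recorded around one of those times,
# without a unit filter (example window only):
#   journalctl --since "2025-06-21 09:25:00" --until "2025-06-21 09:35:00"
```

Reading the journal without a unit filter is deliberate here: the stop request comes from outside smbd, so the interesting entry is likely to belong to some other process.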

Perhaps this issue could be linked? [KRB5KDC_ERR_PREAUTH_FAILED] Errors on AD quite often - #15 by Johnny_Fartpants

I’d say no…

I just signed up to say the exact same thing!

I just upgraded to 24.10 and the problem started immediately. Around 5:15 AM (sometimes at a more random time), my log looks exactly like yours. It simply stops smbd.service and winbind.service. It then starts winbind.service again automatically, but not smbd.service.

I have checked every log, journalctl, etc. I have not been able to find out what is causing or signalling smbd.service to shut down. It is not crashing, and I can confirm that from the logs.

At first I added a systemd override for smbd.service so that it always restarts:

[Service]
Restart=always
RestartSec=10

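But this does not restart the service, because it is not crashing: the log shows systemd stopping the unit deliberately, and Restart= does not apply when a unit is stopped through systemd itself.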
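
To double-check that the unit really was stopped cleanly rather than failing, something like the sketch below can classify the Result property that `systemctl show` reports for a service. stop_kind is a hypothetical helper name; it only reads "Key=Value" text from standard input, so it makes no claim about who requested the stop.

```shell
#!/bin/bash
# Sketch: classify a unit's last result from `systemctl show` output.
# Reads "Key=Value" lines on stdin and looks only at the Result= line.
# stop_kind is a hypothetical helper name.
stop_kind() {
    result=$(sed -n 's/^Result=//p' | head -n 1)
    case "$result" in
        success) echo "clean" ;;            # no failure recorded
        "")      echo "unknown" ;;          # no Result= line in the input
        *)       echo "failed:$result" ;;   # e.g. signal, exit-code, timeout
    esac
}

# Usage on the NAS:
#   systemctl show smbd -p Result -p ActiveState | stop_kind
```

If this prints "clean" while the unit is inactive, that is consistent with the "Deactivated successfully" line in the log and with the Restart= setting never firing.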

What works as a temporary workaround is a cron job that checks smbd.service every minute and starts it if it is not running.

sudo nano /root/bin/restart_smbd_if_stopped.sh

The script:

#!/bin/bash
if ! systemctl is-active --quiet smbd; then
    systemctl start smbd
fi

Make the script executable:

sudo chmod 755 /root/bin/restart_smbd_if_stopped.sh

Then add this line to /etc/crontab:

sudo nano /etc/crontab
* * * * * root /root/bin/restart_smbd_if_stopped.sh

It works for now; I have just checked.
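
A possible refinement of the watchdog above, as a rough sketch: it also writes a line to the system log each time it has to start the service, which gives a record of exactly when the outages happen. The SYSTEMCTL and LOGGER variables and the smbd-watchdog tag are my own additions (they exist only so the commands can be swapped out when testing), and the unit name is passed as the first argument.

```shell
#!/bin/bash
# Watchdog sketch: start a unit if it is not active and log that it happened.
# SYSTEMCTL and LOGGER are hypothetical override hooks, not TrueNAS settings.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"
LOGGER="${LOGGER:-logger}"

ensure_running() {
    unit=$1
    # Nothing to do while the unit is active.
    if "$SYSTEMCTL" is-active --quiet "$unit"; then
        return 0
    fi
    # Leave a trace in the system log, then start the unit.
    "$LOGGER" -t smbd-watchdog "$unit was not active, starting it"
    "$SYSTEMCTL" start "$unit"
}

# Only act when a unit name is given on the command line.
if [ "$#" -gt 0 ]; then
    ensure_running "$1"
fi
```

With this version the crontab line would need the unit name appended, for example `* * * * * root /root/bin/restart_smbd_if_stopped.sh smbd`, and `journalctl -t smbd-watchdog` should then list every time the watchdog had to step in.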

I should also mention that I have a complicated setup. I have TrueNAS installed as a Proxmox VM with the SAS card passed through directly. I also run Windows Active Directory like you, but as a VM on the same Proxmox host. The Proxmox host is absolutely fine, and it is scheduled to automatically back up only the AD VM, not TrueNAS.

My theory was that the backup caused it to lose AD connectivity, stop the smbd.service and it does not auto start, even though I have auto start enabled in the TrueNAS GUI.

I don’t know about your setup and if the AD auto restarts or something but now I am guessing that is not the case. And my setup worked 100% rock solid for over 1.5 years before upgrading to TrueNAS 24.10 just a few days ago. So, something is definitely wrong in the 24.10 code.