Think outside the (TrueNAS) box

This is a celebration thread, so if you are looking for a problem to solve, move along. :grinning: I have almost certainly just solved a mystery that has been a low-grade irritant for months.

A TrueNAS installation which was otherwise stable would occasionally not come back from its nightly restart at 3AM. The system would boot, but it wasn’t on the network.

A user (usually me) would discover the server unresponsive and I’d have to go down to the basement to reboot it.

This happened once a week, maybe. I stupidly suspected TrueNAS or the hardware. But that wasn’t it.

I finally studied the system log starting at 3AM this morning and saw a pattern after the reboot: TrueNAS was repeatedly getting no response to its DHCP broadcasts.

What? I have no DHCP problems. That’s been working “forever”. There are all kinds of devices on my network which have no problem getting an address. What’s so special about this particular attempt to get an addr… time! TIME! That has to be it.

(think think think)

OMG. My main router, which has the local DHCP server, also reboots every night at 3AM.

The DHCP server and the TrueNAS server were racing! Sometimes the TrueNAS server came up first and asked for an address before the DHCP server was ready to answer, so it didn’t get a DHCP address in time for other parts of TrueNAS to come up correctly.

So now I’ve told this installation of TrueNAS to reboot at 4AM instead. That gives the DHCP server nearly a full hour to come back up.
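For anyone who wants to reason about the race with numbers, here is a minimal sketch in Python. Every figure in it is hypothetical (I have not measured how long the NAS waits before its first DHCP request, how long it keeps retrying, or how long the router takes to boot), and `nas_gets_lease` is just an illustrative name. It only checks whether the DHCP server is ready before the client stops asking.

```python
# All numbers below are hypothetical; measure your own hardware.
NAS_BOOT_TO_DHCP_S = 90    # assumed: seconds from NAS reboot to its first DHCP request
DHCP_RETRY_WINDOW_S = 60   # assumed: how long the NAS keeps asking before giving up


def nas_gets_lease(nas_reboot_s, router_reboot_s, router_boot_s):
    """Return True if the DHCP server is ready before the NAS stops asking.

    nas_reboot_s and router_reboot_s are seconds after 03:00;
    router_boot_s is how long the router needs before DHCP answers.
    """
    last_request = nas_reboot_s + NAS_BOOT_TO_DHCP_S + DHCP_RETRY_WINDOW_S
    server_ready = router_reboot_s + router_boot_s
    return server_ready <= last_request


if __name__ == "__main__":
    for router_boot_s in (120, 200):
        both_at_3 = nas_gets_lease(0, 0, router_boot_s)
        nas_at_4 = nas_gets_lease(3600, 0, router_boot_s)
        print(f"router boot {router_boot_s}s: "
              f"both at 03:00 -> {both_at_3}, NAS at 04:00 -> {nas_at_4}")
```

With these made-up numbers, a router that needs 120 seconds wins the race and one that needs 200 seconds loses it, which would match the "once a week, maybe" pattern if the router’s boot time varies. Moving the NAS reboot an hour later makes the outcome independent of that variation.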

Sometimes you have to think outside the (TrueNAS) box.

Glad you found it was a simple problem.

Curious as to why you reboot the TrueNAS server every night? Mine only get a reboot when I update to a newer version or patch.

Curious about the nightly reboots as well. I’m running several hundred days of uptime… I mean, kudos for figuring out a solution.

Ditto with the DHCP server…

I am no longer certain why I started the nightly reboot. However, it seems likely this was because I was having trouble with dynamic DNS without understanding it at the time.

My ISP has comparatively short DHCP leases and is likely to hand out a different address each time. My dynamic DNS was flaky, so the A record would be intermittently stale.

I misdiagnosed the problem and decided to mask it with a nightly reboot hoping that would help. It kinda did help; most of the time, the reboot would resolve a stale A record by restarting the dynamic DNS monitor.
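In hindsight, a direct check for a stale A record would have been a better diagnostic than a reboot. Here is a rough sketch in Python, assuming you already know your current public address from somewhere (how you get it is left out on purpose). The function name `a_record_is_stale`, the hostname, and the addresses are all placeholders; only `socket.gethostbyname` is a real standard-library call.

```python
import socket


def a_record_is_stale(hostname, current_ip, resolve=socket.gethostbyname):
    """Return True if hostname does not currently resolve to current_ip.

    resolve is injectable so the check can be exercised without a network.
    """
    try:
        resolved = resolve(hostname)
    except OSError:
        # A name that cannot be resolved at all is treated as stale here.
        return True
    return resolved != current_ip


if __name__ == "__main__":
    # Placeholder hostname and documentation-range address.
    print(a_record_is_stale("home.example.net", "203.0.113.7"))
```

Run periodically, something like this would have told me the record was out of date without guessing. Note that the local resolver may cache answers, so a just-updated record can still look stale for a while.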

I eventually found and fixed the dynamic DNS issue for unrelated reasons and then was left with this issue.

The downside of a nightly reboot, of course, is that it might interrupt something happening at 3AM (or, now, 4AM). But that’s fine because the only thing likely to be in progress then is a periodic automated backup of a client machine left awake overnight, and the clients will happily recover from that soon enough.

Still, I am now curious how much uptime I will get, so I will probably disable the nightly restart soon and see what happens. One thing at a time! :grinning:

That configuration detail has a longer history and I remember even less about it. Like nothing. :slight_smile: I’m about to restart off-site overnight replication, which can involve looooong sessions, so I guess I had better stop doing this and hope nothing breaks.