Network connection dropout and WebUI not responding

Sorry, this is going to be a long one. This has been exhausting for us: we’ve been fighting with our TrueNAS instance for a while now and we’re ready to throw in the towel. Last week I did a fresh install of Core 13.0-U6.1 on a machine we had previously been using for the same purpose.

I set up the same RAIDZ1 pool and the same SMB share on our offline network. This time round I also put in a 10Gb Intel NIC and gave it a static IP. The SMB share was working, machines could see it, and I could map the network drive on the Windows server.

As this RAIDZ1 pool is purely another one of our backups, I started Macrium to create our image and save it to this pool over SMB. All was going well when I started it Friday afternoon, until I came back to check this morning and found that the backup had failed because the network connection dropped. This is exactly what we’ve been having problems with for so long.

Now I don’t know if this is due to working with large data (approx. 30TB) or something else, like a potential hardware failure. Even after a reboot the WebUI is not accessible. I’ve attached some pictures of what I was greeted with this morning. Will be getting logs a little later too.

**Update 1** - no display output, no keyboard or mouse response. Seems like it’s actually borked.

Hardware choices are unorthodox I know but it’s what we have to deal with atm:

  • Intel Core i3-12100
  • ASRock Z690 PG Riptide LGA1700
  • 32GB non-ECC memory (2400MHz)
  • Teamgroup MS30 256GB M.2 SSD
  • SeaSonic Focus 650W 80+ Platinum
  • 6x 6TB HDDs set up in RAIDZ1
  • 1Gb Intel NIC (connects to switch and uses DHCP)
  • 10Gtek X520-10G-1S-X8 (connects direct to another server, static IP)

For clarification, you are not getting any output on a monitor directly attached to the server?

Yep, no output from the monitor after multiple reboots. I believe we might have had some bad memory, as I removed 2 sticks and it booted, and the SMB share worked fine straight away. I started the backup again last night, so I will be checking soon to see if the network share is still up.
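For anyone wanting a timestamped record of when the share drops, rather than finding out the next morning, a minimal sketch along these lines might help. It only checks whether the SMB port (TCP 445) accepts a connection, which is an assumption about what "still up" means here; it does not authenticate or touch the share itself. The function names `probe` and `monitor` are made up for illustration.

```python
import socket
import time
from datetime import datetime


def probe(host: str, port: int = 445, timeout_s: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def monitor(host: str, checks: int, interval_s: float = 60.0, port: int = 445):
    """Probe the host a fixed number of times and log each result with a timestamp."""
    results = []
    for i in range(checks):
        up = probe(host, port)
        stamp = datetime.now().isoformat(timespec="seconds")
        print(f"{stamp} {host}:{port} {'up' if up else 'DOWN'}")
        results.append((stamp, up))
        if i < checks - 1:
            time.sleep(interval_s)
    return results
```

Running something like `monitor("192.168.10.2", checks=720, interval_s=60)` from the Windows server would cover a 12-hour backup window; the address is a placeholder for the NAS's static IP. Comparing the first `DOWN` timestamp against the TrueNAS logs should narrow down what happened at that moment.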

I would run MemTest86 ASAP. Also make sure to properly cool your 10G card; they can get pretty hot. Finally, are you sure you want to run 6 drives in RAIDZ1? See: Assessing the Potential for Data Loss | TrueNAS Community

Yes, we defo need to be running tests on the RAM, thank you for the recommendation.

I have already added extra fans to the system; unfortunately it’s quite a toasty room with quite a few machines in there.

Regarding our choice of RAIDZ1, we felt it was a suitable option to maximize capacity whilst retaining some redundancy.
For context, we have a processing server running 5 SSDs in RAID0. Users purely copy their work onto this, complete the work and that’s it. Once a user has completed their work, they copy and hash the data to our main server, which is a RAID50 array giving us about 90TB or so. We then have 3 backups: 1 is this TrueNAS instance, and the remaining 2 are DAS units which alternate weekly with our other office. We have also split up the data on the main server and saved it into cold storage, again offsite, just in case.
So we would like to think we have thought our backup procedure through well, but of course I am happy to listen to suggestions. Our biggest problem is that we are dealing with large volumes of data so frequently. Even now, our TrueNAS instance will run out of capacity within the next month, so it will need to be expanded.


You might want to look at larger drives then. 90TB is a big number to reach with 6TB drives.
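To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes a single RAIDZ vdev and counts only data drives times drive size, so it ignores ZFS metadata and padding overhead, the TB versus TiB difference, and the usual advice to keep some free space; real usable capacity will be lower. The helper names are hypothetical.

```python
def raidz_usable_tb(drives: int, drive_tb: float, parity: int = 1) -> float:
    """Rough usable capacity of one RAIDZ vdev: data drives times drive size."""
    if parity not in (1, 2, 3):
        raise ValueError("RAIDZ parity level must be 1, 2 or 3")
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return (drives - parity) * drive_tb


def drive_size_needed_tb(target_tb: float, drives: int, parity: int = 1) -> float:
    """Rough per-drive size needed to reach target_tb with one RAIDZ vdev."""
    if drives <= parity:
        raise ValueError("need more drives than parity devices")
    return target_tb / (drives - parity)


# The pool from this thread: 6 x 6TB in RAIDZ1.
print(raidz_usable_tb(6, 6.0))

# Per-drive size needed to hold ~90TB in a 6-wide vdev, RAIDZ1 vs RAIDZ2.
print(drive_size_needed_tb(90, 6, parity=1))
print(drive_size_needed_tb(90, 6, parity=2))
```

By this estimate the current pool tops out at about 30TB before overhead, and matching the 90TB main server in the same 6-bay layout would take drives of roughly 18TB (RAIDZ1) or 22.5TB (RAIDZ2) each.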

Yes, we are definitely going to need bigger capacity drives and/or a different chassis to accommodate additional drives.

Well, going back to our initial issue, it seems the SMB share is still up and running, so happy days for me. Bad news for the memory though, as its days are most likely now over :sweat_smile:
