Network freeze after upgrade to 25.04

Happening every few hours

The network card in use is this one, on a 1 Gb network:

02:00.0 Ethernet controller: Intel Corporation Ethernet Controller I226-V (rev 04)

Are there drivers I need to add?

The board also has this 10g nic

04:00.0 Ethernet controller: Aquantia Corp. AQC113C NBase-T/IEEE 802.3an Ethernet Controller [Marvell Scalable mGig] (rev 03)

Which I would move to except every time I try to create a bridge it locks me out and I have to reset the config.

The machine is in light use, with just Proxmox Backup Server as an LXC and 2 NFS shares for media (Plex and Jellyfin).

I experienced the same thing. Due to all the VM issues, I wanted to revert to 24.10. But I had to back up some containers running in my VM just to be sure. It took me several reboots to complete the backups and download them. I’m glad I reverted.

For now, if I could just find a guide to successfully switch the cable over to the Marvell 10G NIC to see if that works, it would be a great help.

I’m new to Truenas scale and it’s been a dream to work with except this nic swapping stuff which seems arcane TBH.

Moved the cable to the 10GB Marvell nic and the problem persists.

Network freezes after between 2 and 10 hours of operation.

My PSU was a cheap locally sourced one so I decided to replace it with a Thermalright TR-ASFX.

Problem persists.

At this stage I have replaced the motherboard + CPU & PSU, tested the RAM with memtest+ (which passed), checked the BIOS settings (C-states were off by default), and removed an SSD which was being used as a cache. All the cabling is brand new. I’ve reseated all the cables at least once.

The only thing I haven’t swapped out at this stage is the OS SSD, which is an old one but passes SMART tests.

I am piping the syslogs to Graylog and nothing shows up out of the ordinary.

Before I upgraded to 25.04 I had the occasional crash. I haven’t had a crash since I upgraded, but the NAS is unusable with this network freeze and I am at a loss as to how to fix it.

I’m inclined to think that it isn’t a HW fault at this stage but a problem with the nics and truenas scale 25.04

Has anyone encountered these nics with 25.04 and do you have problems?

What happens exactly when you say network “freeze”? IPv4 or IPv6 or both, static IP or DHCP? What is TrueNAS connecting to at the other end of the cable?

When I say freeze. Any media streaming stops, the gui is unreachable, I cannot ping it.

Static IPv4 connects to TP-Link Jetstream L2 managed switch

Ok. Any errors in the logs? Do you have a screen connected if so are there any error messages printed to console?

No errors in the console any time I’ve looked.

The closest thing to an error message in the logs (debug level) is:

Device: /dev/sdc [SAT], SMART Prefailure Attribute: 194 Temperature_Celsius changed from 66 to 65

Which is the OS SSD but the disk temp recording in Fangtooth in the gui is broken (known issue) so I’m not putting much credence in this tbh.

Disk doesn’t feel hot to the touch.

Ok that temperature is a bit high - shouldn’t in itself cause this kind of issue, but could indicate a broader temperature/cooling issue which can lead to crashes and weird problems. Do you run TrueNAS bare metal? You could try to boot another os (a minimal Linux) and run e.g. stress-ng in combination with s-tui to see if you can replicate the same issue and whether it’s related to overheating.

Another path to explore would be to use the console and CLI from TrueNAS when your machine “freezes” and check link status etc.
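To make the link-status check concrete, here is a minimal sketch in Python, assuming a Linux system that exposes the usual sysfs files under /sys/class/net. The function name and the interface name in the usage note are hypothetical, so substitute whatever name the console shows for your NIC.

```python
from pathlib import Path


def read_link_status(iface, sysfs_root="/sys/class/net"):
    """Return operstate, carrier and speed for one interface.

    A field is None when its sysfs file is missing or unreadable,
    which also happens for carrier/speed while the interface is down.
    """
    base = Path(sysfs_root) / iface
    status = {}
    for field in ("operstate", "carrier", "speed"):
        try:
            status[field] = (base / field).read_text().strip()
        except OSError:
            status[field] = None
    return status
```

From the console you could call something like read_link_status("enp2s0") during a freeze and compare the result with what you get while everything is working. If the link still reports up with carrier 1, the problem is more likely above the physical link.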

It runs bare metal in a new Jonsbo N4 case.

I didn’t just pick the build out of fresh air. It’s based on this:

https://blog.briancmoses.com/2024/11/diy-nas-2025-edition.html

So I know that with 24.10 it’s been tested to work with the 10GB NIC and pass with flying colours, which is why I chose it.

I foolishly upgraded the ZFS pool and now can’t roll back.

The machine is never under heavy load, it freezes with trivial network traffic of a couple of MB and 3% CPU usage.

The machine hasn’t crashed once since I upgraded.

All of which, after swapping out so much hardware, leads me to believe that this is a 25.04 issue.

If I could just get confirmation that others have this issue or that they don’t then I could discount it but until then…

I’ll check the console next time it happens to see if it has any pertinent info.

Thanks for your suggestions

Ok. Could of course be a kernel issue in 25.04 somehow triggered by / related to your particular hardware. But that would be exceedingly unlucky given that you’ve basically swapped everything and tried two different NICs too. Nevertheless, the fact you say you had random crashes also under 24.10 would point to h/w issues, and that SSD temp is a bit of a smoking gun…

Final idea - I see you’re running virtualisation under TrueNAS also - this is experimental with fairly major changes between every release, so somehow this could be your issue also. Try stopping all containers / VMs for a while and see if you still have problems.

To eliminate the SSD from the puzzle I’ve just ordered a new 64GB Samsung SSD and will swap out the OS SSD with it and check the temp.

I’ve already tried turning off the one lxc and one docker app running to no avail.

I have the 2.5 intel i226 nics (4 of them) on one of the nodes in my proxmox cluster which has my opnsense router running on it and they have never given me a moment’s bother.

I’ve also seen a vendor on Ali mention kernel issues with certain versions of bsd and linux based OSes and these nics so that’s another thing nudging me towards it being 25.04.

On the other hand if this was a widespread problem I wouldn’t be the only one on here asking questions and scratching their head surely?

I just want to fix it and move on

Swapped out the SSD, same issue.

The clean install of 25.04 had disk temp now available in the GUI which was broken in the upgraded version, all temps perfectly normal.

It seems to be a system freeze rather than just a network issue as I can’t access the console via the display port once it freezes.

I’ve now run out of HW to swap out.

I suppose I could test the memory again but it passed when I first set up this build.

Have tried removing SMB, shutting down lxc and docker app. Nothing has worked.

That sucks. I guess you’re then down to the TrueNAS kernel/drivers being buggy with your hardware, for whatever reason. Verify by stress testing for a while under a different OS (you could boot another Linux from a USB stick).

OK. I’ve run Ubuntu 24.04.2 on a USB on the box for 2 days without a system freeze.

Pinged it constantly with no issues, DP port still active.
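For anyone wanting to timestamp when the box drops off the network, a constant ping can be turned into a small up/down log. This is only a sketch, assuming the Linux iputils ping (the -c and -W flags) on the machine doing the pinging. The function names and the address in the usage note are placeholders, not anything from my setup.

```python
import subprocess
import time
from datetime import datetime


def is_reachable(host, timeout_s=1):
    """Send one ICMP echo via the system ping; True if a reply came back."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def state_changes(samples):
    """Yield (timestamp, state) only when reachability flips.

    samples is an iterable of (timestamp, ok) pairs.
    """
    previous = None
    for stamp, ok in samples:
        if ok != previous:
            yield stamp, "up" if ok else "down"
            previous = ok


def watch(host, interval_s=5):
    """Poll forever and print one line per up/down transition."""
    def samples():
        while True:
            stamp = datetime.now().isoformat(timespec="seconds")
            yield stamp, is_reachable(host)
            time.sleep(interval_s)

    for stamp, state in state_changes(samples()):
        print(f"{stamp} {host} {state}", flush=True)
```

Running watch("192.0.2.10") from another machine on the LAN (with the NAS address in place of the placeholder) gives a short list of transition times that can be lined up against the syslog timestamps.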

I’m highly confident at this stage that the issue lies with 25.04 and is not a HW issue.

However, I’m no closer to knowing what the issue is or how to fix it or if in fact there is anything I can do except wait for the next point release and hope for the best.

How long do I wait before I have to look at new HW and how can I be sure that any new HW will continue to function after an upgrade?

Don’t know if that’s good or bad news… I guess the good news is you’ve narrowed it down to software at least, and maybe specifically TrueNAS 25.04. You could try raising a bug ticket with iX, but they will likely expect you to be able to provide some kind of smoking gun: error messages, specific trigger scenarios, etc.
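One way to hunt for that kind of evidence is to filter a saved kernel log for lines mentioning the NIC drivers or typical stall messages. The sketch below assumes you have the log as plain text (for example from the syslog server, or from journalctl -k -b -1 if the journal survives the reboot, which I have not verified on TrueNAS). The marker list is my own guess at useful strings: igc and atlantic are the Linux drivers for the I226 and the AQC113 respectively, and the rest are common kernel complaints, not an exhaustive set.

```python
import re

# Assumed markers, not an official list: the two NIC driver names plus
# a few generic kernel messages that often accompany stalls.
SUSPECT = re.compile(
    r"\b(igc|atlantic)\b|NETDEV WATCHDOG|timed out|hung_task"
    r"|blocked for more than|Call Trace",
    re.IGNORECASE,
)


def suspicious_lines(log_lines):
    """Return the lines that match any suspect marker, in original order."""
    return [line.rstrip("\n") for line in log_lines if SUSPECT.search(line)]
```

Usage would be along the lines of suspicious_lines(open("kernel.log")), where kernel.log is a hypothetical file name. Anything it turns up shortly before a freeze would be worth attaching to a ticket.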

That’s precisely the issue now.

The closest I’ve been able to come to pinpointing a known issue is this from the official Topton Store:

Intel i226-V is a new network card, suggest to install newest version of following systems with supportable kernel: pfSense-CE 2.7, OPNsense 23.x, pfSense Plus, OpenWrt, ESXi, Proxmox 8.x, CentOS etc. Please make clear before buying.

I run Proxmox 8.4 and the latest OPNsense virtualised on these nics without issue.

If I can’t get a resolution I’m looking at a new board and chip at least, or a data migration and roll back to 24.10, neither of which is an appealing prospect.

Or virtualising truenas in proxmox? Depending on what the rest of your hardware is. Add complexity in some ways but works well for many and sounds like it could stabilise things for you from the comment above.

I ran Truenas 24.10 virtualised on Proxmox 8.3 as a test of TN before I committed to a bare metal install and it seemed to run perfectly.

I did however read a good few comments warning against it saying people had lost access to the TN when upgrading proxmox which is what swayed me to a bare metal install.

It could indeed be a solution to my predicament which I hadn’t thought of though, so thank you.

I’ll read up on virtualizing it again and the pros and cons while I wait for the point release.

There are definitely pitfalls. The most common seems to be the ZFS pools being mounted by both the Proxmox host and TrueNAS simultaneously, which quickly leads to corruption. This is avoided by ensuring the disk controller is passed through to the TrueNAS VM and ideally blacklisted for Proxmox so it can’t be mounted even before the VM starts. An internet search will tell you how.
