Hello!
On 24.10.2, I had almost 2 months of uptime.
After upgrading to 25.04-RC.1, my TrueNAS hangs every morning.
Should I return to 24.10.2?
Thanks!
After searching on Google, I found the following solutions, with some small variations between them:
ethtool -K eno1 tx off rx off
ethtool -K eno1 tx off rx off gso off gro off tso off
ethtool -K eno1 tso off gso off
I'm not sure which one I should use.
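Before picking one of the variants, it may help to see which offloads are currently enabled on the NIC. The sketch below is only an illustration: `show_offloads` is a hypothetical helper name, and it assumes the usual `ethtool -k` (lowercase k, "show features") output where each top-level feature is printed as `name: on` or `name: off` at the start of a line.

```shell
#!/bin/sh
# Hypothetical helper: reads `ethtool -k` output on stdin and keeps only
# the top-level features that the three suggested commands would change
# (rx/tx checksumming, TSO, GSO, GRO).
show_offloads() {
    grep -E '^(rx-checksumming|tx-checksumming|tcp-segmentation-offload|generic-segmentation-offload|generic-receive-offload):'
}

# Usage on the NAS (interface name eno1 taken from the thread):
#   sudo ethtool -k eno1 | show_offloads
```

Running it before and after an `ethtool -K` change shows which features actually flipped. Changes made with `ethtool -K` apply to the running system only and are lost on reboot, which also makes them easy to undo while testing.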
You should provide more details about your system, as they can make a big difference in the answer. Please refer to the link below called Joe's Rules.
As for rolling back, that is probably the soundest thing to do. However, if the hang is repeatable, you should file a Jira ticket through the TrueNAS GUI to report the failure.
Those commands look like flow controls for the NIC. Not sure which one, if any, will help you.
This is the safest way to test these commands out.
Good luck.
admin@truenas[~]$ sudo ethtool -i eno1
driver: e1000e
version: 6.12.15-production+truenas
firmware-version: 0.2-4
expansion-rom-version:
bus-info: 0000:00:1f.6
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
admin@truenas[~]$ sudo lspci
00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 0a)
00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630]
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10)
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10)
00:1f.0 ISA bridge: Intel Corporation Z390 Chipset LPC/eSPI Controller (rev 10)
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
00:1f.5 Serial bus controller: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
admin@truenas[~]$ sudo lspci -v
00:00.0 Host bridge: Intel Corporation 8th Gen Core Processor Host Bridge/DRAM Registers (rev 0a)
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] 8th Gen Core Processor Host Bridge/DRAM Registers
Flags: bus master, fast devsel, latency 0
Capabilities: [e0] Vendor Specific Information: Len=10 <?>
Kernel driver in use: skl_uncore
Kernel modules: ie31200_edac
00:02.0 VGA compatible controller: Intel Corporation CoffeeLake-S GT2 [UHD Graphics 630] (prog-if 00 [VGA controller])
DeviceName: Onboard - Video
Subsystem: Micro-Star International Co., Ltd. [MSI] CoffeeLake-S GT2 [UHD Graphics 630]
Flags: bus master, fast devsel, latency 0, IRQ 124
Memory at a0000000 (64-bit, non-prefetchable) [size=16M]
Memory at 90000000 (64-bit, prefetchable) [size=256M]
I/O ports at 3000 [size=64]
Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
Capabilities: [40] Vendor Specific Information: Len=0c <?>
Capabilities: [70] Express Root Complex Integrated Endpoint, MSI 00
Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [100] Process Address Space ID (PASID)
Capabilities: [200] Address Translation Service (ATS)
Capabilities: [300] Page Request Interface (PRI)
Kernel driver in use: i915
Kernel modules: i915
00:08.0 System peripheral: Intel Corporation Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Xeon E3-1200 v5/v6 / E3-1500 v5 / 6th/7th/8th Gen Core Processor Gaussian Mixture Model
Flags: fast devsel, IRQ 255
Memory at a103a000 (64-bit, non-prefetchable) [disabled] [size=4K]
Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
Capabilities: [dc] Power Management version 2
Capabilities: [f0] PCI Advanced Features
00:12.0 Signal processing controller: Intel Corporation Cannon Lake PCH Thermal Controller (rev 10)
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Cannon Lake PCH Thermal Controller
Flags: fast devsel, IRQ 16
Memory at a1039000 (64-bit, non-prefetchable) [size=4K]
Capabilities: [50] Power Management version 3
Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
Kernel driver in use: intel_pch_thermal
Kernel modules: intel_pch_thermal
00:14.0 USB controller: Intel Corporation Cannon Lake PCH USB 3.1 xHCI Host Controller (rev 10) (prog-if 30 [XHCI])
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Cannon Lake PCH USB 3.1 xHCI Host Controller
Flags: bus master, medium devsel, latency 0, IRQ 121
Memory at a1020000 (64-bit, non-prefetchable) [size=64K]
Capabilities: [70] Power Management version 2
Capabilities: [80] MSI: Enable+ Count=1/8 Maskable- 64bit+
Capabilities: [90] Vendor Specific Information: Len=14 <?>
Kernel driver in use: xhci_hcd
Kernel modules: xhci_pci
00:14.2 RAM memory: Intel Corporation Cannon Lake PCH Shared SRAM (rev 10)
DeviceName: Onboard - Other
Subsystem: Intel Corporation Cannon Lake PCH Shared SRAM
Flags: fast devsel
Memory at a1032000 (64-bit, non-prefetchable) [disabled] [size=8K]
Memory at a1038000 (64-bit, non-prefetchable) [disabled] [size=4K]
Capabilities: [80] Power Management version 3
00:16.0 Communication controller: Intel Corporation Cannon Lake PCH HECI Controller (rev 10)
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Cannon Lake PCH HECI Controller
Flags: bus master, fast devsel, latency 0, IRQ 123
Memory at a1037000 (64-bit, non-prefetchable) [size=4K]
Capabilities: [50] Power Management version 3
Capabilities: [8c] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [a4] Vendor Specific Information: Len=14 <?>
Kernel driver in use: mei_me
Kernel modules: mei_me
00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI Controller (rev 10) (prog-if 01 [AHCI 1.0])
DeviceName: Onboard - SATA
Subsystem: Micro-Star International Co., Ltd. [MSI] Cannon Lake PCH SATA AHCI Controller
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 122
Memory at a1030000 (32-bit, non-prefetchable) [size=8K]
Memory at a1036000 (32-bit, non-prefetchable) [size=256]
I/O ports at 3090 [size=8]
I/O ports at 3080 [size=4]
I/O ports at 3060 [size=32]
Memory at a1035000 (32-bit, non-prefetchable) [size=2K]
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [70] Power Management version 3
Capabilities: [a8] SATA HBA v1.0
Kernel driver in use: ahci
Kernel modules: ahci
00:1f.0 ISA bridge: Intel Corporation Z390 Chipset LPC/eSPI Controller (rev 10)
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Z390 Chipset LPC/eSPI Controller
Flags: bus master, medium devsel, latency 0
00:1f.4 SMBus: Intel Corporation Cannon Lake PCH SMBus Controller (rev 10)
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Cannon Lake PCH SMBus Controller
Flags: medium devsel, IRQ 16
Memory at a1034000 (64-bit, non-prefetchable) [size=256]
I/O ports at efa0 [size=32]
Kernel driver in use: i801_smbus
Kernel modules: i2c_i801
00:1f.5 Serial bus controller: Intel Corporation Cannon Lake PCH SPI Controller (rev 10)
DeviceName: Onboard - Other
Subsystem: Micro-Star International Co., Ltd. [MSI] Cannon Lake PCH SPI Controller
Flags: fast devsel
Memory at fe010000 (32-bit, non-prefetchable) [size=4K]
00:1f.6 Ethernet controller: Intel Corporation Ethernet Connection (7) I219-V (rev 10)
DeviceName: Onboard - Ethernet
Subsystem: Micro-Star International Co., Ltd. [MSI] Ethernet Connection (7) I219-V
Flags: bus master, fast devsel, latency 0, IRQ 120
Memory at a1000000 (32-bit, non-prefetchable) [size=128K]
Capabilities: [c8] Power Management version 3
Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
Kernel driver in use: e1000e
Kernel modules: e1000e
admin@truenas[~]$ sudo ifconfig -a
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet 10.0.0.8 netmask 255.255.255.0 broadcast 10.0.0.255
inet6 fd98:e1d2:7c83:0:2002:59ff:febf:d90c prefixlen 64 scopeid 0x0<global>
inet6 2804:1b2:9500:5838:2002:59ff:febf:d90c prefixlen 64 scopeid 0x0<global>
inet6 fe80::dcd7:d8ff:fefe:7317 prefixlen 64 scopeid 0x20<link>
ether xx:xx:xx:xx:xx:xx txqueuelen 1000 (Ethernet)
RX packets 8355492 bytes 12044168172 (11.2 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 5966443 bytes 81461499093 (75.8 GiB)
TX errors 0 dropped 16 overruns 0 carrier 0 collisions 0
eno1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::2d8:61ff:fe0e:f637 prefixlen 64 scopeid 0x20<link>
ether xx:xx:xx:xx:xx:xx txqueuelen 1000 (Ethernet)
RX packets 12474157 bytes 15221788861 (14.1 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 6564442 bytes 2260505979 (2.1 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
device interrupt 16 memory 0xa1000000-a1020000
incusbr0: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
inet 10.162.23.1 netmask 255.255.255.0 broadcast 0.0.0.0
inet6 fd42:fa96:99d6:155f::1 prefixlen 64 scopeid 0x0<global>
ether xx:xx:xx:xx:xx:xx txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 88 overruns 0 carrier 0 collisions 0
lo: flags=73<UP,LOOPBACK,RUNNING> mtu 65536
inet 127.0.0.1 netmask 255.0.0.0
inet6 ::1 prefixlen 128 scopeid 0x10<host>
loop txqueuelen 1000 (Local Loopback)
RX packets 72513 bytes 45960468 (43.8 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 72513 bytes 45960468 (43.8 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
tailscale0: flags=4305<UP,POINTOPOINT,RUNNING,NOARP,MULTICAST> mtu 1280
inet yyy.yyy.yyy.yyy netmask 255.255.255.255 destination yyy.yyy.yyy.yyy
unspec 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00 txqueuelen 500 (UNSPEC)
RX packets 151048 bytes 8044146 (7.6 MiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 207688 bytes 328009213 (312.8 MiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
tapda261bbb: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
ether xx:xx:xx:xx:xx:xx txqueuelen 1000 (Ethernet)
RX packets 2035808 bytes 1517024387 (1.4 GiB)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 4006434 bytes 83672060940 (77.9 GiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
vethe57e653: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet6 fe80::88ce:8cff:fe44:bfd prefixlen 64 scopeid 0x20<link>
ether xx:xx:xx:xx:xx:xx txqueuelen 0 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 1297 bytes 614353 (599.9 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
I just did ethtool -K eno1 tso off gso off. No problem... so far.
If it lasts 1 week, then you likely fixed it. I say 1 week because 1 day is not enough to positively state it is fixed.
Let us know how it turns out.
The command disables hardware offload, so expect somewhat lower performance, but otherwise it should be safe. The question is whether the driver in the new version started to use some functionality that is buggy in hardware, or whether it is a regression in the kernel driver itself. Quite likely it is model-specific, so I'd search for feedback specific to the model and the 6.12 kernel version.
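To judge whether the workaround is holding, one option is to count hang reports in the kernel log. This is a rough sketch, not a definitive check: `count_unit_hangs` is a hypothetical helper name, and the search string "Detected Hardware Unit Hang" is an assumption based on the message the e1000e driver is commonly reported to log in this situation.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints how many
# lines contain the e1000e hang message (assumed wording, see above).
# `|| true` keeps the exit status at 0 when grep finds no matches.
count_unit_hangs() {
    grep -c 'Detected Hardware Unit Hang' || true
}

# Usage on the NAS:
#   sudo dmesg | count_unit_hangs
```

A count that stays at 0 over several days with TSO and GSO off would be consistent with the offload path being the trigger, though it does not say whether the fault is in the hardware or the driver.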
@joeschmuck @mav, ok, I'm gonna wait 1 week to see if this fixed it.
Anyway: 118721 - e1000e hardware unit hangs when TSO is on
From comment #11 onwards, the reports are identical to mine: everything worked fine on kernel 6.6, but the problems started after updating to kernel 6.12.
Ticket created: Jira
Thanks, guys!
@protoman It would be good to bump the mentioned Linux kernel issue to keep them aware. We'd be happy to include whatever upstream patches it produces, but we have no realistic means to debug it ourselves.
Uptime: 6 days 17 hours 2 minutes as of 09:26.
I think the problem is solved...