The problem:
I connect to a virtual machine (172.18.0.2, on a virtual bridge) either via ssh ProxyJump through truenas from my local machine, or via a separate ssh session to truenas and from there to 172.18.0.2. I open nano, edit a file, save and close it. When I then use the up/down arrows to navigate the terminal history, the connection is eventually closed. Sometimes this already happens while editing the file in nano.
The Error:
Read from remote host x.x.10.4: Operation timed out
client_loop: send disconnect: Broken pipe
Connection to 172.18.0.2 closed by remote host.
Connection to 172.18.0.2 closed.
client_loop: send disconnect: Broken pipe
Additional information:
TrueNAS Scale version: 25.10.2.1
tcpdump on truenas showed a clean TCP FIN exchange initiated by truenas (172.18.0.1) toward the VM. This means the OS is intentionally closing the session, not crashing.
Connecting via ssh from my local machine directly to the IP my router assigned to the VM (x.x.12.x) works as expected, without any problems.
I have tried to debug/fix this extensively using Google Gemini, but without any success. Feels like there’s some bug somewhere, hence why I feel the need to create my first post here.
Has anyone experienced a similar problem or solved it, or does anyone have an idea what else I can try and/or how to debug this further?
Thanks in advance
Full summary of investigation from Google Gemini:
Here is a summary of our investigation into the SSH “Broken pipe” issue occurring when connecting to an Ubuntu VM (172.18.0.2) via a TrueNAS SCALE Bastion (x.x.10.4).
The Symptom
- The Trigger: Closing the `nano` editor and immediately browsing shell history (up-arrow) causes the connection to hang and then drop.
- The Error: `Read from remote host: Connection reset by peer` followed by `client_loop: send disconnect: Broken pipe`.
- The Core Discovery: `tcpdump` on the Bastion showed a clean TCP FIN exchange initiated by the Bastion (172.18.0.1) toward the VM. This means the OS is intentionally closing the session, not crashing.
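To see which side tears the session down, a capture limited to FIN and RST segments keeps the output readable. This is only a sketch: it assumes root on the TrueNAS host, `br2` as the VM-facing bridge and sshd in the VM listening on port 22. The function name `watch_teardown` is made up for illustration.

```shell
# Hypothetical helper: print only TCP FIN/RST segments of SSH traffic to/from the VM.
# Assumes root on the TrueNAS host and sshd on port 22 inside the VM.
watch_teardown() {
  iface="$1"   # bridge to capture on, e.g. br2
  vm_ip="$2"   # VM address, e.g. 172.18.0.2
  tcpdump -n -i "$iface" \
    "host $vm_ip and tcp port 22 and (tcp[tcpflags] & (tcp-fin|tcp-rst) != 0)"
}

# Usage: watch_teardown br2 172.18.0.2
```

The source address of the first FIN or RST that shows up is the side that closed the connection.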
What We Investigated & Ruled Out
1. Client-Side (macOS OpenSSH 9.9):
- Attempted to disable `ObscureKeystrokeTiming` (incompatibility between 9.9 and 9.2).
- Switched from `ProxyJump` to `ProxyCommand` to bypass modern tunnel handling.
- Adjusted `ServerAliveInterval` and `TCPKeepAlive` settings.
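For reference, the client-side settings above can be passed explicitly on the command line, which rules out a typo in `~/.ssh/config`. OpenSSH names the keystroke-timing option `ObscureKeystrokeTiming`. The wrapper name, the user names and the interval values below are assumptions for illustration, not what was actually used.

```shell
# Hypothetical wrapper: connect through the bastion with the client-side options set explicitly.
# The interval/count values are illustrative assumptions.
ssh_via_bastion() {
  bastion="$1"   # e.g. admin@<TrueNAS address>
  target="$2"    # e.g. user@172.18.0.2
  ssh -v \
    -o ObscureKeystrokeTiming=no \
    -o ServerAliveInterval=15 \
    -o ServerAliveCountMax=3 \
    -o TCPKeepAlive=yes \
    -J "$bastion" "$target"
}

# Usage: ssh_via_bastion admin@<TrueNAS address> user@172.18.0.2
```

Note that, per the ssh man page, options given on the command line generally apply to the destination host and not to the jump host. To change the behaviour of the leg between the local machine and the bastion, the same options have to go into a `Host` block for the bastion in `~/.ssh/config`.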
2. Network Layer (MTU & Offloading):
- Lowered MTU on the Client, the Bastion bridge (`br2`), and the VM interface to 1450/1400/1300 to rule out fragmentation.
- Disabled hardware offloading (`TSO`, `GSO`, `TX-checksumming`) on the virtual bridge `br2` using `ethtool`.
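The MTU and offload changes described above correspond roughly to the following commands. This is a sketch assuming root on the TrueNAS host; the function name is hypothetical, and the settings are runtime-only, so they do not survive a reboot.

```shell
# Hypothetical helper: lower the MTU and switch off segmentation/checksum offload.
# Runtime-only changes; assumes root.
relax_iface() {
  iface="$1"   # e.g. br2
  mtu="$2"     # e.g. 1400
  ip link set dev "$iface" mtu "$mtu"
  ethtool -K "$iface" tso off gso off tx off
  # Print the resulting feature state so the change can be verified
  ethtool -k "$iface"
}

# Usage: relax_iface br2 1400
```

Whether a given MTU actually passes can be checked from the TrueNAS host with a non-fragmenting ping whose payload is the MTU minus 28 bytes (20 bytes IP header plus 8 bytes ICMP header), for example `ping -M do -s 1372 172.18.0.2` for an MTU of 1400.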
3. Virtualization Layer (KVM/QEMU):
- Switched the VM network adapter from `VirtIO` to `E1000` to rule out driver-specific buffering issues.
- Investigated `split_lock_mitigate` kernel traps after finding `#AC: CPU 1/KVM` errors in the kernel logs.
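To correlate the split-lock traps with the moment a session drops, the kernel log can be filtered for the relevant messages. A minimal sketch, assuming root on the TrueNAS host; the filter name is made up, and the patterns are based on the `#AC: CPU 1/KVM` line quoted above.

```shell
# Hypothetical filter: keep only split-lock related kernel messages from stdin.
split_lock_events() {
  grep -E '#AC:|split_lock|split lock'
}

# Usage (as root on the TrueNAS host):
#   dmesg -T | split_lock_events
#   journalctl -k | split_lock_events
# The current mitigation setting, on kernels that expose it:
#   sysctl kernel.split_lock_mitigate
```

If the timestamps of these messages do not line up with the disconnects, the traps are probably unrelated noise.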
4. OS/Environment (TrueNAS SCALE/Debian):
- Created a new “clean” user to bypass any `admin` account shell restrictions or middleware hooks.
- Verified that `TMOUT` was not set and that `IPQoS` settings were not causing priority conflicts.
- Ruled out IP conflicts with Docker (which uses the 172.16.x.x range).
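A subnet clash with Docker can be checked mechanically rather than by eye. The sketch below is a small POSIX shell helper that tests whether an address falls inside a CIDR block. Both function names are made up for illustration, and the subnets to feed it would come from `docker network inspect` on the TrueNAS host.

```shell
# Hypothetical helpers: test whether an IPv4 address lies inside a CIDR block.
ip_to_int() (
  # Split the dotted quad on "." and pack the four octets into one integer
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

in_cidr() (
  ip_int=$(ip_to_int "$1")        # address to test, e.g. 172.18.0.2
  net_int=$(ip_to_int "${2%/*}")  # network part of the CIDR, e.g. 172.18.0.0
  bits=${2#*/}                    # prefix length, e.g. 16
  mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  [ $(( $ip_int & $mask )) -eq $(( $net_int & $mask )) ]
)

# Usage: list the subnets Docker has allocated, then test each one against the VM address:
#   docker network inspect --format '{{.Name}} {{range .IPAM.Config}}{{.Subnet}} {{end}}' $(docker network ls -q)
#   in_cidr 172.18.0.2 <subnet from the list> && echo "overlap"
```

An overlap would mean the host has two candidate routes for 172.18.0.2, so the check is worth repeating after any app is installed or updated.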
The Conclusion
The issue is specific to the internal virtual bridge (br2) on TrueNAS SCALE which has no physical members. When a burst of interactive TTY data occurs (like exiting a full-screen app), the Linux kernel’s connection tracking or bridge logic on SCALE perceives a state violation and terminates the TCP session.
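The connection-tracking part of this theory can be tested instead of assumed. The following is a sketch, assuming root on the TrueNAS host and that the `nf_conntrack` module is loaded; the function names are hypothetical, and logging may additionally require a netfilter log backend to be available.

```shell
# Hypothetical helpers for testing the conntrack/bridge theory (run as root on TrueNAS).
show_bridge_netfilter() {
  # These keys only exist while the br_netfilter module is loaded;
  # a "cannot stat" error means the bridge layer is not handing frames to iptables.
  sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables
}

log_invalid_tcp() {
  # 6 is the IP protocol number of TCP; packets that conntrack
  # classifies as invalid are then reported in the kernel log.
  sysctl -w net.netfilter.nf_conntrack_log_invalid=6
}

# Usage: log_invalid_tcp, reproduce the disconnect, then inspect `dmesg`.
```

If nothing is logged around the time of a disconnect, conntrack is an unlikely culprit.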
Network and virtual machine setup:
Network interfaces:
- enp7s0: physical interface/port; carries untagged vlan 10 and tagged vlan 11, 12, 30 via the switch and makes truenas available via x.x.10.4 on the network
- br2: bridge with ip 172.18.0.1/24 and no bridge members
- vlan12: vlan with vlan tag 12 without any static ip
- br3: bridge with vlan12 as bridge member without any static ip
Ubuntu 24.04 virtual machine:
Network devices:
- Adapter Type: VirtIO
  MAC Address: 00:a0:98:74:d0:25
  NIC To Attach: br2
  Trust Guest Filters: not checked
  Device Order: 1002
- Adapter Type: VirtIO
  MAC Address: 00:a0:98:51:9e:17
  NIC To Attach: br3
  Trust Guest Filters: not checked
  Device Order: 1003
Ubuntu uses the following netplan config:
network:
  version: 2
  ethernets:
    eth0:
      match:
        macaddress: "00:a0:98:74:d0:25"
      set-name: eth0
      dhcp4: false
      addresses:
        - 172.18.0.2/24
      nameservers:
        addresses:
          - 172.18.0.1
    eth1:
      match:
        macaddress: "00:a0:98:51:9e:17"
      set-name: eth1
      dhcp4: true
eth1 gets x.x.12.101 assigned via DHCP from my router, and the VM can ping truenas via 172.18.0.1 and x.x.10.4, access the internet, etc.
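Since the VM has two interfaces, it can help to confirm which one it uses to reach each TrueNAS address. A minimal sketch to run inside the VM; the function name is made up, and it simply extracts the `dev` field from `ip route get`.

```shell
# Hypothetical helper: print the interface the kernel would use to reach an address.
route_dev() {
  ip route get "$1" |
    awk '{ for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit } }'
}

# Usage (inside the VM):
#   route_dev 172.18.0.1
#   route_dev <TrueNAS address on vlan 10>
```

With the netplan config above, 172.18.0.1 should resolve to eth0, while anything outside 172.18.0.0/24 should leave via eth1, which holds the DHCP-provided default route.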