OpenVPN networking after upgrading pools (feature flags)

I am running a TrueNAS Scale server (Dragonfish-24.04.2.2), which was upgraded to Scale from Core. On it I have an Ubuntu VM set up as an OpenVPN server. I also have another, remote TrueNAS server running Core (12.0-U8.1). It has an OpenVPN client service connected to the OpenVPN server on my Scale side.

Before I upgraded the feature flags on my Scale server, the two servers communicated fine over the OpenVPN network; I use it to replicate snapshots between them. After the upgrade (of feature flags), the Scale cannot reach the Core server on its OpenVPN IP address, and neither can another OpenVPN client connected to the same OpenVPN server.

From my laptop, connected to the same LAN as my TrueNAS Scale, I can ping and trace to my OpenVPN server, using both the LAN IP and the OpenVPN server IP.

From my OpenVPN server VM, I can ping the Core server and other OpenVPN clients just fine. It is as if something is suddenly missing on the OpenVPN server between the interfaces (eth and tun), and I can't ping through them. I hope someone understands what I am trying to say. Does anyone know why this is happening, and what I can do to fix it?

I'm not sure which parts of the config are of interest, so I won't paste all of it now, but of course I will post relevant bits upon request.

I may have discovered something. I think the routing table on the OpenVPN server looks a little weird:

root@openvpn:/var/log/openvpn# route
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         USG             0.0.0.0         UG    100    0        0 ens3
10.8.0.0        10.8.0.2        255.255.255.0   UG    0      0        0 tun0
10.8.0.0        0.0.0.0         255.255.255.0   U     0      0        0 tun0
192.168.32.0    0.0.0.0         255.255.255.0   U     100    0        0 ens3
USG             0.0.0.0         255.255.255.255 UH    100    0        0 ens3

192.168.32.0 is my LAN
10.8.0.0 is my OpenVPN network

The OpenVPN server has 10.8.0.1.

I'm a little out of my depth here, and I don't want to make changes until I know exactly what needs to change. Would you say the above looks normal, and if not, what should I change?

Edit: I totally forgot the Scale was not running the OVPN server; it's running on a separate VM.

The questions below reflect my "lack of memory".

There's some mismatch in your statements above (at least from a network perspective):

"Scale can not reach Core"

"From Scale I can ping Core"

I assume your Core machine has the 10.8.0.2 IP, and that OpenVPN is connected between the Scale (10.8.0.1) and Core (10.8.0.2) via tun0.

1: TL;DR: Probably unrelated, but …

That makes me wonder a bit that you can ping the Scale OVPN IP (10.8.0.1).
Technically your laptop would have its def-gw on the USG, I assume, and should send packets destined for 10.8.x.x to the def-gw (unless you have specifically set a route to 10.8.0.0/24 on the lappy, or have fired up an OVPN client to connect to the Scale).
So unless the Scale (Debian) is doing proxy-ARP, there shouldn't be an answer from the Scale OpenVPN IP.
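For reference, a minimal sketch of what such a laptop route could look like, assuming a Linux laptop and using the LAN IP of the box running the OVPN server as the next hop (the address below is just an illustrative placeholder):

# hypothetical example: send the VPN subnet via the OVPN server's LAN address
# instead of the default gateway (replace 192.168.32.42 with the real LAN IP)
sudo ip route add 10.8.0.0/24 via 192.168.32.42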

2:
How are you referring from the Core to the Scale when pinging?
Via 10.8.0.2 or 192.168.32.x (the Scale LAN IP)?

Does it work if you (from Core) ping the OpenVPN IP of the Scale (10.8.0.1)?

In order for the Core to route traffic to the Scale LAN IP (or another LAN), the Core has to be told (a route) "somehow" that 192.168.32.x is "behind" 10.8.0.1 (the Scale VPN IP), or the Core VPN client should have its def-gw via the Scale (an OVPN option).

If the Core doesn't know the route to your local LAN (192.168.32.0/24), then it wouldn't be able to, e.g., answer a ping from your lappy (or Scale LAN), as the answer would go to the Core's "local" def-gw.
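As a quick sanity check on the Core (FreeBSD) side, a sketch of how to see whether such a route exists and how to add one temporarily for testing (the manual route is not persistent and not the permanent fix):

# does Core know a route to the Scale LAN? (no output = no route)
netstat -rn | grep 192.168.32

# temporary manual test route via the Scale VPN IP (gone after a reboot)
route add -net 192.168.32.0/24 10.8.0.1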

3:
If I were you, I'd try to ping from Core to Scale (but to 10.8.0.1), to see if that is successful … Unless you have some kind of firewall it should be, when the OVPN client is connected, since Scale & Core would be "directly connected" via the 10.8.0.0/24 net.

If that works …
Then OVPN works, and you probably have a routing issue on Core, if you need to access any IP on the Scale LAN.

A routing table from the Core machine (when OVPN is connected to the Scale) would be nice.

And a list of IP interfaces (ip a) on both machines would be nice too.

My money is on (if no firewall is involved): the Core machine doesn't know how to route packets to your Scale LAN … a.k.a. a missing route to 192.168.32.0/24.

Wow that was quick. And extensive too. Thanks! Let me try to reply to all the parts:

From the Scale, I cannot ping the Core server. However, from the OpenVPN VM, I can.

Not quite. The Core has 10.8.0.20. 10.8.0.1 is the OpenVPN server VM. It is using tun0.
A ping to 10.8.0.2 gives no response. Not being used. At all.

Yes, there is a static route set up in my router (USG) over to the 10.8.0.0/24 network.

Yes. When connected to OpenVPN of course.

On the Core side, there is no routing back to my 192.168.32.0/24 network. But there has never been, and it has worked for years. The only thing that has changed is that I upgraded the feature flags on the Scale side. Could this really be the cause?

Routing table on Core:

Internet:
Destination        Gateway            Flags     Netif Expire
default            192.168.0.1        UGS         em0
10.8.0.1           link#4             UH         tun0
10.8.0.20          link#4             UHS         lo0
127.0.0.1          lo0                UHS         lo0
192.168.0.0/24     link#1             U           em0
192.168.0.20       link#1             UHS         lo0

ip a on Scale:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: eno1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel master br0 state UP group default qlen 1000
    link/ether 64:51:06:50:8c:84 brd ff:ff:ff:ff:ff:ff
    altname enp0s25
    inet6 fe80::6651:6ff:fe50:8c84/64 scope link
       valid_lft forever preferred_lft forever
3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 92:15:78:b0:bf:23 brd ff:ff:ff:ff:ff:ff
    inet 192.168.32.167/24 brd 192.168.32.255 scope global br0
       valid_lft forever preferred_lft forever
4: vnet0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UNKNOWN group default qlen 1000
    link/ether fe:a0:98:42:c7:56 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fca0:98ff:fe42:c756/64 scope link
       valid_lft forever preferred_lft forever
5: vnet1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UNKNOWN group default qlen 1000
    link/ether fe:a0:98:2a:81:85 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fca0:98ff:fe2a:8185/64 scope link
       valid_lft forever preferred_lft forever
7: vnet3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master br0 state UNKNOWN group default qlen 1000
    link/ether fe:a0:98:0b:79:0a brd ff:ff:ff:ff:ff:ff
    inet6 fe80::fca0:98ff:fe0b:790a/64 scope link
       valid_lft forever preferred_lft forever

Not sure how to accomplish that on the Core?

Did it clarify anything?

TL;DR: Only relevant if you use any 192.168.32.x IP addresses to access Core.

Here's the OpenVPN doc on:

Expanding the scope of the VPN to include additional machines on either the client or server subnet.

In your server.conf (Scale), you would need to "push unknown LANs" to the client,

as seen in the example server.conf here:

 # Push routes to the client to allow it
# to reach other private subnets behind
# the server.  Remember that these
# private subnets will also need
# to know to route the OpenVPN client
# address pool (10.8.0.0/255.255.255.0)
# back to the OpenVPN server.
;push "route 192.168.10.0 255.255.255.0"
;push "route 192.168.20.0 255.255.255.0"

Edit: Remember to remove the semicolons on the push commands to "uncomment" them.

My bad :roll_eyes:
I totally forgot you had a VM running the OVPN Server.

Let's start over …

So Scale (192.168.32.167) is supposed to communicate with Core (10.8.0.20) via the OVPN VM (10.8.0.1) on those exact IP addresses … correct?

Edit: Scale IF = br0; is Scale a VM, or does TrueNAS also use bridges as its main Ethernet IFs?
I'm a super TrueNAS beginner … hopefully installing my own machine next week :blush:

I'm no FreeBSD guru, but I can get a route table on my pfSense (FreeBSD) with this command: netstat -ar

And an interface list with this one: ifconfig

And a route table and interface list on the OVPN VM would be nice too …
With Core connected to the OVPN server. In fact, every time I mention route tables, it should be with Core connected to the OVPN server VM.
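On the OVPN VM (Ubuntu), a sketch of the Linux equivalents of the two FreeBSD commands above:

ip route show   # routing table (the older "route -n" works too, as used earlier)
ip a            # interface list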

You mention that 10.8.0.2 isn't used at all, but your OVPN server route table clearly shows it being used as the gateway to reach 10.8.0.0/24.

Correct.

Scale is not virtualized. The main network adapter is bridged.

netstat -ar on Core:

Internet:
Destination        Gateway            Flags     Netif Expire
default            minrouter.home     UGS         em0
10.8.0.1           link#4             UH         tun0
10.8.0.20          link#4             UHS         lo0
localhost          lo0                UHS         lo0
192.168.0.0/24     link#1             U           em0
192.168.0.20       link#1             UHS         lo0

and ifconfig on Core:

em0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
        options=81249b<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,VLAN_HWCSUM,LRO,WOL_MAGIC,VLAN_HWFILTER>
        ether dc:4a:3e:73:bd:d6
        inet 192.168.0.20 netmask 0xffffff00 broadcast 192.168.0.255
        media: Ethernet autoselect (1000baseT <full-duplex>)
        status: active
        nd6 options=1<PERFORMNUD>
lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> metric 0 mtu 16384
        options=680003<RXCSUM,TXCSUM,LINKSTATE,RXCSUM_IPV6,TXCSUM_IPV6>
        inet6 ::1 prefixlen 128
        inet6 fe80::1%lo0 prefixlen 64 scopeid 0x2
        inet 127.0.0.1 netmask 0xff000000
        groups: lo
        nd6 options=21<PERFORMNUD,AUTO_LINKLOCAL>
pflog0: flags=0<> metric 0 mtu 33160
        groups: pflog
tun0: flags=8051<UP,POINTOPOINT,RUNNING,MULTICAST> metric 0 mtu 1500
        options=80000<LINKSTATE>
        inet 10.8.0.20 --> 10.8.0.1 netmask 0xffffff00
        groups: tun
        nd6 options=1<PERFORMNUD>
        Opened by PID 1518

Route table from the OVPN server:

Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
default         USG             0.0.0.0         UG    100    0        0 ens3
10.8.0.0        10.8.0.2        255.255.255.0   UG    0      0        0 tun0
10.8.0.0        0.0.0.0         255.255.255.0   U     0      0        0 tun0
192.168.32.0    0.0.0.0         255.255.255.0   U     100    0        0 ens3
USG             0.0.0.0         255.255.255.255 UH    100    0        0 ens3

ip a on ovpn server:

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host noprefixroute
       valid_lft forever preferred_lft forever
2: ens3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 00:a0:98:0b:79:0a brd ff:ff:ff:ff:ff:ff
    altname enp0s3
    inet 192.168.32.42/24 metric 100 brd 192.168.32.255 scope global dynamic ens3
       valid_lft 53767sec preferred_lft 53767sec
    inet6 fe80::2a0:98ff:fe0b:790a/64 scope link
       valid_lft forever preferred_lft forever
3: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 10.8.0.1/24 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::ef7d:6372:39b3:3945/64 scope link stable-privacy
       valid_lft forever preferred_lft forever

If 10.8.0.2 is used, it must be something that OpenVPN does and/or needs on the server; there is no client with that IP. Is it incorrect and should it be removed? That was my feeling.

I don't think I should have to change my OpenVPN server config, as it has worked until I did the pool upgrade on the Scale. The Core has a certificate installed, so there are no "additional machines". Am I understanding this correctly?

Your Core doesn't have a route to the Scale LAN (192.168.32.0/24).
That makes it impossible for Core to send packets via OVPN to ANY device on your 192.168.32.0/24 LAN (i.e. Scale or laptop).
Core will send packets destined for 192.168.32.0/24 to the Core def-gw.

I would suggest you add this line to server.conf … or whatever you have named your OVPN config file on the OpenVPN VM:

push "route 192.168.32.0 255.255.255.0"

That should tell the OpenVPN Server to “announce” the 192.168.32.0/24 net as being reachable via the OVPN Server.

After doing that, and restarting OVPN on the VM, you should on subsequent Core OpenVPN connects (any client's connect, really) see the 192.168.32.0/24 net in the client's routing table as being reachable via some 10.8.0.x IP address.
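A minimal sketch of that sequence, assuming the server runs as the stock openvpn@server systemd unit on the Ubuntu VM (adjust the unit name to however your install starts OpenVPN):

# on the OVPN VM, after editing server.conf (unit name is an assumption)
sudo systemctl restart openvpn@server

# on the Core client, once it has reconnected, confirm the pushed route arrived
netstat -rn | grep 192.168.32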

If you don't want to change any configs, you will not fix this.
Core needs to know the route back to 192.168.32.0/24.

OK, so I can't explain this.

I hopped onto my remote Core server and ran

traceroute 192.168.32.1

and saw that it took the internet route instead of using the OpenVPN one.

Then I went into my server.conf on my OpenVPN server and confirmed that I already had that line that pushes 192.168.32.0/24. It was already there; I hadn't added or changed anything.

Then I went back to my Core server and restarted the OpenVPN Client service.
Now it works.

I have no idea why, but I'm just glad order is restored.
Thank you so much for your help.

Oh, and yes. The routing table on Core looks like this now:

Internet:
Destination        Gateway            Flags     Netif Expire
0.0.0.0/1          10.8.0.1           UGS        tun0
default            minrouter.home     UGS         em0
10.8.0.0/24        10.8.0.1           UGS        tun0
10.8.0.1           link#4             UH         tun0
10.8.0.20          link#4             UHS         lo0
90.224.50.99/32    minrouter.home     UGS         em0
localhost          lo0                UHS         lo0
128.0.0.0/1        10.8.0.1           UGS        tun0
192.168.0.0/24     link#1             U           em0
192.168.0.20       link#1             UHS         lo0
192.168.32.0/24    10.8.0.1           UGS        tun0

Glad it worked for you :+1:

As an afterthought … I would have expected that restarting OpenVPN on both ends had already been done, especially the client end, since the server end was modified.

My bad that I didn't suggest that.

Well, now you have an idea of how to debug a routing issue on Scale, the OVPN VM, and Core.

Btw: you could mark this one as solved.