TCP connection to Oracle custom app keeps dropping on my LAN. Why?

I just installed Oracle XE on my TrueNAS server. The only odd thing is the connection between SQL Developer and my app keeps closing after around 30 minutes of idle time. Oracle doesn’t do it (I checked the configs) and SQL Developer has no TCP timeout.

My connection to Oracle XE on another server (not TrueNAS) stays up.

So my guess is that k3s is closing idle connections. There is no setting in the TrueNAS apps advanced settings that controls this.

I could install a keepalive into SQL Developer, but I’m guessing this might be a generic problem.

Am I right that this is a k3s issue, or is there some other setting in TrueNAS SCALE that governs long-lasting TCP connections?

First I’ve seen this issue. TrueNAS just uses Debian defaults AFAIK.

The “application” is supposed to request TCP keepalives.
https://tldp.org/HOWTO/TCP-Keepalive-HOWTO/usingkeepalive.html

Unless someone else is experiencing the same issue or knows the right config fix, I’d suggest installing the keepalive into SQL Developer and reporting back.
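
For anyone wondering what "the application requests keepalives" looks like in practice, here is a minimal sketch in Python. SQL Developer is a Java application, so this only illustrates the socket-level mechanism, not what its keepalive add-ons do. The function name `enable_keepalive` and the timer values (600/60/5) are assumptions picked for illustration; the only real requirement is that the first probe goes out before whatever idle timeout is cutting the connection.

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, count=5):
    """Turn on TCP keepalive for sock and, where supported, tighten its timers.

    idle/interval/count are illustrative values, not recommendations.
    """
    # Without SO_KEEPALIVE the kernel never sends probes on this socket.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket overrides of the net.ipv4.tcp_keepalive_* sysctls.
    # These option names exist on Linux; skip any the platform lacks.
    for name, value in (
        ("TCP_KEEPIDLE", idle),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval),  # seconds between probes
        ("TCP_KEEPCNT", count),       # unanswered probes before giving up
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


if __name__ == "__main__":
    s = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
    print("keepalive on:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

With the system defaults the first probe would only go out after 2 hours of idle time, which is why a per-socket override (or an application-level ping) is needed to keep a connection alive through a shorter idle timeout.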

Wow thx. amazed at how far we have come regarding the tcp stack.

I can almost remember when tcp wrappers were the rage.

Is it good or bad?
Depends.

Here are the TrueNAS keepalive settings that determine when a connection is closed:

stk@truenas:~$ sudo sysctl net.ipv4.tcp_keepalive_probes
net.ipv4.tcp_keepalive_probes = 9
stk@truenas:~$ sudo sysctl net.ipv4.tcp_keepalive_intvl
net.ipv4.tcp_keepalive_intvl = 75
stk@truenas:~$ sudo sysctl net.ipv4.tcp_keepalive_time
net.ipv4.tcp_keepalive_time = 7200

So it will send the probes around once a minute and wait 2 hours for a response. After 9 consecutive fails, it will close the connection.

So if I’m reading it right, a connection will stay open for 2 hours and if nobody answers, it will close the connection.

Right?

This doesn’t explain who is closing the connection to my Oracle custom app on TrueNAS since it is closing way faster than 1 hour idle.

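Once a connection goes idle, nothing is sent for 7200 seconds (and only sockets that requested keepalive get probed at all).

After that the probes start, and the connection goes down once 9 probes, sent 75 seconds apart, all go unanswered… about 11 minutes.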
Any packet drops on the network?
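
To make the timing arithmetic explicit, here is a small sketch that plugs in the three sysctl values posted above. The function name `keepalive_timeline` is made up for this example; the defaults are the values from the `sysctl` output in this thread.

```python
def keepalive_timeline(time_s=7200, intvl_s=75, probes=9):
    """Return (idle seconds before first probe, seconds of failed probing, total)."""
    probing = intvl_s * probes
    return time_s, probing, time_s + probing


if __name__ == "__main__":
    first_probe, probing, total = keepalive_timeline()
    print(f"first probe after {first_probe} s idle ({first_probe / 3600:.0f} h)")
    print(f"unanswered probing lasts {probing} s ({probing / 60:.2f} min)")
    print(f"dead peer detected after {total} s ({total / 60:.2f} min)")
```

So with these defaults the kernel would need more than 2 hours to give up on a dead peer, which is nowhere near a drop at around 30 minutes of idle time.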

I’m using the same Oracle docker image on two systems and accessing both from SQL Developer.

The connection drops in <30 minutes to truenas, but doesn’t on the natively hosted docker image.

This can mean only one thing: k3s. Is it doing any connection management for traffic to pods?

You could try setting up the docker image in a Sandbox to see if the issue is k3s

Oh my gosh. That is a GREAT idea. I had no clue sandboxes even existed in Scale. THANK YOU! It takes only a few minutes to set this up.

One of the YouTube comments: “This might well be THE BEST tech tutorial I have ever used. Well done, and thank you so much!”

One problem… install in the video simply doesn’t work:

See video description.

Jlmkr 2.0 doesn’t have an install function. Make an alias instead.

Why is that the only possible reason?
You have two distinct systems, all that hardware is adding plenty of variables that could be different, even the cabling or possible switch could be a contributing factor.

OK, it’s the most likely reason, since my local system/network isn’t known to be dropping connections like that and it ONLY happens with this one application.

NO.

Bad cables never have the “same exact timing”!
Bad hardware would “add the same variables” to other connections and it would not be at exact 30 min, right?
Drop packets, unless severe, will not cause disconnects, that’s why the TCP protocol has checksums and resend.

This has to be a software setting somewhere to disconnect after exactly 30 min.

If you can provide the files, I could test/reproduce/check it as well.

Can you set up the environment under a REAL VM?
That would answer questions and the usual ignorant guesses.

I’m going to bring XE up in a jail on the same box and compare.

if the results are the same, then I’ll do the VM.
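
One way to compare the environments without involving Oracle or SQL Developer at all is a plain echo test: run a tiny echo server inside each environment, connect from the workstation, stay silent for a chosen time, then see whether a round trip still works. The sketch below is only that, a sketch; `start_echo_server` and `survives_idle` are hypothetical helper names, and it assumes you can run Python in both places and expose a port.

```python
import socket
import threading
import time


def start_echo_server(host="127.0.0.1", port=0):
    """Start a background echo server; return (listening socket, bound port)."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    srv.bind((host, port))
    srv.listen()

    def serve():
        while True:
            try:
                conn, _ = srv.accept()
            except OSError:
                return  # listening socket was closed
            with conn:
                while True:
                    try:
                        data = conn.recv(1024)
                    except OSError:
                        break
                    if not data:
                        break  # client closed the connection
                    conn.sendall(data)

    threading.Thread(target=serve, daemon=True).start()
    return srv, srv.getsockname()[1]


def survives_idle(host, port, idle_seconds, timeout=5.0):
    """Connect, stay silent for idle_seconds, then check that an echo still works."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        time.sleep(idle_seconds)
        try:
            sock.sendall(b"ping")
            return sock.recv(4) == b"ping"
        except OSError:  # reset, timeout, broken pipe
            return False


if __name__ == "__main__":
    # Local demo with a 1 second idle; a real test would use e.g. 1500-2100 s.
    server, port = start_echo_server()
    print(survives_idle("127.0.0.1", port, idle_seconds=1))
    server.close()
```

If the echo connection survives a long idle period in the jail but not in the k3s app, that points at the path into the pod rather than at Oracle or the client.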

  • There’s no need to shout, also kindly drop the dismissive and abrasive language intermingled in your post(s).
  • You’re misquoting me. Not cool.
  • The OP never said it was exactly 30 minutes.

Thanks for the primer on how TCP works.
My post wasn’t related to TCP though; it was made to point out that focusing on it being an issue with TrueNAS seems premature since there are more variables at play. It’s good to keep an open mind while troubleshooting.

The software stack goes deeper than TN and there can be misconfigurations at any level. It’s not always going to be down to something being physically broken, even though a faulty cable can play a part, on occasion.

OK, I tested it using 3 Oracle servers, including 2 on the TrueNAS box: one under k3s (using strictly the TrueNAS app interface) and the other under jailmaker.

The ONLY server closing the TCP connections within 30 minutes is the custom app I created from the official Oracle image. ALL servers used the official Oracle docker image. No customizations were made.

So this narrows it down. The jailmaker Oracle deploy works as expected; the TrueNAS app deploy, created using the Custom App button, closes the TCP connection prematurely.

This seems to confirm my suspicion that something in k3s networking is closing idle connections.

I’m not a k3s expert, but does this sound like a k3s configuration issue?

I suspect you are right. According to discussion here, there are some security issues with making changes.

Did you try the keepalive in SQL Developer?

If not, I’d suggest using the Sandbox until the Electric Eel Docker version is available.

There are keepalive add-ons for SQL Developer.

But I ended up just using a jail on TrueNAS SCALE running docker with an Oracle container and a Tailscale container.

Software culprit, as stated.