Very slow WebUI - Login, Apps, etc. SCALE Dragonfish RC

Jorsher · April 9, 2024, 4:25pm

Forums have changed since last time I was here…

I’ve installed Dragonfish RC fresh (not an upgrade). I’m currently running 38 containers. Now that I have everything set up how I’d like it, the TrueNAS GUI has become unusually slow. Logging into the GUI takes a minute or two of spinning before I get to the dashboard. When the dashboard finally loads and I try to go to Apps, it occasionally loads in less than a minute, while other times it spins for a minute or two before giving an error. I accidentally overwrote the trace that was on the clipboard, but will update when I get it again. Sometimes things load for so long that the session is discarded and I have to log in again.

When I check htop, I don’t see anything that immediately strikes me as a problem. kuberouter/kubeserver are sometimes the top CPU users, but that’s expected. There are plenty of CPU/RAM resources available when the GUI is slow. Even once I stop all the containers, the GUI is slow. I’ve noticed ‘asyncio_loop/middlewared’ is usually the top process at 100% CPU when I’m experiencing this. Sometimes I get random alerts like ‘Maximum number of concurrent calls (20) has exceeded’ and ‘Failed to check for alert Quota: Failed connection handshake’ and ‘Failed to check scrub paused status’ (something like that).

I’m confident it’s not a hardware limitation and I don’t remember these issues with Cobia. Any suggestions on where to look to help narrow down the problem?

awalkerix · April 9, 2024, 4:48pm

How many authenticated middleware sessions do you have? Check with midclt call auth.sessions | jq

What is on the middleware thread stacks when it is hanging? midclt call core.threads_stacks | jq
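
If the jq output is too long to eyeball, a rough Python sketch like the one below could summarize it. It assumes `auth.sessions` prints a JSON list and `core.threads_stacks` prints a JSON object mapping thread IDs to lists of formatted frame strings; the function names and the sample data are made up for illustration and are not part of the middleware.

```python
import json
from collections import Counter


def count_sessions(sessions_json: str) -> int:
    """Count entries in the JSON list printed by `midclt call auth.sessions`."""
    return len(json.loads(sessions_json))


def innermost_frames(stacks_json: str) -> Counter:
    """Group threads by the innermost (last) frame of their stack."""
    counts = Counter()
    for frames in json.loads(stacks_json).values():
        if not frames:
            counts["<empty stack>"] += 1
            continue
        # A formatted frame looks like 'File "...", line N, in func\n    source\n';
        # keep only its first line as the grouping key.
        counts[frames[-1].strip().splitlines()[0]] += 1
    return counts


# Made-up sample in the assumed shape, not real output from any system.
SAMPLE_STACKS = json.dumps({
    "1": ['  File "thread.py", line 81, in _worker\n    work_item = work_queue.get(block=True)\n'],
    "2": ['  File "thread.py", line 81, in _worker\n    work_item = work_queue.get(block=True)\n'],
    "3": ['  File "pool.py", line 185, in pool_extend\n    pool["is_upgraded"] = ...\n'],
})

for frame, n in innermost_frames(SAMPLE_STACKS).most_common():
    print(n, frame)
```

Threads whose innermost frame is `work_queue.get(block=True)` are idle pool workers waiting for a job, so the entries worth reading are usually the few that are parked somewhere else.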

Jorsher · April 9, 2024, 4:54pm

Thanks.

midclt call auth.sessions | jq shows 22 sessions

midclt call core.threads_stacks | jq shows 46, mostly:

"140288890742464": [
"  File \"/usr/lib/python3.11/threading.py\", line 995, in _bootstrap\n    self._bootstrap_inner()\n",
"  File \"/usr/lib/python3.11/threading.py\", line 1038, in _bootstrap_inner\n    self.run()\n",
"  File \"/usr/lib/python3.11/threading.py\", line 975, in run\n    self._target(*self._args, **self._kwargs)\n",
"  File \"/usr/lib/python3.11/concurrent/futures/thread.py\", line 81, in _worker\n    work_item = work_queue.get(block=True)\n"

I should add that I ‘stopped’ all containers, which certainly helps, but still not running as smoothly as if it was a fresh reboot without any containers running.

Just got this random alert:
“CRITICAL: Failed to check for alert HadUpdate:”

Just tried to load the Apps page and got:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File “/usr/lib/python3.11/concurrent/futures/process.py”, line 256, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/worker.py”, line 112, in main_worker
res = MIDDLEWARE._run(*call_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/worker.py”, line 46, in _run
return self._call(name, serviceobj, methodobj, args, job=job)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/worker.py”, line 34, in _call
with Client(f’ws+unix://{MIDDLEWARE_RUN_DIR}/middlewared-internal.sock’, py_exceptions=True) as c:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 296, in __init__
raise ClientException(‘Failed connection handshake’)
middlewared.client.client.ClientException: Failed connection handshake
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 198, in call_method
result = await self.middleware.call_with_audit(message[‘method’], serviceobj, methodobj, params, self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1466, in call_with_audit
result = await self._call(method, serviceobj, methodobj, params, app=app,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1417, in _call
return await methodobj(*prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 47, in nf
res = await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 187, in nf
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/plugins/chart_releases_linux/chart_release.py”, line 115, in query
catalogs = await self.middleware.call(‘catalog.query’, [], {‘extra’: {‘item_details’: True}})
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1417, in _call
return await methodobj(*prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 47, in nf
res = await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 187, in nf
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/service/crud_service.py”, line 166, in query
result = await self.middleware.call(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1417, in _call
return await methodobj(*prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 187, in nf
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/plugins/datastore/read.py”, line 149, in query
result = await self._queryset_serialize(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/plugins/datastore/read.py”, line 197, in _queryset_serialize
extend_context_value = await self.middleware.call(extend_context, rows, extra_options)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1417, in _call
return await methodobj(*prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/plugins/catalogs_linux/update.py”, line 73, in catalog_extend_context
catalogs_ds = await self.middleware.call(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1425, in _call
return await self._call_worker(name, *prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1431, in _call_worker
return await self.run_in_proc(main_worker, name, args, job)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1337, in run_in_proc
return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1321, in run_in_executor
return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
middlewared.client.client.ClientException: Failed connection handshake

Refresh and I’m back to the login screen, although it hasn’t been long enough to expire. Log in and the Apps page loads reasonably fast.

Just tried to load “Storage” page and get a similar error:

Error: concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File “/usr/lib/python3.11/concurrent/futures/process.py”, line 256, in _process_worker
r = call_item.fn(*call_item.args, **call_item.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/worker.py”, line 112, in main_worker
res = MIDDLEWARE._run(*call_args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/worker.py”, line 46, in _run
return self._call(name, serviceobj, methodobj, args, job=job)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/worker.py”, line 34, in _call
with Client(f’ws+unix://{MIDDLEWARE_RUN_DIR}/middlewared-internal.sock’, py_exceptions=True) as c:
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 296, in __init__
raise ClientException(‘Failed connection handshake’)
middlewared.client.client.ClientException: Failed connection handshake
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 198, in call_method
result = await self.middleware.call_with_audit(message[‘method’], serviceobj, methodobj, params, self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1466, in call_with_audit
result = await self._call(method, serviceobj, methodobj, params, app=app,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1417, in _call
return await methodobj(*prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 47, in nf
res = await f(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 187, in nf
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/service/crud_service.py”, line 166, in query
result = await self.middleware.call(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1417, in _call
return await methodobj(*prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/schema/processor.py”, line 187, in nf
return await func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/plugins/datastore/read.py”, line 149, in query
result = await self._queryset_serialize(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/plugins/datastore/read.py”, line 201, in _queryset_serialize
return [
^
File “/usr/lib/python3/dist-packages/middlewared/plugins/datastore/read.py”, line 202, in <listcomp>
await self._extend(data, extend, extend_context, extend_context_value, select)
File “/usr/lib/python3/dist-packages/middlewared/plugins/datastore/read.py”, line 215, in _extend
data = await self.middleware.call(extend, data, extend_context_value)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1428, in _call
return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1321, in run_in_executor
return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3.11/concurrent/futures/thread.py”, line 58, in run
result = self.fn(*self.args, **self.kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/middlewared/plugins/pool/pool.py", line 185, in pool_extend
pool[‘is_upgraded’] = self.middleware.call_sync(‘pool.is_upgraded_by_name’, pool[‘name’])
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1586, in call_sync
return self.run_coroutine(methodobj(*prepared_call.args))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1626, in run_coroutine
return fut.result()
^^^^^^^^^^^^
File “/usr/lib/python3.11/concurrent/futures/_base.py”, line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3.11/concurrent/futures/_base.py”, line 401, in __get_result
raise self._exception
File "/usr/lib/python3/dist-packages/middlewared/plugins/pool/info.py", line 167, in is_upgraded_by_name
return await self.middleware.call(‘zfs.pool.is_upgraded’, name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1564, in call
return await self._call(
^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1425, in _call
return await self._call_worker(name, *prepared_call.args)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1431, in _call_worker
return await self.run_in_proc(main_worker, name, args, job)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1337, in run_in_proc
return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/main.py”, line 1321, in run_in_executor
return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
middlewared.client.client.ClientException: Failed connection handshake

Tried restarting middlewared

root@truenas:/mnt/home# service middlewared stop
2024 Apr 9 20:17:20 truenas Warning via /usr/local/libexec/smart_alert.py to root produced unexpected output (434 bytes) to STDOUT/STDERR:
2024 Apr 9 20:17:20 truenas Traceback (most recent call last):
2024 Apr 9 20:17:20 truenas File “/usr/local/libexec/smart_alert.py”, line 30, in <module>
2024 Apr 9 20:17:20 truenas main()
2024 Apr 9 20:17:20 truenas File “/usr/local/libexec/smart_alert.py”, line 20, in main
2024 Apr 9 20:17:20 truenas with Client() as c:
2024 Apr 9 20:17:20 truenas ^^^^^^^^
2024 Apr 9 20:17:20 truenas File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 296, in __init__
2024 Apr 9 20:17:20 truenas raise ClientException(‘Failed connection handshake’)
2024 Apr 9 20:17:20 truenas middlewared.client.client.ClientException: Failed connection handshake
2024 Apr 9 20:17:20 truenas Warning via /usr/local/libexec/smart_alert.py to root: failed (32-bit/8-bit exit status: 256/1)
root@truenas:/mnt/home# service middlewared start

Things improved, but still not “normal.”

Even though I stopped all containers, there were a ton of threads for k3s-related processes. “service k3s stop” seemed to help.

Will reboot in a moment

Curious if it can be some sort of fileopen/socket issue? The containers I run create a large number of connections and open a large number of files. Just spit-ballin… Seems to be something related to the containers + middlewared
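
One way to test the open-file/socket guess is to compare how many descriptors a process holds against its soft limit. The sketch below reads the standard Linux `/proc/<pid>/fd` and `/proc/<pid>/limits` files; the function names are hypothetical, the PID of middlewared has to be looked up separately, and reading another user's process needs root.

```python
import os
from pathlib import Path


def fd_summary(pid: int, proc_root: str = "/proc") -> tuple:
    """Return (total open descriptors, how many of them are sockets) for a PID."""
    fd_dir = Path(proc_root) / str(pid) / "fd"
    total = sockets = 0
    for name in os.listdir(fd_dir):
        total += 1
        try:
            target = os.readlink(fd_dir / name)
        except OSError:
            continue  # descriptor was closed while we were scanning
        if target.startswith("socket:"):
            sockets += 1
    return total, sockets


def soft_nofile_limit(pid: int, proc_root: str = "/proc"):
    """Soft 'Max open files' limit from /proc/<pid>/limits (None if unlimited)."""
    limits = (Path(proc_root) / str(pid) / "limits").read_text()
    for line in limits.splitlines():
        if line.startswith("Max open files"):
            soft = line.split()[3]  # columns: name (3 words), soft, hard, units
            return None if soft == "unlimited" else int(soft)
    return None


# Demo on this script's own PID; only runs where a Linux-style /proc exists.
if os.path.isdir("/proc/self/fd"):
    me = os.getpid()
    print("open fds, sockets:", fd_summary(me))
    print("soft nofile limit:", soft_nofile_limit(me))
```

If the total stays far below the soft limit while the GUI is hanging, descriptor exhaustion in that process is probably not the cause and the limit to look at is elsewhere.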

After a reboot, everything is snappy. In 1-2 days, the GUI will likely be pretty bad again.

Er, after a few hours… Slow again.

Failed to check for alert ScrubPaused: Failed connection

Jorsher · April 10, 2024, 5:30pm

Some sort of resource exhaustion.

root@truenas:~# midclt call auth.sessions | jq
Traceback (most recent call last):
File “/usr/bin/midclt”, line 12, in <module>
sys.exit(main())
^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 616, in main
with Client(uri=args.uri) as c:
^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 296, in __init__
raise ClientException(‘Failed connection handshake’)
middlewared.client.client.ClientException: Failed connection handshake
root@truenas:~# midclt call core.threads_stacks | jq
Traceback (most recent call last):
File “/usr/bin/midclt”, line 12, in <module>
sys.exit(main())
^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 616, in main
with Client(uri=args.uri) as c:
^^^^^^^^^^^^^^^^^^^^
File “/usr/lib/python3/dist-packages/middlewared/client/client.py”, line 296, in __init__
raise ClientException(‘Failed connection handshake’)
middlewared.client.client.ClientException: Failed connection handshake

If I’m lucky, I can get to the GUI login, but never beyond it. SSH works, fortunately, but will need a reboot again…

nickspacemonkey · April 11, 2024, 1:49pm

This happens to me occasionally. I’ve found that restarting middleware over SSH usually resolves everything for me.

systemctl restart middlewared.service

Jorsher · April 11, 2024, 7:01pm

Will try next time. Right now it seems to be ‘good.’ There was a lot more network activity with the containers when it was performing poorly. Perhaps sockets/ports were exhausted and led to the ‘unable to connect/complete handshake’ errors I was getting from TrueNAS?

Unsure if this is related, but I was trying to set up an nspawn container and it was failing to start. journalctl was showing inotify had too many open files. max_files was already a huge number, so I tried increasing max_user_watches and it didn’t help. When I increased “fs.inotify.max_user_instances” from the default 128 to 256, the container was able to start.
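
To see how close the system is to the inotify limits, a sketch along these lines could help. It reads the sysctl values from `/proc/sys/fs/inotify` and counts descriptors whose link target is `anon_inode:inotify`, which is how inotify instances show up under `/proc/<pid>/fd` on Linux. The function names are made up, and it needs root to see other users' processes.

```python
import os
from collections import Counter
from pathlib import Path


def inotify_limits(proc_root: str = "/proc") -> dict:
    """Current fs.inotify.* sysctl values, keyed by file name."""
    base = Path(proc_root) / "sys" / "fs" / "inotify"
    return {p.name: int(p.read_text()) for p in sorted(base.iterdir()) if p.is_file()}


def inotify_instances_by_pid(proc_root: str = "/proc") -> Counter:
    """Count inotify instances (one per inotify descriptor) held by each PID."""
    counts = Counter()
    for entry in Path(proc_root).iterdir():
        if not entry.name.isdigit():
            continue  # not a process directory
        try:
            fds = list((entry / "fd").iterdir())
        except OSError:
            continue  # process exited, or we lack permission
        for fd in fds:
            try:
                if os.readlink(fd) == "anon_inode:inotify":
                    counts[int(entry.name)] += 1
            except OSError:
                continue  # descriptor closed while scanning
    return counts


# Demo; only runs where the inotify sysctl directory exists.
if os.path.isdir("/proc/sys/fs/inotify"):
    print(inotify_limits())
    for pid, n in inotify_instances_by_pid().most_common(10):
        print(pid, n)
```

Note that `max_user_instances` is enforced per user rather than per process, so the number to compare against it is the sum over all processes owned by the same user.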

nickspacemonkey · April 12, 2024, 9:43am

Not really sure honestly. I have only 3 nspawn containers running, so not nearly as many as you.

Jorsher · April 18, 2024, 3:28pm

I had to raise fs.inotify.max_user_instances from 128 to avoid some issues, but in the end I still run into problems with the middleware. At some point the web interface won’t load, and when I try to restart the service:

Apr 18 18:18:18 truenas systemd[1]: middlewared.service: Found left-over process 2243022 (asyncio_loop) in control group while starting unit. Ignoring.
Apr 18 18:18:18 truenas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Apr 18 18:18:18 truenas systemd[1]: middlewared.service: Found left-over process 2243044 (python3) in control group while starting unit. Ignoring.
Apr 18 18:18:18 truenas systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Apr 18 18:18:18 truenas systemd[1]: middlewared.service: Will not start SendSIGKILL=no service of type KillMode=control-group or mixed while processes exist
Apr 18 18:18:18 truenas systemd[1]: middlewared.service: Failed to run ‘start’ task: Device or resource busy
Apr 18 18:18:18 truenas systemd[1]: Starting middlewared.service - TrueNAS Middleware…
░░ Subject: A start job for unit middlewared.service has begun execution
░░ Defined-By: systemd
░░ Support:
░░
░░ A start job for unit middlewared.service has begun execution.
░░
░░ The job identifier is 55899.
Apr 18 18:19:48 truenas systemd[1]: middlewared.service: State ‘stop-sigterm’ timed out. Skipping SIGKILL.
Apr 18 18:21:18 truenas systemd[1]: middlewared.service: State ‘final-sigterm’ timed out. Skipping SIGKILL. Entering failed mode.
Apr 18 18:21:18 truenas systemd[1]: middlewared.service: Failed with result ‘resources’.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support:
░░
░░ The unit middlewared.service has entered the ‘failed’ state with result ‘resources’.
Apr 18 18:21:18 truenas systemd[1]: middlewared.service: Unit process 2243022 (asyncio_loop) remains running after unit stopped.
Apr 18 18:21:18 truenas systemd[1]: middlewared.service: Unit process 2243044 (python3) remains running after unit stopped.
Apr 18 18:21:18 truenas systemd[1]: Failed to start middlewared.service - TrueNAS Middleware.
░░ Subject: A start job for unit middlewared.service has failed
░░ Defined-By: systemd
░░ Support:
░░
░░ A start job for unit middlewared.service has finished with a failure.
░░
░░ The job identifier is 55899 and the job result is failed.
Apr 18 18:21:18 truenas systemd[1]: middlewared.service: Consumed 12h 7min 9.128s CPU time.
░░ Subject: Resources consumed by unit runtime
░░ Defined-By: systemd
░░ Support:
░░
░░ The unit middlewared.service completed and consumed the indicated resources.

Unsure what resource limitation it’s running into. There’s an excessive amount of CPU and RAM available, so it must be another ‘soft’ limit.

Jorsher · April 21, 2024, 8:14am

For some reason, I haven’t experienced this issue for the last couple days (2 days, 16 hours). I’m not sure what triggered the issue to start, or what caused it to go away.

At the time I was experiencing the problems, I was moving 170TB of small files between pools with rsync. The rsync task moved millions of files and directories. Around 2-3 days ago, I started another rsync job with a similar amount of data, but the files are much much larger and fewer in number.

Wonder if the large number of files was the cause, and perhaps that’s why nobody else seems to have encountered it. Some of the errors were related to inotify watches.

Jorsher · April 21, 2024, 7:14pm

Started up a new rsync job from CLI and … well the web ui is inaccessible again.

So, I guess this is resolved in that I know what causes the issue, but not sure what to do about it beyond waiting for rsync to finish.

Stux · April 22, 2024, 12:05am

Have a look at top in the shell while it’s occurring

Are you swapping? Is memory depleted?

nickspacemonkey · April 22, 2024, 2:44am

For sure I’m not swapping. There is a bug of some kind.

cmplieger · April 24, 2024, 12:18pm

Have the same issue: after 30 min the web UI and SSH slow to a crawl. I destroyed my app pool and recreated it, and the issue persists. Think I need to re-install…

dasunsrule32 · April 24, 2024, 3:04pm

I’m not seeing the UI lock up and become inaccessible, but I’m seeing high CPU usage. Restarting the middleware works around it, but until I do, the load makes my CPU run hot.

Going to submit a report today.

dasunsrule32 · April 24, 2024, 4:50pm

Ok, so far what I’m seeing is that the /ui/apps page now has live CPU, memory and network metrics. When I leave that page open in the UI, it creates a TON of CPU usage. If I close it, it stops.

It also happens on any page under /ui/apps. When I go to Discover Apps, even though the metrics are not displayed, the CPU is still spun up.

It would be nice to have a toggle to disable that or set the update period for it. It’s probably set too aggressively.

dasunsrule32 · April 24, 2024, 5:27pm

I opened a ticket for this issue.

mooglestiltzkin · April 24, 2024, 6:53pm

i also noticed some weird slowdowns in sign in and the truenas UI.

it’s random. sometimes it happens, sometimes it doesn’t.

no idea why, but i did note the similarities to what Jorsher describes. mine is also a fresh install, though i did upgrade to the recent version since RC1.

nickspacemonkey · April 25, 2024, 6:34pm

Things have improved massively since the release of Dragonfish for me. I got some slowdowns while rsyncing my music library, but at least I didn’t need to restart middlewared.

rhkenji · April 29, 2024, 11:19pm

I just upgraded to Dragonfish and I have the same issue. On top of it, I am also getting alerts like:

Failed to check for alert Quota: Failed connection handshake

dasunsrule32 · April 30, 2024, 10:33pm

Nah, you likely just aren’t on the /ui/apps page. That is where most of the high CPU usage happens. I think they are polling k3s metrics every second on that page, which is causing some of the CPU spikes. Regardless, it’s been confirmed in my ticket above and looks to be targeted for a fix in an upcoming point release for Dragonfish. I also see it on the CPU metrics page when I have auto refresh enabled.