New setup / installation (about 2 weeks now)
Dragonfish-24.04.2.2 (SCALE)
AMD Ryzen 7 PRO 8700G
ASUS ROG Strix B650E-I Gaming WiFi
2xKSM56E46BD8KM-48HM
6xWD SA500 2TB pool (RAIDZ2)
1xWD SA500 1TB boot
1xHBA 9500-8i Tri-Mode (IT mode)
Corsair 750W PSU
I hate to say “randomly”, but I can’t pin down a pattern: from time to time the system hangs. It doesn’t respond to ping (even though the LEDs on the NIC still blink), and the only way to get it working again is a hard reset. There’s no picture on the console.
I’ve had this issue since the beginning. After testing the hardware, I installed PhotoPrism (the only app I’m using).
I would say the app is the problem, because the same behaviour happened when I was indexing my pictures. Interestingly enough, the CPU never went over 35% and RAM usage was around 3GB. Right now PhotoPrism has indexed everything and is doing nothing, but the issue still persists.
I can’t find any log entries from before the hard reset, and the only thing I can tell you is that I’m 99% sure it is not a temperature issue. I ran memtest for 24h and prime95 for 10h, and the max temp was around 75°C on the CPU (power limited to 65W).
Another thing to mention: it seems I cannot fully shut down the system. If I click on shutdown, the system shuts down but the fans keep spinning. No idea if this is related or not.
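Since nothing useful gets logged before the hard reset, one cheap way to at least pin down when the box froze is a small heartbeat script that appends a timestamp to a file every few seconds and forces it to disk. This is only a minimal sketch: the function names, the file path in the usage note and the 10 second interval are my own assumptions, not anything TrueNAS or PhotoPrism provides.

```python
import os
import time
from datetime import datetime, timezone
from typing import Optional


def write_heartbeat(path: str, now: datetime) -> None:
    """Append one ISO-8601 timestamp and force it onto disk."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(now.isoformat() + "\n")
        fh.flush()
        os.fsync(fh.fileno())  # so the line survives a hard reset


def last_heartbeat(path: str) -> Optional[datetime]:
    """Return the most recent timestamp in the file, or None if there is none."""
    if not os.path.exists(path):
        return None
    last = None
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                last = datetime.fromisoformat(line)
    return last


def run(path: str, interval_s: float = 10.0) -> None:
    """Write a heartbeat forever; stop it with Ctrl+C."""
    while True:
        write_heartbeat(path, datetime.now(timezone.utc))
        time.sleep(interval_s)
```

To use it, call `run("/mnt/tank/heartbeat.log")` (a hypothetical path on the pool) in a shell session before starting a re-index. After the reset, `last_heartbeat(...)` gives the last moment the system was still alive, which can then be compared with the PhotoPrism debug log and /var/log/messages.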
IMHO I don’t think PhotoPrism itself is the issue; most probably it is just affected by it… But if you want to be 100% sure, just stop it for a day and see what happens.
Thanks for your reply. I’m running the latest available BIOS from ASUS and I have no power saving features active.
The problem is, the system was running fine for 3 days (after the indexing run that crashed) and today it suddenly crashed. I can stop the app, but I’m not sure that would be a conclusive test.
I can reproduce the error by running a full re-index of PhotoPrism.
Take this with a grain of salt, but I didn’t find logs directly in my PhotoPrism datasets (I probably should have mapped them outside my container)… But you can surely see something via its GUI, or somehow via the charts directly (sorry, I’m a Core user and don’t know how Scale works there).
Probably you should set the log level to debug to see anything interesting:
Regarding the TN log, via the shell it should be: more /var/log/messages
So apparently your system hangs on reboot, but is rock solid when testing RAM/CPU… Maybe it is worth removing as many parts as you can and checking whether something strange happens (in my past experience, I had an issue with NIC drivers that caused random reboots, so at this point it’s better not to overlook anything).
Well, the debug log stops recording right before the crash. It stops at random pictures, so a specific picture is not the cause. Interestingly, I just ran another indexing (not full) and it worked… No crash so far.
/var/log/messages is basically empty; it stops right before the crash, and the last message was always random and purely informational.
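When the log simply stops before each crash, it can still help to list the silent gaps, because each gap marks one hang and the line just before it is the last thing the system managed to write. Below is a rough sketch, assuming the lines start with a classic syslog timestamp such as "Oct  1 12:00:30" (which has no year, so the year is passed in by hand). The function names and the 10 minute threshold are only illustrative.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Optional, Tuple


def parse_stamp(line: str, year: int) -> Optional[datetime]:
    """Parse the leading 'Mon DD HH:MM:SS' of a syslog line, or return None."""
    try:
        return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
    except ValueError:
        return None  # not a timestamped line (assumed format did not match)


def find_gaps(
    lines: Iterable[str], year: int, threshold: timedelta
) -> List[Tuple[datetime, datetime]]:
    """Return (last_before, first_after) pairs where the log was silent too long."""
    gaps = []
    prev = None
    for line in lines:
        stamp = parse_stamp(line, year)
        if stamp is None:
            continue
        if prev is not None and stamp - prev > threshold:
            gaps.append((prev, stamp))
        prev = stamp
    return gaps
```

For example, `find_gaps(open("/var/log/messages"), 2024, timedelta(minutes=10))` would list every silence longer than ten minutes. On a busy system the threshold may need tuning, and a quiet idle period looks the same as a hang, so the results are only a hint.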
It’s very hard for me to figure out where the PhotoPrism logs are, and also the container config file…
So the system has 96GB of RAM and 16GB is assigned to the app, but usage never gets to that level. I would like to change the MariaDB cache (I made another post here about it) but I can’t find the config file.
Still, the fact that I can’t shut down the system is very strange to me.
Does this happen only with TN? Or does the system shut down gracefully with another OS?
If in doubt, just install some light OS on a USB stick and try (like Alpine Linux).
I got the same symptom on an old mobo. It is not a good sign if it happens with every OS; otherwise something is wrong with your TN install (and a fresh install + config upload can be a thing to try).
Sad… without logs you can only troubleshoot by trial and error, and testing system stability with as few parts as possible is the only way I know… Or maybe submit a support ticket with the debug file.
Yep, on both Windows and Linux it works just fine. I will probably reset the BIOS just to be sure and rerun memtest and prime95 once again. It costs nothing.
That it shuts down on Linux doesn’t necessarily mean much; sometimes the exact kernel version is the issue. Generally, it’s a BIOS or hardware issue that causes shutdown problems. That it does shut down but your fans keep spinning is really suspect.
I’ve seen all sorts of issues with gaming machines and Linux over the years. I’ve had several myself. Just various odd incompatibilities, especially with shutdown. It can depend on the BIOS, the motherboard itself, maybe the Linux kernel version or what is included in it, etc. I doubt consumer hardware like gaming motherboards gets extensive Linux testing. I’m sure it gets extensive Windows testing.
I do not know what to test or do. Just giving my observations. I also seriously doubt PhotoPrism is causing your issues. My guess is merely BIOS/drivers. But it’s just a best guess. Maybe someone else has better ideas about a course of action.
So here’s a small update. After running memtest and prime95 (all good) I did the most intelligent thing you can do when you have no clue about the problem: I upgraded to the release candidate, ElectricEel-24.10-RC.1.
Deleted the app and set up Docker with PhotoPrism.
3 small folders indexed perfectly. I then decided to index my entire collection and it crashed. Same behavior.
I currently have no time to test a couple more things; I will do so in the next few days.