Random reboot on a new installation

Hi, I recently installed TrueNAS on bare metal, transitioning from Proxmox. Here are my system details:

System Details:
Version: TrueNAS Fangtooth beta 25.04-BETA.1
CPU: 6 x Intel(R) Core™ i5-9600K CPU @ 3.70GHz (1 Socket)
Motherboard: ASRock H370M-HDV
RAM: Corsair DDR4 Vengeance LPX 2x32GB 2666MHz (64GB total)
HBA: LSI SAS 9207-8i
HDD: 6 x WD DC HC520 12TB (RAIDZ2 in TrueNAS)
Power Supply: Corsair SF750
Network Adapter: Intel NIC

I observed a random reboot after 3 hours of operation, and nothing was logged in journalctl before it. After the reboot, however, journalctl did show these MCE errors:

mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 0: b200000000030005
mce: [Hardware Error]: TSC 0
mce: [Hardware Error]: PROCESSOR 0:906ed TIME 1741343606 SOCKET 0 APIC 6 microcode 100
microcode: Current revision: 0x000000100
microcode: Updated early from: 0x000000ea

Any idea on what this error is about and what component of my system is affected?
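For reference, the 64-bit word after "Bank 0:" in the Machine Check line is the bank's status register, and its top bits are flags that say how severe the error was. The following is a minimal sketch, assuming the architectural IA32_MCi_STATUS layout documented in the Intel SDM; the names FLAGS and decode_mci_status are illustrative and not part of any tool mentioned in this thread. It does not identify which component is at fault.

```python
# Bit positions of the architectural flags in IA32_MCi_STATUS
# (layout assumed from the Intel SDM machine-check chapter).
FLAGS = {
    "VAL": 63,    # register holds a valid error
    "OVER": 62,   # an earlier error was overwritten
    "UC": 61,     # error was not corrected by hardware
    "EN": 60,     # reporting of this error was enabled
    "MISCV": 59,  # the MISC register holds extra information
    "ADDRV": 58,  # the ADDR register holds a valid address
    "PCC": 57,    # processor context may be corrupted
}


def decode_mci_status(status: int) -> dict:
    """Split a machine-check status word into flags and error codes."""
    fields = {name: bool((status >> bit) & 1) for name, bit in FLAGS.items()}
    fields["mca_error_code"] = status & 0xFFFF               # bits 15:0
    fields["model_specific_code"] = (status >> 16) & 0xFFFF  # bits 31:16
    return fields


# Status word copied from the log line above.
first = decode_mci_status(0xB200000000030005)
for name, value in first.items():
    print(name, hex(value) if isinstance(value, int) and not isinstance(value, bool) else value)
```

For this value the sketch reports VAL, UC, EN and PCC set, ADDRV clear, an MCA error code of 0x5 and a model-specific code of 0x3. An uncorrected error with PCC set is one the kernel generally cannot recover from, which would fit a reset with nothing written to the journal beforehand.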

I also see these error messages in journalctl.

1. x86/cpu: SGX disabled or unsupported by BIOS
2. ipa-epn.timer: Refusing to start, unit ipa-epn.service to trigger not loaded.
failed to start ipa-epn.timer - Execute IPA Expiring Password Notification (EPN) every day at 1AM.
3. mpt2sas_cm0: overriding NVDATA EEDPTagMode setting

Run Memtest86 or Memtest86+ for at least one complete pass. In your situation I would let it run for a few days to make the testing extra thorough.

Then I would run a CPU stress test like Prime95 or similar for probably 6 hours to stress the CPU. If it gets hot, it has thermal throttling to protect itself.

I can’t really help with the second set of messages you posted, but let’s work on getting a stable system first.

And a question:
You said you were running Proxmox. Can you be a bit more descriptive? How long were you running it, was TrueNAS being run on top of it, and have you tried TrueNAS 24.10.2 yet to see if it is stable? Remember that 25.04-BETA.1 is a beta and the reboots could be caused by a bug, so you need to validate against known working code. BUT first, run those tests.

Edit: Just checked, your RAM is not ECC. Just a comment.


Thank you for the reply. Sure, I will run these tests and update here soon.

As for Proxmox, I was running TrueNAS as a VM in Proxmox, and I have now switched to TrueNAS on bare metal, as support for LXC is coming.

I have installed 24.10.2 but immediately upgraded to 25.04 Beta.

Edit: I did run Memtest86+ for more than 12 hours (at least 8 passes) a month ago and it passed. I will do it again.

Also, I ran stress-ng on all CPU cores in a container for 8 hours; it passed without any crashes.

Update -

  1. I ran stress-ng for 8 hours, and there were no crashes in that period.
  2. Memtest86 has been running for 10 hours and has completed two passes with no errors.

After the stress-ng test, I have encountered two more random reboots.


Was that while running TrueNAS 25.04 Beta.1? If yes, I recommend you try 24.10.2 to see how it operates.

Just keep your eyes open for opportunities to change something and test. Only change one thing and then test, make sure that was not the problem before making another change.

Take your time troubleshooting so you don’t skip anything. And 25.04 RC.1 should be released in a few days (Tuesday). But first, I strongly recommend testing 24.10.2.

Good luck.

Thank you. I will first revert to 24.10.2 and see whether the random reboots still occur. Also, can you suggest which BIOS settings I shouldn’t be using?

Also, when I ran TrueNAS as a VM in Proxmox, I used to have these random reboots as well (of the entire Proxmox host), but only when the TrueNAS VM was started. I couldn’t figure out why, decided it would be better to install TrueNAS on bare metal, and so made the transition.

First, ensure your BIOS is up to date.
Next, reset the BIOS to factory defaults, unless there is something specific you must configure. If there is, ask about it to see if it matters.
Third, disable your onboard NIC. It is a Realtek and is apt to give you problems; even if you are not using it, TrueNAS will see it.
Fourth, do not use Fangtooth Beta. I have found problems, many people have found problems, and that is normal for a beta. RC1 is coming out; however, I highly recommend you use only 24.10.2 until you are confident the reboot issue no longer occurs.
Fifth, ensure you are not using any form of RAID using the motherboard/BIOS. You will know if you are: you have to set it up, and it is very intentional.
Sixth, your HBA: ensure it has GREAT cooling! These things are really designed for a high-airflow case, and if you have poor airflow, you are probably overheating the HBA. And this could be causing your reboots as well.

And this is a big thing you left out: that the entire Proxmox host also rebooted randomly, but only when the TrueNAS VM was started.

It is a very important piece of information. I’m thinking HBA cooling off the top of my head. Think about what is the difference between using TrueNAS and not using TrueNAS. And the onboard NIC, if you were passing that to TrueNAS as well.

Thank you for all the suggestions. This is what I have tried:

  1. Reset BIOS settings to default
  2. Disabled Onboard LAN
  3. Changed back to 24.10.2

And even with this, I still observe reboots.

I do not have a cooling fan for HBA yet, which is next on the list to try. I will update you here.

Edit: Is there a way to stress a drive to trigger the reboot, just to confirm it’s the HBA? I have ordered the cooling fan, but it may take a while.

You can run a scrub; that puts a lot of stress on the HBA.

Yep. Scrub triggered the restart now. Scrub was 25% in and the system rebooted.

Put a fan on the HBA. Just direct some household fan to blow air across the HBA; it is a cheap and fast way to verify whether it is a cooling issue. If you have a dedicated fan you could install, even better, but these HBAs are designed for use in a high-airflow case. You know, the screaming demon fans from hell in a server room. Loud!

If you need some suggestions, snap a few photos of the case and the HBA. I make fan shrouds for my computer fans, the ones I add when I customize a case. The 3D printer is amazing! Before that I would use the cardboard from a cereal box and some tape to build a fan shroud. They actually looked good too and very functional. That may be something you need to do.

Thank you for the suggestions. I am running this setup in a Fractal Node 804 and have ordered a Noctua NF-A4x10 to install on the HBA.

I did some more testing.

  1. I have disabled an app (paperless-ngx) that downloads a lot of stuff and writes it to the hard disk. With this disabled, the system has now been online for 12 hours. I then performed a scrub, but that did not result in a reboot.
  2. After 12 hours, I enabled that app, and a random reboot happened.

I am a bit confused now. Is it a network issue that is causing the reboot, or is it the HBA heating up during writes?

Waiting for the cooling fan to arrive to rule out one or the other!

Around 10:25, I started the app, and the reboot happened at 10:42


At least you’re making progress in narrowing down the cause.

Update:

TrueNAS version - 24.10.2

I have installed the fan and redone the thermal paste, as there was no thermal paste left between the chip and the heatsink. The system ran for 23 minutes with all the apps running and then rebooted. However, this time, it gave an error message just before the reboot, as shown below.

Mar 10 22:20:07 truenas systemd[1]: run-docker-runtime\x2drunc-moby-67a13d0256d161165fd107b8120ffe364acc5d177b499d8177b2be9260a67e89-runc.rouHip.mount: Deactivated >
Mar 10 22:20:16 truenas systemd[1]: run-docker-runtime\x2drunc-moby-ae584a7c9a121b48904c96dea9734d7f66c2ecfd9fdf0d4ad7e317b03d3a6176-runc.HcxDqX.mount: Deactivated >
Mar 10 22:20:20 truenas systemd[1]: run-docker-runtime\x2drunc-moby-1fdf88ef79d8178b9065cbf4447d505ac13edce85394735e93bd32e105dba2a1-runc.UjrLiH.mount: Deactivated >
Mar 10 22:20:21 truenas systemd[1]: run-docker-runtime\x2drunc-moby-a58c81c42a850290aa35797e21bff0595412f64c598bc1b159a1ad714f747711-runc.82mGHo.mount: Deactivated >
Mar 10 22:20:28 truenas systemd[1]: run-docker-runtime\x2drunc-moby-ca5d02dc325444b8999238ccaead0f3462176621027a5bac8c9fd8bc42367261-runc.doMxbh.mount: Deactivated >
Mar 10 22:20:31 truenas systemd[1]: run-docker-runtime\x2drunc-moby-a58c81c42a850290aa35797e21bff0595412f64c598bc1b159a1ad714f747711-runc.NNFAAF.mount: Deactivated >
Mar 10 22:20:50 truenas systemd[1]: run-docker-runtime\x2drunc-moby-1fdf88ef79d8178b9065cbf4447d505ac13edce85394735e93bd32e105dba2a1-runc.BZ3otS.mount: Deactivated >
Mar 10 22:21:30 truenas systemd[1]: run-docker-runtime\x2drunc-moby-1fdf88ef79d8178b9065cbf4447d505ac13edce85394735e93bd32e105dba2a1-runc.NF3ZxA.mount: Deactivated >
Mar 10 22:21:37 truenas kernel: mce: [Hardware Error]: Machine check events logged
Mar 10 22:21:37 truenas kernel: mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 0: 9400004000040150
Mar 10 22:21:37 truenas kernel: mce: [Hardware Error]: TSC 663e0b70ab8 ADDR 1ffffa4176700 
Mar 10 22:21:37 truenas kernel: mce: [Hardware Error]: PROCESSOR 0:906ed TIME 1741641697 SOCKET 0 APIC 4 microcode fc
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sdc [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 222 to 162
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sda [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 214 to 166
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sdd [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 222 to 166
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sde [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 214 to 162
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 222 to 166
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sdg [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 214 to 166
Mar 10 22:21:38 truenas smartd[3344]: Device: /dev/sdf [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 75 to 74

I had a second reboot just after this (after a runtime of less than 5 minutes), this time without the above error message.

Last journalctl messages are below:

Mar 10 22:30:00 truenas systemd[1]: Starting sysstat-collect.service - system activity accounting tool...
Mar 10 22:30:00 truenas systemd[1]: sysstat-collect.service: Deactivated successfully.
Mar 10 22:30:00 truenas systemd[1]: Finished sysstat-collect.service - system activity accounting tool.
Mar 10 22:30:45 truenas chronyd[3434]: Selected source 149.143.87.82 (2.debian.pool.ntp.org)
-- Boot b63f8d21a0414cd4bf29a3258854d645 --
Mar 10 22:32:54 truenas kernel: microcode: updated early: 0xea -> 0xfc, date = 2023-07-27
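A side note on the machine-check lines in the first excerpt above: the number after TIME is the wall-clock time of the event in seconds since the Unix epoch, so it can be compared with the journal timestamps. This also helps when an MCE is only reported on the next boot, because the TIME field still refers to the original event. A minimal sketch follows; mce_time_to_utc is an illustrative name.

```python
from datetime import datetime, timezone


def mce_time_to_utc(epoch_seconds: int) -> datetime:
    """Convert the TIME field of an mce log line to an aware UTC datetime."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)


# TIME value copied from the "PROCESSOR 0:906ed TIME 1741641697" line.
logged = mce_time_to_utc(1741641697)
print(logged.isoformat())  # 2025-03-10T21:21:37+00:00
```

That is 21:21:37 UTC, which lines up with the journal's 22:21:37 stamp if the system clock is set to UTC+1 (an assumption, though the one-hour offset between the last and wtmp timestamps later in this thread points the same way).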

That is damn sad to hear, but as @etorix said, at least you are narrowing down the issue. The extra cooling is probably worth keeping.

So, paperless-ngx… I’m not sure how you have this installed on your system, but maybe there are some changes you could make to address this. Check out the forum for this product and see if others have had any similar issues.

Good luck.

I ruled out the apps too. Even when the apps are not running, I see the random reboots. I am now thinking about sleep states. Is there anything in the BIOS concerning sleep states that I need to turn off, or should I disable sleep states completely? I see that this is an issue for AMD, but is it also the case for Intel?

Installation of apps is done through the APPS section, nothing fancy.

Also, I am on UEFI boot; does that affect anything with the hardware I have?

I had a look at the output of the last command and found a lot of crashes, where the system stays up for a while before it reboots.

truenas_admin@truenas[~]$ last        
truenas_ pts/2                         Tue Mar 11 15:46   still logged in
truenas_ pts/0                         Tue Mar 11 15:33   still logged in
reboot   system boot  6.6.44-debug+tru Tue Mar 11 15:30   still running
reboot   system boot  6.6.44-debug+tru Tue Mar 11 15:10   still running
truenas_ pts/1                         Tue Mar 11 15:03 - crash  (00:06)
truenas_ pts/0                         Tue Mar 11 14:51 - crash  (00:18)
reboot   system boot  6.6.44-debug+tru Tue Mar 11 14:09   still running
truenas_ pts/0                         Tue Mar 11 14:08 - crash  (00:01)
reboot   system boot  6.6.44-debug+tru Tue Mar 11 13:58   still running
truenas_ pts/3                         Tue Mar 11 13:54 - crash  (00:03)
truenas_ pts/2                         Tue Mar 11 13:45 - crash  (00:12)
truenas_ pts/1                         Tue Mar 11 13:27 - crash  (00:30)
truenas_ pts/0                         Tue Mar 11 13:25 - crash  (00:32)
reboot   system boot  6.6.44-debug+tru Tue Mar 11 13:09   still running
truenas_ pts/1                         Tue Mar 11 13:00 - crash  (00:08)
truenas_ pts/0                         Tue Mar 11 12:59 - crash  (00:09)
reboot   system boot  6.6.44-debug+tru Tue Mar 11 12:29   still running
reboot   system boot  6.6.44-productio Tue Mar 11 12:19   still running
reboot   system boot  6.6.44-productio Tue Mar 11 12:10   still running
truenas_ pts/1                         Tue Mar 11 12:07 - crash  (00:02)
truenas_ pts/0                         Tue Mar 11 12:03 - crash  (00:06)

I am looking further into this with utmpdump (my thinking is that last gets its information from wtmp, so viewing wtmp should tell us the event). Viewing it shows a reboot event (or am I interpreting it wrongly?).

truenas_admin@truenas[~]$ sudo utmpdump /var/log/wtmp

Utmp dump of /var/log/wtmp
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-07T08:39:55,556741+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-07T08:41:09,504351+00:00]
[5] [04155] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-07T08:41:11,028674+00:00]
[8] [04155] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-07T08:45:08,981543+00:00]
[1] [00000] [~~  ] [shutdown] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-07T08:45:09,931864+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T18:24:44,575672+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T18:25:27,373448+00:00]
[5] [03209] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-08T18:25:28,524104+00:00]
[7] [12479] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-08T18:30:23,346640+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T19:00:56,428879+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T19:01:37,123180+00:00]
[5] [03451] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-08T19:01:38,364354+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T19:35:49,428099+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T19:36:21,657731+00:00]
[5] [03419] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-08T19:36:23,093948+00:00]
[7] [34261] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-08T19:54:37,728508+00:00]
[7] [39945] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-08T20:00:48,596349+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T20:15:23,220741+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T20:16:19,846674+00:00]
[5] [03497] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-08T20:16:21,295347+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T20:56:33,371644+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T20:57:21,078561+00:00]
[5] [03452] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-08T20:57:22,500413+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T22:02:37,981677+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T22:03:13,805478+00:00]
[5] [03413] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-08T22:03:15,167269+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T09:50:31,056240+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T09:51:06,068207+00:00]
[5] [03471] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-09T09:51:07,507699+00:00]
[7] [27538] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-09T10:17:41,139273+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T14:11:38,148872+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T14:12:16,005713+00:00]
[5] [03464] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-09T14:12:17,232149+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T14:21:32,264374+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T14:22:03,073001+00:00]
[5] [03423] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-09T14:22:04,356410+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T17:21:14,122568+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T17:21:46,468794+00:00]
[5] [03428] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-09T17:21:47,746249+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T22:15:53,086112+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T22:16:26,722454+00:00]
[5] [03419] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-09T22:16:28,070753+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T22:36:49,153039+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-09T22:37:22,228422+00:00]
[5] [03434] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-09T22:37:23,329341+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T00:37:31,335276+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T00:38:14,293111+00:00]
[5] [03418] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T00:38:15,386560+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T08:58:48,669465+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T08:59:30,701690+00:00]
[5] [03422] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T08:59:32,121965+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T11:12:28,412917+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T11:13:02,415395+00:00]
[5] [03422] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T11:13:03,582739+00:00]
[7] [93155] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-10T13:40:07,262819+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T14:51:18,394595+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T14:52:02,084769+00:00]
[5] [03431] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T14:52:03,343018+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T15:18:44,237773+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T15:19:13,544027+00:00]
[5] [03475] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T15:19:14,886496+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T17:12:05,488492+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T17:12:38,282422+00:00]
[5] [03456] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T17:12:39,430117+00:00]
[7] [55811] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-10T18:24:49,407090+00:00]
[8] [03456] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T19:29:38,935927+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T20:51:05,453879+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T20:51:38,901921+00:00]
[5] [03481] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T20:51:40,288173+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T21:23:25,133049+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T21:24:10,436390+00:00]
[5] [03473] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T21:24:11,814913+00:00]
[7] [15041] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-10T21:26:12,654967+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T21:32:54,233325+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T21:33:26,305223+00:00]
[5] [03426] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T21:33:27,584443+00:00]
[7] [08652] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-10T21:33:46,331777+00:00]
[7] [11595] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-10T21:34:29,639868+00:00]
[7] [36336] [    ] [truenas_admin] [pts/2       ] [                    ] [0.0.0.0        ] [2025-03-10T21:44:46,907868+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T22:07:17,105699+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T22:07:50,438030+00:00]
[5] [03443] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T22:07:51,804081+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T22:16:57,995084+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T22:17:41,993450+00:00]
[5] [03446] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T22:17:43,188434+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T23:16:36,216752+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-10T23:17:13,248932+00:00]
[5] [03439] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-10T23:17:14,604446+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T02:45:09,169656+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T02:45:43,702156+00:00]
[5] [03421] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T02:45:45,064147+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T06:49:41,360126+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T06:50:14,093776+00:00]
[5] [03431] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T06:50:15,243750+00:00]
[7] [56089] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T08:02:34,697699+00:00]
[7] [82500] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T08:44:04,060214+00:00]
[8] [03431] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T08:56:47,182790+00:00]
[1] [00000] [~~  ] [shutdown] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T08:56:57,668809+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T09:04:53,779535+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T09:05:23,031005+00:00]
[5] [03605] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T09:05:24,356065+00:00]
[8] [03605] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T09:15:41,574376+00:00]
[1] [00000] [~~  ] [shutdown] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T09:15:51,973172+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T09:24:33,666423+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T09:25:28,955462+00:00]
[5] [03513] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T09:25:30,094899+00:00]
[7] [26781] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T09:47:44,819406+00:00]
[7] [103032] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T10:39:09,646499+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T10:47:41,230373+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T10:48:58,500486+00:00]
[5] [03466] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T10:48:59,937907+00:00]
[7] [12668] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T10:50:53,912606+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T10:58:56,161781+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T11:00:02,319398+00:00]
[5] [03503] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T11:00:03,718353+00:00]
[7] [26090] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T11:03:42,938072+00:00]
[7] [31319] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T11:07:42,049416+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T11:10:39,269075+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T11:11:34,482520+00:00]
[5] [03467] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T11:11:35,715645+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T11:19:33,266669+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-11T11:20:27,961872+00:00]
[5] [03462] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T11:20:29,163511+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T11:29:49,304114+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T11:31:06,441589+00:00]
[5] [03495] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T11:31:07,847390+00:00]
[7] [56661] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T11:59:52,866583+00:00]
[7] [58088] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T12:00:58,159507+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T12:09:49,402178+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T12:10:44,808061+00:00]
[5] [03449] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T12:10:45,941680+00:00]
[7] [40408] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T12:25:54,159481+00:00]
[7] [42050] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T12:27:12,820738+00:00]
[7] [63963] [    ] [truenas_admin] [pts/2       ] [                    ] [0.0.0.0        ] [2025-03-11T12:45:18,594754+00:00]
[7] [76551] [    ] [truenas_admin] [pts/3       ] [                    ] [0.0.0.0        ] [2025-03-11T12:54:14,708555+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T12:58:10,105054+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T12:59:05,584977+00:00]
[5] [03467] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T12:59:06,765383+00:00]
[7] [33090] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T13:08:15,021994+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T13:09:42,395279+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T13:10:38,271886+00:00]
[5] [03521] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T13:10:39,518468+00:00]
[7] [72086] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T13:51:51,364029+00:00]
[7] [86829] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T14:03:40,936261+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T14:10:21,150115+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T14:11:17,148072+00:00]
[5] [03441] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T14:11:18,269568+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T14:30:42,346786+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T14:31:37,853433+00:00]
[5] [03553] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T14:31:39,269675+00:00]
[7] [12452] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T14:33:01,227050+00:00]
[7] [41649] [    ] [truenas_admin] [pts/2       ] [                    ] [0.0.0.0        ] [2025-03-11T14:46:37,872066+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T14:51:37,241787+00:00]
[1] [00051] [~~  ] [runlevel] [~           ] [6.6.44-debug+truenas] [0.0.0.0        ] [2025-03-11T14:52:33,071346+00:00]
[5] [03462] [tty1] [        ] [tty1        ] [                    ] [0.0.0.0        ] [2025-03-11T14:52:34,221708+00:00]
[7] [11847] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-11T14:53:40,042280+00:00]
[7] [35794] [    ] [truenas_admin] [pts/1       ] [                    ] [0.0.0.0        ] [2025-03-11T15:03:56,301541+00:00]
[7] [68026] [    ] [truenas_admin] [pts/2       ] [                    ] [0.0.0.0        ] [2025-03-11T15:28:10,735962+00:00]
[7] [69842] [    ] [truenas_admin] [pts/3       ] [                    ] [0.0.0.0        ] [2025-03-11T15:29:38,578435+00:00]
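On reading the dump above: a type 2 "reboot" record is written when the system boots, so it marks the start of a boot rather than the failure itself. A clean shutdown additionally leaves a type 1 "shutdown" record before the next boot record; when that record is missing, the previous boot most likely ended in a crash or reset. The following is a minimal sketch of that check, assuming the eight-field bracketed layout that utmpdump prints; classify_boots and SAMPLE are illustrative names, and the sample lines are copied from the dump above.

```python
import re

# A few records copied from the utmpdump output above.
SAMPLE = """\
Utmp dump of /var/log/wtmp
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-07T08:39:55,556741+00:00]
[1] [00000] [~~  ] [shutdown] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-07T08:45:09,931864+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T18:24:44,575672+00:00]
[7] [12479] [    ] [truenas_admin] [pts/0       ] [                    ] [0.0.0.0        ] [2025-03-08T18:30:23,346640+00:00]
[2] [00000] [~~  ] [reboot  ] [~           ] [6.6.44-production+truenas] [0.0.0.0        ] [2025-03-08T19:00:56,428879+00:00]
"""

FIELD = re.compile(r"\[(.*?)\]")


def classify_boots(dump: str) -> list:
    """Return (boot_time, clean) per boot record.

    clean is True if a shutdown record precedes the boot, False if not,
    and None for the first boot in the file (nothing to compare against).
    """
    boots = []
    saw_shutdown = False
    for line in dump.splitlines():
        fields = [f.strip() for f in FIELD.findall(line)]
        if len(fields) != 8:
            continue  # header line or malformed record
        rec_type, user, when = fields[0], fields[3], fields[7]
        if rec_type == "1" and user == "shutdown":
            saw_shutdown = True
        elif rec_type == "2" and user == "reboot":
            boots.append((when, saw_shutdown if boots else None))
            saw_shutdown = False
    return boots


boots = classify_boots(SAMPLE)
for when, clean in boots:
    print(when, {None: "first record", True: "clean", False: "unclean"}[clean])
```

On the sample, the boot on 2025-03-08 at 18:24 follows a shutdown record and is classified as clean, while the one at 19:00 has none and is classified as unclean. In the full dump most boot records have no preceding shutdown record, which matches the "crash" entries that last prints.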

I forgot to mention one detail: my boot SSD is connected to the motherboard’s SATA port.

Edit: Disabled sleep states on the motherboard - still random reboots. :cry:

Hi.

First, I would like to thank the community for the help, especially @joeschmuck.

Next, I figured out the issue(s), and finally, I have a stable system :partying_face:.

  1. The CPU I had was an Intel Core i5-9600K with core stepping R0, and my motherboard BIOS supports only stepping P0. This caused all the hardware errors I saw in journalctl. I have replaced the CPU with an i7-8700K, whose stepping is supported by the motherboard, and I no longer see any of the hardware errors.
  2. Cooling the HBA is much more important than I thought. I placed a Noctua fan in a push configuration towards the bottom of the chassis, where there is a vent. Unfortunately, that did not keep the HBA temperature within its limit. Only when I placed a 120mm fan blowing across the HBA, plus one more fan in a push configuration at the bottom of the chassis, did the reboots finally stop.
  3. I rewired a bunch of things, and when I then tried to boot, the system would not POST. It turns out one of my RAM slots gave up. Luckily, the motherboard is under warranty; they offered a refund and asked me to keep the board. Now I am running on 32GB of RAM. In the future, I plan to upgrade the motherboard, and a quick look says they are costly (above 150 euros).
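Regarding the stepping in point 1: the "PROCESSOR 0:906ed" field in the machine-check lines earlier in the thread carries the CPUID signature of the installed CPU, and the numeric stepping can be read from it. The following is a minimal sketch, assuming the standard CPUID leaf 1 EAX layout; decode_cpuid_signature is an illustrative name. It only extracts the numbers. Which numeric stepping corresponds to Intel's letter names such as R0 or P0 has to come from Intel's documentation or the board's CPU support list.

```python
def decode_cpuid_signature(sig: int) -> dict:
    """Split a CPUID leaf 1 EAX signature into family, model and stepping."""
    stepping = sig & 0xF             # bits 3:0
    base_model = (sig >> 4) & 0xF    # bits 7:4
    base_family = (sig >> 8) & 0xF   # bits 11:8
    ext_model = (sig >> 16) & 0xF    # bits 19:16
    ext_family = (sig >> 20) & 0xFF  # bits 27:20

    # The extended fields only apply to certain base family values.
    family = base_family + ext_family if base_family == 0xF else base_family
    if base_family in (0x6, 0xF):
        model = (ext_model << 4) | base_model
    else:
        model = base_model
    return {"family": family, "model": model, "stepping": stepping}


# Signature copied from the "PROCESSOR 0:906ed" log field.
cpu = decode_cpuid_signature(0x906ED)
print(cpu)  # {'family': 6, 'model': 158, 'stepping': 13}
```

Comparing the stepping obtained this way with the CPU support list for the installed BIOS version, before swapping parts, is one way to catch this kind of mismatch early.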

I am looking for a microATX motherboard with an LGA1151 socket that supports the i7-8700K and 128GB of RAM. Any suggestions are welcome.


Supermicro X11SCH-F/HLN4F are long out of stock or overpriced, but there are plenty of ASRock Rack E3C246D4U2-2T on eBay.
Are you sure you don’t want a Xeon E-2200 and ECC RAM with this? :wink:

Yeah, for future-proofing, I would go that way. Are there any recommended motherboards in that category?

I looked on Marktplaats in my area for Xeon CPUs, but couldn’t find any E-2200/2300. I see, for example, a refurbished Intel Xeon E5-4650 for 130 euros. Is there any Xeon CPU recommendation guide that I can follow?

The one I listed and their siblings.

Xeon E-2100 and E-2200 are LGA1151-2 like your current CPUs.
Xeon E-2300 is LGA1200; boards for that would be X12STH or E3C256D4U.