Instability with N100 board

Hi,

I have a setup running TrueNAS SCALE 25.04.2.6.
The system runs on a Chinese N100 NAS board, the BKHD N100-NAS, with an N100 CPU. It has a 32GB DDR5 Corsair DIMM installed and five 18TB Toshiba MG09ACA18TE drives.

(Sadly I can't embed a picture or link of the board.)

The OS is installed on a 128GB Patriot NVMe drive in the M.2 slot.
Everything is installed in a Jonsbo N2 case and the PSU is a Corsair Gold SFX 650W.

The system runs well for a while but experiences a total freeze or even a random reboot once in a while.
Pool scrubs often show one or two corrupted files (when I run zpool status -v to check), probably because of these freezes or reboots.
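
To keep track of which files a scrub flags over time, the file list at the end of `zpool status -v` can be pulled out with a few lines of Python. This is only a rough sketch: it assumes the usual "errors: Permanent errors have been detected in the following files:" section followed by an indented list, and the helper names `damaged_files` and `read_status` are made up for this example.

```python
import subprocess
from typing import List

# Header line that OpenZFS prints before the list of damaged files.
MARKER = "errors: Permanent errors have been detected in the following files:"


def damaged_files(status_text: str) -> List[str]:
    """Return the paths listed under the 'Permanent errors' section."""
    files: List[str] = []
    in_list = False
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(MARKER):
            in_list = True
            continue
        if not in_list:
            continue
        if not stripped:
            if files:
                break  # blank line after the list ends the section
            continue  # blank line between the header and the list
        files.append(stripped)
    return files


def read_status(pool: str) -> str:
    """Run 'zpool status -v <pool>' and return its text (needs ZFS installed)."""
    result = subprocess.run(
        ["zpool", "status", "-v", pool],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Calling `damaged_files(read_status("tank"))` after each scrub (with your own pool name instead of the placeholder `tank`) and saving the result with a date makes it easier to see whether the same files keep coming back or different ones each time.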

I have been reading through the forum and on Reddit, and most people report poor DDR5 memory compatibility with these boards and the need to underclock to 4600 MHz, etc. The odd thing is that I have run memtest86 several times for a full day each and 0 errors were reported.

Since the system freezes or reboots, I can't find the issue. The temps also look OK to me: during the memtest I believe 60°C was the highest, but as I said, no errors in the test.
SMART tests do not show any issues with the hard disks.
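
One thing that sometimes helps after an unexplained reboot is looking at the error-level messages from the previous boot. The sketch below builds a `journalctl -b -1 -p err` call and filters the output for a few kernel message fragments. It assumes the journal is kept across reboots (not verified here for TrueNAS SCALE), and the keyword list is just a guess at what is worth a look, not a complete set. A hard freeze may also leave nothing in the log at all.

```python
import subprocess
from typing import List

# Message fragments picked for illustration only; extend as needed.
KEYWORDS = ("machine check", "mce:", "i/o error", "hard resetting link", "nvme")


def previous_boot_cmd() -> List[str]:
    """journalctl call for error-level (and worse) messages of the previous boot."""
    return ["journalctl", "-b", "-1", "-p", "err", "--no-pager"]


def suspicious_lines(log_text: str) -> List[str]:
    """Keep only lines that contain one of the KEYWORDS (case-insensitive)."""
    return [
        line for line in log_text.splitlines()
        if any(key in line.lower() for key in KEYWORDS)
    ]


def previous_boot_errors() -> List[str]:
    """Run journalctl and return the filtered lines (needs systemd and root)."""
    result = subprocess.run(
        previous_boot_cmd(), capture_output=True, text=True, check=False
    )
    return suspicious_lines(result.stdout)
```

If `previous_boot_errors()` comes back empty after a crash, that in itself hints at a sudden hardware-level stop rather than a software fault that had time to log something.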

I am lost on what might be causing the instability so hopefully someone has the golden tip.

32 GB is too much for that board. The manufacturer recommends 16 GB max. Many people have reported problems with this board and 32 GB. Also, make sure the RAM chips are not Micron. Those don’t play well with this board. Hynix or Samsung work well.


I had the same issue with a BKHD 1264 board. Your issue is likely due to RAM incompatibility. Even though my RAM tested fine with Memtest, the system would randomly freeze within 24 hours of running.

Solution: Get RAM with ICs manufactured by either SK Hynix or Samsung. Micron-manufactured ICs were my issue.

Replaced my stick of Crucial 16GB DDR5 with a stick of Timetec 16GB DDR5 and the problem is gone, system is rock solid.

The issue is that the RAM module vendor (e.g. Corsair) could be using ICs from any one of those three companies, so you need to do a little research to confirm which IC manufacturer your RAM has. It is safe to go with a module made by Samsung, or another brand where you can confirm which ICs they use (Timetec worked for me; they tend to use SK Hynix).

Hi,

Thank you. I have a Corsair DIMM in it, this one:
CMSX32GX5M1A4800C40

According to some posts on the OMV forum this should work but, like you said, it is probably not stable.
Previously I had another one which gave errors during memtest86. I sent it back and replaced it with this one (same model and brand), which does not give any errors during memtest86, but like you said it might still be the issue (I suspect so too).

Sadly, memory prices are insane now. But I managed to get a DDR5 DIMM from PUSKILL. I'm not sure which ICs it has, since the site says Hynix, Micron, Samsung, PuSkill, so it sounds like a lottery which one you get. Anyway, I once bought this memory for the mini NUC-like units I use in my Kubernetes lab and was impressed by the quality and stability in combination with the N100 machines I had. So fingers crossed. :slight_smile:

Hmm, sadly the test with the new memory will not fly: I just got my AliExpress package and the contents were a crappy fake Kodak 64GB SD card. :frowning:

Memory prices are insane at the moment so my options are limited (I'm not willing to spend more on a new DIMM than the complete system cost me).

I have seen messages from people getting the system stable just with an underclock to 4600 MHz. I guess I can try that?
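
If you go the underclock route, it is worth checking from the running system that the BIOS setting actually took effect. A small sketch: `dmidecode -t memory` lists the rated speed and the configured speed per DIMM, and the parser below picks those fields out. The field names are the ones recent dmidecode versions print (older versions say "Configured Clock Speed" instead of "Configured Memory Speed"); the function name `memory_speeds` is invented for this example.

```python
import subprocess
from typing import Dict, List

# dmidecode field name -> key used in the result
WANTED = {
    "Speed": "rated",
    "Configured Memory Speed": "configured",
    "Configured Clock Speed": "configured",  # wording used by older versions
    "Part Number": "part_number",
}


def memory_speeds(dmi_text: str) -> List[Dict[str, str]]:
    """Return one dict per 'Memory Device' section with the WANTED fields."""
    devices: List[Dict[str, str]] = []
    current = None
    for line in dmi_text.splitlines():
        if line.startswith("Handle"):
            current = None  # a new DMI record starts here
            continue
        if line.strip() == "Memory Device":
            current = {}
            devices.append(current)
            continue
        if current is None or ":" not in line:
            continue
        key, _, value = line.strip().partition(":")
        key = key.strip()
        if key in WANTED:
            current[WANTED[key]] = value.strip()
    return devices


def read_dmidecode() -> str:
    """Run 'dmidecode -t memory' and return its text (needs root)."""
    result = subprocess.run(
        ["dmidecode", "-t", "memory"], capture_output=True, text=True, check=False
    )
    return result.stdout
```

After changing the BIOS setting, `memory_speeds(read_dmidecode())` should show a lower `configured` value than `rated` for the DIMM. If both still read the same, the underclock was not applied.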

The problem is that the issue appears randomly. Sometimes it runs for weeks, and then it crashes a couple of times a week.
If I had to guess, the issue seems to pop up when a pool scrub is running or when a container that does a lot of writing starts. But like I said, it is just a feeling.

Could it be the SATA cabling?

I use these:

YIWENTEC High Speed 6Gbps SATA 3 iii Cable Sas Cable for Server (6 SATA, 0.5m) (from Amazon)
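
One way to get a hint about the cabling is SMART attribute 199 (UDMA_CRC_Error_Count), which counts CRC errors on the SATA link. A raw value that keeps rising usually points at a cable, connector, or backplane rather than the disk itself. The sketch below reads it from the JSON output of `smartctl -j -A`. It assumes smartmontools 7.0 or newer and that these drives report attribute 199 at all, which I have not checked for the MG09 series; the function names are made up for this example.

```python
import json
import subprocess
from typing import Optional

CRC_ATTRIBUTE_ID = 199  # UDMA_CRC_Error_Count on most SATA drives


def crc_error_count(smart_json: str) -> Optional[int]:
    """Return the raw value of attribute 199, or None if it is not reported."""
    data = json.loads(smart_json)
    table = data.get("ata_smart_attributes", {}).get("table", [])
    for attr in table:
        if attr.get("id") == CRC_ATTRIBUTE_ID:
            return attr["raw"]["value"]
    return None


def read_smart_json(device: str) -> str:
    """Run 'smartctl -j -A <device>' and return the JSON text (needs root)."""
    result = subprocess.run(
        ["smartctl", "-j", "-A", device],
        capture_output=True, text=True, check=False,
    )
    return result.stdout
```

Running `crc_error_count(read_smart_json("/dev/sda"))` for each of the five disks, then again after the next crash or scrub, shows whether any counter moved. The absolute number matters less than whether it grows.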