System Crash Transferring Files

Hello,

I am new to both using TrueNAS and having a NAS at all. I set up my server and everything works perfectly fine except when I start transferring files. Transferring large amounts of data (25+ GB) causes the server to crash. Note that this happens both when I transfer over the network and when I move files between folders on the server itself (transfers from Folder A to Folder B). One of the things I intended to do with my server is use it as a media server, so I did have Jellyfin set up, and any files I did get on there worked without issue.

What I have done thus far is the following:
-Ran status checks: no read, write, or checksum errors, but corrupted files were noted. I removed the corrupt files (movie files).
-Scrub
-SMART check (Long), said my drives were good.
-Removed zpool, cleared data from drives, rebuilt zpool
-I checked to make sure there were no overheating issues and there are none. Max temps were 60C at any given time; most temps stayed around 40C-45C.

Machine Specs:
TerraMaster F6-424 (N95 Quad-Core CPU)
Crucial 32GB DDR5 RAM, 4800MHz CL40
4xSeagate IronWolf 4TB NAS Hard Drives
TrueNAS Version: ElectricEel-24.10.2.4
BIOS Version: 2.22.1287 (latest and only version)

Any help towards solving this issue would be appreciated. If you need me to provide something, please also let me know how to get it or where to find it.

I also wanted to add that I did re-seat my drives and memory.

Welcome to TrueNAS Forums.

First, you did great trying to figure this out yourself; many people do not make this effort.

How were you moving files from Folder A to Folder B? Were you using a different computer to move these files? If yes, then we should try again with a copy command run on the NAS itself, as an internal test to see whether the problem is the NAS or the other computer you are using. I will need specifics to provide you a proper command: the full paths of the source and destination folders. Example: /mnt/mypool/folder_a and /mnt/mypool/folder_b.
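Until the real paths are known, here is a rough sketch of such an internal test in Python, meant to be run from the TrueNAS shell. It assumes python3 is available there and uses the placeholder paths from the example above. The helper names sha256_of and copy_and_verify are made up for illustration. It copies each file locally and compares SHA-256 checksums, so the network and the other computer are out of the picture.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, dest_dir):
    """Copy one file into dest_dir and report whether the checksums match."""
    source = Path(source)
    target = Path(dest_dir) / source.name
    shutil.copyfile(source, target)
    return sha256_of(source) == sha256_of(target)


if __name__ == "__main__":
    # Placeholder paths from the example above; replace them with real ones.
    src_dir = Path("/mnt/mypool/folder_a")
    dst_dir = Path("/mnt/mypool/folder_b")
    if src_dir.is_dir() and dst_dir.is_dir():
        for item in sorted(src_dir.iterdir()):
            if item.is_file():
                ok = copy_and_verify(item, dst_dir)
                print(f"{item.name}: {'OK' if ok else 'CHECKSUM MISMATCH'}")
```

A crash or a checksum mismatch during this run would point at the NAS itself rather than the network. The read-back may be served from the ZFS cache rather than from the disks, so a clean result does not prove the drives are fine.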

I would recommend you upgrade to the current version of TrueNAS. That is the first step.

I’m not sure what you mean by “Removed zpool”. Zpool is a command in Linux.

Did you perform any stress tests? MemTest86+ and Prime95? Do you know what kind of NIC is there?

The problem you are having is one I typically see with a Realtek NIC, as it offloads all the work to the CPU. Maybe you have an Intel NIC, but you should find out.

If you are having ZFS cksum errors, this is concerning.

Yes I was using a separate computer, going to the network folders and moving them manually. I believe the locations would be the following:
/mnt/HDDs/Matthew <<-- no current data but can add to test
/mnt/HDDs/Media/Movies
At the time I was moving a large movie file from Matthew to Media.

Unfortunately I am also using HexOS and it is my understanding that updating to the current TrueNAS version could break it.

My mistake on the terminology. I disconnected the pool, destroyed all data, and deleted saved configs from the Storage UI, then made a new pool.

The “tests” I have done were experimenting with transferring different files, groups of files, and locations of files. Breaking the transfers into smaller sizes seems to cause fewer or no issues. Large transfers, if big enough, all eventually fail.

I did not run any tests like the ones mentioned and would probably need some instruction on how to run them.

After some quick searching, I see that NIC stands for either Network Interface Controller or Network Interface Card. I am not sure how or where to find this information. I assume it is based on the hardware I have, but I do not see it in the specs, just the number of RJ-45 2.5 GbE network jacks (2).

If you could give me some guidance on how to find out if it is a Realtek NIC, I will let you know as soon as I can.

Note that the checksum errors were popping up after the system crashed during file transfers. Not necessarily every time, but over multiple crashes. As of right now, I am at one crash and no current checksum errors. The system has been up for the past 13 hours and I had no issues transferring 15 GB of music.

You can download the memtest ISO and boot the computer from a USB stick. The default options are fine. Just let it run through at least one full pass.

I will try it. I am not sure if the system will boot from the external USB port (I have heard people had issues booting from it on my hardware). Otherwise, I would need to completely take apart the NAS to access an internal USB port under the motherboard.

If you have a way to run a test that would avoid removing the motherboard from the housing (if the external USB port fails) please let me know.

Some Linux distros include memtest as one of the options in the Grub menu.

Since TrueNAS SCALE/CE is technically Debian underneath, and it uses Grub, you can hold down ESC or Left Shift during boot to pause the menu screen.

This is what an Ubuntu Grub menu looks like. They include memtest by default.

Thanks for the information, I was in fact able to get Memtest86+ running off a USB and it has done 1 Pass with no Errors.

This command should help identify the NIC.
lspci | grep -i ethernet

You should run MemTest86+ for 5 complete passes, or longer.

The Ultimate Boot CD (UBCD) contains Prime95, MemTest86+ and a lot more. Do a search on the internet for it. Be very careful what you click on. There are a ton of ads which support the project, but I just hate all those ads, and they trick you with the word “Download”. The real download is there; once you have it, you can write the ISO to a flash drive.

As for HexOS, I’m not sure about troubleshooting it. HexOS should support the current version of TrueNAS SCALE, which has been out for quite a while, but you will need to check into that.

I ran the provided command and got the following results:

lspci | grep -i ethernet
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8125 2.5GbE Controller (rev 05)
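
Since the NIC behavior under load also depends on the kernel driver, it may be useful to note which driver is bound to each port. The following is only a sketch in Python, assuming python3 is available in the TrueNAS shell. It reads the standard Linux sysfs layout under /sys/class/net, and the function name nic_drivers is made up for illustration.

```python
from pathlib import Path


def nic_drivers(sys_net="/sys/class/net"):
    """Map each network interface to the kernel driver bound to it."""
    root = Path(sys_net)
    if not root.is_dir():
        return {}  # not a Linux system, or sysfs is not mounted
    drivers = {}
    for iface in sorted(root.iterdir()):
        driver_link = iface / "device" / "driver"
        # Virtual interfaces such as lo have no backing device, so skip them.
        if driver_link.exists():
            drivers[iface.name] = driver_link.resolve().name
    return drivers


if __name__ == "__main__":
    for name, driver in nic_drivers().items():
        print(f"{name}: {driver}")
```

The driver name printed next to each interface can then be included in follow-up posts alongside the lspci output.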

I ran Memtest for only two passes; I will do a more extensive test (5+) later on. As for The Ultimate Boot CD, are you saying you want me to run something off it, or are you just giving me useful info on tools for the future?

It is true that HexOS has been around for a while, but it is still a work in progress. The admins on their website have stated that it is only good through 24.10 and to update at your own risk.

You mean that local transfers on the server, bypassing your network, still cause the crash?

I accessed the files through the network via a separate computer to move them from "/mnt/HDDs/Matthew" to "/mnt/HDDs/Media/Movies" and it crashed the server. I am not sure if initiating the transfer from a separate machine makes a difference. I have moved several files in this fashion; it's just larger transfers (a collection of files or a single large file) that cause the crash.

I started running Memtest again and checked on things after 2 Passes and had 2 errors.

Memory: Crucial 32GB DDR5 RAM, 4800MHz CL40

CPU:  4 Cores 4 Threads  SMP: 4T (PAR)
RAM:  2400MHz  (DDR5-4800)  CAS 40-39-39-77
pCPU   Pass   Test   Failing Address        Expected          Found
0      0      5      000847503868(33.1GB)   7ad0272b488d2dd9  7ad02f2b488d2dd9
0      1      9      0002005f0840(8GB)      492d7684699ca369  492dde84699c0169  

Could this memory error be the cause of my issues?
Should I be looking to get different memory?
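
To see what the two reported errors look like at the bit level, the Expected and Found values from the table above can be XORed. This is only an illustrative sketch in Python; the function name flipped_bits is made up, and the hex values are copied from the Memtest86+ report.

```python
def flipped_bits(expected_hex, found_hex):
    """Return the bit positions (0 = least significant) that differ."""
    diff = int(expected_hex, 16) ^ int(found_hex, 16)
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]


# (Expected, Found) pairs copied from the Memtest86+ error table.
errors = [
    ("7ad0272b488d2dd9", "7ad02f2b488d2dd9"),
    ("492d7684699ca369", "492dde84699c0169"),
]

for expected, found in errors:
    bits = flipped_bits(expected, found)
    print(f"{expected} vs {found}: {len(bits)} bit(s) differ at {bits}")
```

The first error is a single flipped bit and the second differs in six bits. Either way, the memory returned data other than what was written to it.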

I would return it. If it is past the return window, you could likely RMA it; most RAM has a lifetime warranty nowadays.

Failing RAM could 100% be part of literally any issue imaginable that you could experience on a computer, imo. What’s the saying? Random issue: check RAM & PSU.

This wouldn’t be the first time.

Are you using any XMP or overclocking profiles in the BIOS?

No, there are no options in the BIOS for XMP (or anything similar) and nothing is overclocked.

Then you’ll have to decide what you want to do next. Isolate the bad stick? Attempt to RMA? Take the loss and just order a new 32GB RAM kit?

I will try getting another RAM stick. This one was non-ECC, and it sounds like I should be looking for something with On-Die ECC. I was looking at the following:

Kingston Fury Impact

It’s only one stick? A single stick of 32GB?