ZFS_ARC_MAX issue - out-of-memory errors in kernel with Scale 24.04.1.1

Captain_Morgan · July 4, 2024, 7:15pm

Please report this as a separate bug and start a separate thread.

Frankly, 2.5TB is a very large single file, and I’m not sure what the limits are with either Samba or ZFS.

It’s important to specify your RAM size and record size… my guess is that file size limits will depend on RAM.

Joel_Gray · July 5, 2024, 1:59pm

I will get a further bug report submitted for the audit logs.

ZFS supports 16 exbibytes, so I don’t think I am at risk of exceeding that.
As for Samba, I took it out of the mix by attempting to delete the same file from the shell on the NAS, with the same result.

A file deletion should not be a particularly IO-intensive operation, and there is no reason it should need a large amount of RAM to complete. As for creating the file, TrueNAS doesn’t know it’s going to reach 2.5 TB; it crashes at random, usually before 20-30 GB have been written. Something about certain types of data is not playing nicely, usually backup files from different systems: in this case Veeam, and previously Proxmox.

The system currently has 32 GB of ECC RAM with more on the way, no VMs, and is supporting a small 24 TB RAID-Z2 pool plus a 4 TB NVMe pool.

awalkerix · July 5, 2024, 2:15pm

Is that a single 2.5 TiB file, or a few million files within a directory? For example, macOS sparsebundle volumes are actually directories containing tons of files. A single file should not generate very many audit messages. If the backup client is writing to millions of files at once, then you’ll end up with millions of audit entries. One workaround in this case is to whitelist the particular account that is used for backups so that it bypasses auditing over SMB (this is exposed in the UI).

Captain_Morgan · July 5, 2024, 4:01pm

I’d prefer a separate thread… this is highly unusual and may be hardware related, so full hardware specs would be useful. Also check for any other errors.

Joel_Gray · July 10, 2024, 9:49am

In response to @awalkerix, it was just a single 2.5 TB Veeam compressed file.
I have not been able to easily replicate this, but making a backup that big without causing snapshot headaches is difficult. I think it was just due to memory pressure at the time, which crashed everything.

I am still seeing intermittent OOM errors during intensive operations, albeit less frequently, with the zfs_arc_shrinker_limit and zfs_arc_pc_percent changes applied. Pushing up to 24.04.2 tonight. The errors occur approximately once every 3-4 days and do not coincide with any scheduled tasks etc., only with heavy usage of the pool. While it may be a red herring, they do seem to occur more with pre-compressed content.

sra · July 10, 2024, 2:56pm

@Joel_Gray I was following your issue for a while, and I was hoping this would be resolved in 24.04.2.

Unfortunately, I can confirm the issue is still there. In my case it is not NFS but SMB that triggers the OOM killer, which then kills smbd itself and other high-memory processes.

I have created a new issue, NAS-129987, in the iXsystems TrueNAS Jira. Fingers crossed this can be fixed soon.

Joel_Gray · July 12, 2024, 12:21pm

@sra thanks for logging the ticket. Just for reference, on the latest version I can reproduce the same issues you are facing with SMB as well, still with no VMs etc running.

I have unfortunately gone from an extremely stable CORE installation that never even had a hiccup to fatal issues occurring almost daily with any sort of demanding load.

I am certain this is not hardware induced; the same memory behaviour is observable across the different workload failures.

NFS was resolved at the expense of huge ARC dips with the zfs shrinkage setting and no limits on it, but it didn’t resolve the SMB use-case and running the default settings on the latest release is still causing a lot of problems with both of them.
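For anyone wanting to watch for the ARC dips described above, here is a minimal Python sketch. It assumes the usual OpenZFS-on-Linux kstat file at /proc/spl/kstat/zfs/arcstats and its `size` and `c_max` rows; the function names are made up for illustration and are not part of any TrueNAS tooling.

```python
from pathlib import Path

# Assumed location of the ARC kstats on OpenZFS for Linux.
ARCSTATS = Path("/proc/spl/kstat/zfs/arcstats")


def parse_arcstats(text: str) -> dict[str, int]:
    """Parse the 'name type data' rows of arcstats into {name: value}."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly three fields with a numeric last column;
        # this skips the two header lines at the top of the file.
        if len(parts) == 3 and parts[2].isdigit():
            stats[parts[0]] = int(parts[2])
    return stats


def arc_summary(text: str) -> str:
    """One-line summary of current ARC size versus its configured maximum."""
    s = parse_arcstats(text)
    gib = 1024 ** 3
    return f"ARC size {s['size'] / gib:.1f} GiB of c_max {s['c_max'] / gib:.1f} GiB"
```

On the NAS itself, `print(arc_summary(ARCSTATS.read_text()))` run every few seconds during a backup should show whether the ARC is collapsing shortly before an OOM event.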

Hopefully some real testing can be done around this, as in my opinion it should not have been considered a stable release candidate.

kris · July 12, 2024, 1:56pm

Joel, do you have a ticket you are working on with us directly? This is a very active investigation on our part, and we suspect there is a combination of factors in play here that leads to this behavior, especially since there is a really low number of reports and we don’t have a reproduction case when we put systems through our own performance stress testing. Once we nail down what the specific variables in play are, it’ll make resolution that much easier.

Milkysunshine · July 16, 2024, 4:11pm

I was having the same issue with my Samba service being killed, but running echo "0" > /sys/module/zfs/parameters/zfs_arc_shrinker_limit and then adding it to init seems to have fixed it.
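As a rough illustration of the same workaround, the Python sketch below reads and writes a ZFS module parameter through sysfs. The path is the one quoted in the command above; the helper names are hypothetical. Writing needs root, and a value written this way does not survive a reboot, which is why it has to be reapplied from an init script.

```python
from pathlib import Path

# Directory used by the echo command quoted in this thread.
ZFS_PARAM_DIR = Path("/sys/module/zfs/parameters")


def read_zfs_param(name: str, param_dir: Path = ZFS_PARAM_DIR) -> str:
    """Return the current value of a ZFS module parameter as a string."""
    return (param_dir / name).read_text().strip()


def write_zfs_param(name: str, value: int, param_dir: Path = ZFS_PARAM_DIR) -> str:
    """Write a new value (root required on a real system) and return what reads back."""
    (param_dir / name).write_text(f"{value}\n")
    return read_zfs_param(name, param_dir)
```

Calling `read_zfs_param("zfs_arc_shrinker_limit")` before and after a reboot is a quick way to confirm that the init entry is actually being applied.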

I run TrueNAS in a VM on a Dell R730xd running Proxmox with 384GB RAM. I have 96GB allocated to TrueNAS, and there are no VMs or apps running within the VM. Disks and NVMe drives are passed through directly to the TrueNAS VM.

The service would be killed during backup of Proxmox VMs to a share on TrueNAS. Since the data never leaves the machine, the read/write load on the system isn’t bottlenecked by the network. No other network-based backup seemed to cause this, even over 10gbit.

I never had the issue on older versions of TrueNAS, or any issues for that matter.

Even though I haven’t had the issue in weeks (18 days to be exact), should I still submit a ticket? I’m not sure if the configuration it contains might help identify commonalities between setups having this issue.

mav · July 16, 2024, 5:56pm

@Milkysunshine In the OOM killer messages in your /var/log/kern.log you should see the amount of memory used by each process when it happened. We have seen several reports where the killed Samba process occupied many (8-16) gigabytes of RAM. That is not normal, and may be a valid reason for it to be killed. Our services team has been notified and is collecting the data, but if you have one more data point of that kind, it would be good to know. Otherwise, if setting zfs_arc_shrinker_limit to 0 fixed the problem, then there is not much to do, since that change is already part of the released TrueNAS 24.04.2.
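To pull those per-process figures out of kern.log without reading it by eye, something like the following Python sketch may help. It assumes the "Out of memory: Killed process ..." line format that appears in the logs posted in this thread; other kernel versions may word the line differently, and the function name is only illustrative.

```python
import re

# Matches the kernel's "Out of memory: Killed process ..." summary line.
KILLED_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>.+?)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)


def parse_oom_kills(log_text: str) -> list[dict]:
    """Return one record per OOM kill found in the given log text."""
    kills = []
    for line in log_text.splitlines():
        m = KILLED_RE.search(line)
        if m:
            anon_rss_kb = int(m["anon_rss"])
            kills.append({
                "pid": int(m["pid"]),
                "name": m["name"],
                "anon_rss_kb": anon_rss_kb,
                # kB -> GiB, to compare against the 8-16 GB reports.
                "anon_rss_gib": anon_rss_kb / (1024 * 1024),
            })
    return kills
```

Feeding it the contents of /var/log/kern.log gives a list that can be sorted by `anon_rss_gib` to see how large smbd had grown at each kill.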

Milkysunshine · July 16, 2024, 6:13pm

cat kern.log | grep oom
Jun 22 08:31:01 zion kernel: DBENGINE invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=-900
Jun 22 08:31:01 zion kernel:  oom_kill_process+0xf9/0x190
Jun 22 08:31:01 zion kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Jun 22 08:31:01 zion kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/system.slice/smbd.service,task=smbd[10.69.10.1,pid=9436,uid=1000
Jun 22 08:31:01 zion kernel: Out of memory: Killed process 9436 (smbd[10.69.10.1) total-vm:3987592kB, anon-rss:3881956kB, file-rss:3072kB, shmem-rss:16836kB, UID:1000 pgtables:7788kB oom_score_adj:0
Jun 22 08:31:03 zion kernel: oom_reaper: reaped process 9436 (smbd[10.69.10.1), now anon-rss:0kB, file-rss:3072kB, shmem-rss:16836kB
Jun 29 08:54:05 zion kernel: smbd invoked oom-killer: gfp_mask=0x140cca(GFP_HIGHUSER_MOVABLE|__GFP_COMP), order=0, oom_score_adj=0
Jun 29 08:54:05 zion kernel:  oom_kill_process+0xf9/0x190
Jun 29 08:54:05 zion kernel: [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
Jun 29 08:54:05 zion kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0-1,global_oom,task_memcg=/system.slice/smbd.service,task=smbd[10.69.10.1,pid=9153,uid=1000
Jun 29 08:54:05 zion kernel: Out of memory: Killed process 9153 (smbd[10.69.10.1) total-vm:5854592kB, anon-rss:5756240kB, file-rss:3840kB, shmem-rss:10900kB, UID:1000 pgtables:11444kB oom_score_adj:0
Jun 29 08:54:07 zion kernel: oom_reaper: reaped process 9153 (smbd[10.69.10.1), now anon-rss:0kB, file-rss:3148kB, shmem-rss:10900kB

Is this useful?

awalkerix · July 16, 2024, 6:44pm

Can you send me a debug via private message?

Milkysunshine · July 16, 2024, 8:16pm

My trust level is too low to send PMs

awalkerix · July 16, 2024, 8:32pm

You can create a Jira ticket and I’ll communicate with you through it. This is only to investigate the root cause of the SMB process’s high memory usage.

jpeaglesandkatz · July 23, 2024, 9:54am

Same here… Ever since the last or the previous update, TrueNAS SCALE has been crashing very frequently… usually without any visible message, but sometimes with an OOM message of some sort.

I’ve been running TrueNAS for years on Proxmox without a hitch, but for the last few weeks it has been nothing but problems.

I have 24 gigs committed to TrueNAS… Total system memory has enough free… I used to have only a swap size of 2 GB but have since upped it to 24 gigs. Same result.

Intermittent crashes, usually when I’m writing a lot to the NAS.

I did do a complete fresh reinstall of the VM on this latest version.

awalkerix · July 23, 2024, 12:53pm

Maybe check RES memory for smbd processes if you’re using the SMB service for loopback mounts from apps or connections from the Proxmox host. There are some backup applications that appear to queue up writes to the SMB server without limit, and over a very fast link (like loopback) this can end up with excessively large smbd queue depths waiting on writes to complete.
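One way to check this without sitting in top is to read the resident set size straight from /proc. The Python sketch below is an illustration under the assumption of a standard Linux /proc layout (`Name:` and `VmRSS:` fields in /proc/<pid>/status); the function names are hypothetical. It matches on a name prefix because, as the OOM logs in this thread show, smbd children can appear as `smbd[<client address>`.

```python
from pathlib import Path


def parse_status(status_text: str) -> tuple[str, int]:
    """Return (process name, VmRSS in kB) from /proc/<pid>/status content."""
    name, rss_kb = "", 0
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key == "Name":
            name = value.strip()
        elif key == "VmRSS":
            rss_kb = int(value.split()[0])  # value looks like "  3881956 kB"
    return name, rss_kb


def smbd_rss(proc: Path = Path("/proc")) -> dict[int, int]:
    """Map pid -> resident set size (kB) for every process whose name starts with smbd."""
    result = {}
    for status in proc.glob("[0-9]*/status"):
        try:
            name, rss_kb = parse_status(status.read_text())
        except OSError:
            continue  # process exited (or is unreadable) while scanning
        if name.startswith("smbd"):
            result[int(status.parent.name)] = rss_kb
    return result
```

Running `smbd_rss()` repeatedly during a backup should show whether a single smbd child keeps growing into the multi-gigabyte range before the OOM killer fires.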

Joel_Gray · July 26, 2024, 6:17am

Hi Mav, I’ll get a new ticket spun up for the issues in the latest version as requested.

Currently, the latest version is extremely unstable out of the box compared to the prior version, when said prior version was configured with only the zfs_arc_shrinker_limit set to 0, not applying “zfs_arc_pc_percent=300”.

What was the previous default value for zfs_arc_pc_percent? I will change this to whatever the value was in the prior release and leave the new default shrinker limit. When only the shrinker limit was set but this value was not changed, the system was comparatively more stable.

Currently, with the stock settings in the latest release, Proxmox is back to crashing the system multiple times a day via NFS again, and SMB is just as bad with any data incoming from Veeam.

mav · July 26, 2024, 4:30pm

The previous value was “0”, which means not enforced. I have some doubts that it can be the cause of lower stability, but I’ll consider it for now.

I’m sorry to hear about that. I have several patches for the area in upstream review now, hope they help when finally merged. Running internal tests for them meanwhile.

Dmitry · October 18, 2024, 10:28am

Hello. The errors have not been corrected. Dragonfish-24.04.2.2.
kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/system.slice/smbd.service,task=smbd[192.168.1.,pid=4904,uid=1001
kernel: Out of memory: Killed process 4904 (smbd[192.168.1.) total-vm:9553312kB, anon-rss:9456048kB, file-rss:128kB, shmem-rss:13604kB, UID:1001 pgtables:18680kB oom_score_adj:0
systemd[1]: smbd.service: A process of this unit has been killed by the OOM killer.
I have more data but can’t upload it here.

mav · October 18, 2024, 3:25pm

We’ve done more memory-related improvements in the upcoming TrueNAS Electric Eel 24.10. You might want to test it as well; many thousands of people already do. But if you’d like to provide us with more data, you can always create a ticket and use private attachments there that only iX developers can see.