Docker Apps crash system due to oom-killer

Plex is idle atm - nobody is watching anything.

Unfortunately the screenshot is in German

Memory usage is pretty low atm - Home Assistant is the highest at 825MB

Total volume of the backup is around 1.2 TB, but as it is snapshot-based it varies a little.

Pool usage:
pool1 - Usable Capacity: 30.78 TiB - Used: 28.51 TiB - Available: 2.27 TiB
pool2 - Usable Capacity: 1.68 TiB - Used: 1.18 TiB - Available: 509.27 GiB

As free space is always low, could this also be related?

You should have been seeing warnings in the GUI about pool1. The recommendation is to stay below 80% usage, or below 50% for block-type storage. Above 90% you can start to see problems. I think these are more ‘health’ related than related to your OOM issue, though.

See ZFS Primer | TrueNAS Documentation Hub

If using block storage
BLOCK STORAGE
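
To put the pool figures from the post into the same terms as those thresholds, here is a minimal sketch that computes the fill level from the used and usable capacity. The function name `pct_used` is made up for illustration; on a live system `zpool list` reports a comparable figure in its CAP column.

```shell
#!/bin/sh
# pct_used USED TOTAL: print the fill level in percent, one decimal place.
# Both values must be in the same unit (TiB here).
pct_used() {
    awk -v used="$1" -v total="$2" 'BEGIN { printf "%.1f\n", used / total * 100 }'
}

pct_used 28.51 30.78   # pool1
pct_used 1.18 1.68     # pool2
```

With the numbers quoted above, pool1 comes out at roughly 92.6% and pool2 at roughly 70.2%, so pool1 is past the 90% mark.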

Low free space may cause memory starvation if there are many write operations, or a few large ones, in progress. I/O slows down due to the lack of free space and the write buffers in the ARC fill up; the application generating the I/O then has to slow its writes, potentially leading to more memory consumption.

I suspect the mbuffer processes are related to the replication via an ISP connection, but those alone should not drive the system out of memory. It may be a combination of the replication and something else that runs that is memory hungry.

I would kick off a replication while watching the memory usage, to see how much memory it consumes.
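
A minimal sketch of how that watching could be done from a shell, assuming a Linux host where /proc/meminfo is readable. The function name `mem_available_mib` and the 10-second interval are illustrative choices, not anything specific to TrueNAS.

```shell
#!/bin/sh
# mem_available_mib [FILE]: print MemAvailable in whole MiB, read from a
# meminfo-style file (defaults to /proc/meminfo).
mem_available_mib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' "${1:-/proc/meminfo}"
}

# Example loop (commented out so the snippet does not block):
# while true; do
#     printf '%s MemAvailable=%s MiB\n' "$(date '+%F %T')" "$(mem_available_mib)"
#     sleep 10
# done
```

Redirecting the loop output to a file while the replication runs would leave a timeline that can be compared against the timestamps of any oom-killer messages.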

Thank you very much for the useful information and effort in helping me to find the cause.
I am trying to reduce space usage and will then start the replication jobs again.

You mentioned you have SMB shares. Are you by any chance connecting any clients using SMB v1? My experience was that SMB v1 in SCALE causes connection leakage: the clients keep creating new connections and the old connections never close. Since each connection creates its own process with its own memory allocation, you eventually run out of memory and the oom-killer steps in, wreaking havoc on the apps (which were not the culprit to begin with). I was not able to fix this, and since those clients did not support SMB v2/3, my only option was to switch to NFS, which eliminated the problem. SMB v2+ was not an issue; this happened only with SMB v1. Just mentioning this in case you are using it.


Thank you for this hint.
But actually no - SMB v1 is disabled.

Last night only the Stash app crashed. I started a “generate metadata” task a few days ago, which takes a while because the library is big.

Jul  1 04:59:48 nas kernel: av:h264:df15 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Jul  1 04:59:48 nas kernel: CPU: 5 UID: 0 PID: 3417202 Comm: av:h264:df15 Tainted: P           OE      6.12.15-production+truenas #1
Jul  1 04:59:48 nas kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Jul  1 04:59:48 nas kernel: Hardware name: Supermicro Super Server/X11SDV-8C+-TLN2F, BIOS 1.2 11/14/2019
Jul  1 04:59:48 nas kernel: Call Trace:
Jul  1 04:59:48 nas kernel:  <TASK>
Jul  1 04:59:48 nas kernel:  dump_stack_lvl+0x64/0x80
Jul  1 04:59:48 nas kernel:  dump_header+0x43/0x160
Jul  1 04:59:48 nas kernel:  oom_kill_process+0xfa/0x200
Jul  1 04:59:48 nas kernel:  out_of_memory+0x257/0x520
Jul  1 04:59:48 nas kernel:  mem_cgroup_out_of_memory+0x12a/0x140
Jul  1 04:59:48 nas kernel:  try_charge_memcg+0x491/0x620
Jul  1 04:59:48 nas kernel:  __mem_cgroup_charge+0x42/0xd0
Jul  1 04:59:48 nas kernel:  filemap_add_folio+0x47/0xe0
Jul  1 04:59:48 nas kernel:  __filemap_get_folio+0x17c/0x2e0
Jul  1 04:59:48 nas kernel:  filemap_fault+0x647/0xd40
Jul  1 04:59:48 nas kernel:  __do_fault+0x30/0x170
Jul  1 04:59:48 nas kernel:  do_fault+0x2a8/0x4d0
Jul  1 04:59:48 nas kernel:  __handle_mm_fault+0x7b8/0xfd0
Jul  1 04:59:48 nas kernel:  handle_mm_fault+0x180/0x2d0
Jul  1 04:59:48 nas kernel:  do_user_addr_fault+0x20c/0x670
Jul  1 04:59:48 nas kernel:  exc_page_fault+0x76/0x190
Jul  1 04:59:48 nas kernel:  asm_exc_page_fault+0x26/0x30
Jul  1 04:59:48 nas kernel: RIP: 0033:0x11f485b
Jul  1 04:59:48 nas kernel: Code: Unable to access opcode bytes at 0x11f4831.
Jul  1 04:59:48 nas kernel: RSP: 002b:00007f99225a2928 EFLAGS: 00010203
Jul  1 04:59:48 nas kernel: RAX: 0000000000000000 RBX: 000000003085bcc0 RCX: 0000000000000001
Jul  1 04:59:48 nas kernel: RDX: 0000000000000003 RSI: 0000000000000780 RDI: 00007f9920b26860
Jul  1 04:59:48 nas kernel: RBP: 0000000000000011 R08: 00007f9920b251e0 R09: 000000000430e420
Jul  1 04:59:48 nas kernel: R10: 0000000000000018 R11: 0000000000002c98 R12: 0000000000000001
Jul  1 04:59:48 nas kernel: R13: 0000000000000011 R14: 000000000430e380 R15: 00007f9920b26860
Jul  1 04:59:48 nas kernel:  </TASK>
Jul  1 04:59:48 nas kernel: memory: usage 4194304kB, limit 4194304kB, failcnt 16117
Jul  1 04:59:48 nas kernel: swap: usage 0kB, limit 4194304kB, failcnt 0
Jul  1 04:59:48 nas kernel: Memory cgroup stats for /docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339:
Jul  1 04:59:48 nas kernel: anon 4276641792
Jul  1 04:59:48 nas kernel: file 45056
Jul  1 04:59:48 nas kernel: kernel 18280448
Jul  1 04:59:48 nas kernel: kernel_stack 3932160
Jul  1 04:59:48 nas kernel: pagetables 11415552
Jul  1 04:59:48 nas kernel: sec_pagetables 0
Jul  1 04:59:48 nas kernel: percpu 8136
Jul  1 04:59:48 nas kernel: sock 0
Jul  1 04:59:48 nas kernel: vmalloc 12288
Jul  1 04:59:48 nas kernel: shmem 0
Jul  1 04:59:48 nas kernel: zswap 0
Jul  1 04:59:48 nas kernel: zswapped 0
Jul  1 04:59:48 nas kernel: file_mapped 12288
Jul  1 04:59:48 nas kernel: file_dirty 0
Jul  1 04:59:48 nas kernel: file_writeback 32768
Jul  1 04:59:48 nas kernel: swapcached 0
Jul  1 04:59:48 nas kernel: anon_thp 341835776
Jul  1 04:59:48 nas kernel: file_thp 0
Jul  1 04:59:48 nas kernel: shmem_thp 0
Jul  1 04:59:48 nas kernel: inactive_anon 4276617216
Jul  1 04:59:48 nas kernel: active_anon 24576
Jul  1 04:59:48 nas kernel: inactive_file 16384
Jul  1 04:59:48 nas kernel: active_file 16384
Jul  1 04:59:48 nas kernel: unevictable 0
Jul  1 04:59:48 nas kernel: slab_reclaimable 338744
Jul  1 04:59:48 nas kernel: slab_unreclaimable 2499776
Jul  1 04:59:48 nas kernel: slab 2838520
Jul  1 04:59:48 nas kernel: workingset_refault_anon 0
Jul  1 04:59:48 nas kernel: workingset_refault_file 8972
Jul  1 04:59:48 nas kernel: workingset_activate_anon 0
Jul  1 04:59:48 nas kernel: workingset_activate_file 2491
Jul  1 04:59:48 nas kernel: workingset_restore_anon 0
Jul  1 04:59:48 nas kernel: workingset_restore_file 2377
Jul  1 04:59:48 nas kernel: workingset_nodereclaim 0
Jul  1 04:59:48 nas kernel: pgdemote_kswapd 0
Jul  1 04:59:48 nas kernel: pgdemote_direct 0
Jul  1 04:59:48 nas kernel: pgdemote_khugepaged 0
Jul  1 04:59:48 nas kernel: pgpromote_success 0
Jul  1 04:59:48 nas kernel: pgscan 41184
Jul  1 04:59:48 nas kernel: pgsteal 21089
Jul  1 04:59:48 nas kernel: pgscan_kswapd 8656
Jul  1 04:59:48 nas kernel: pgscan_direct 32415
Jul  1 04:59:48 nas kernel: pgscan_khugepaged 113
Jul  1 04:59:48 nas kernel: pgsteal_kswapd 1849
Jul  1 04:59:48 nas kernel: pgsteal_direct 19218
Jul  1 04:59:48 nas kernel: pgsteal_khugepaged 22
Jul  1 04:59:48 nas kernel: pgfault 14192103254
Jul  1 04:59:48 nas kernel: pgmajfault 55473
Jul  1 04:59:48 nas kernel: pgrefill 40926
Jul  1 04:59:48 nas kernel: pgactivate 25938
Jul  1 04:59:48 nas kernel: pgdeactivate 23582
Jul  1 04:59:48 nas kernel: pglazyfree 0
Jul  1 04:59:48 nas kernel: pglazyfreed 0
Jul  1 04:59:48 nas kernel: swpin_zero 0
Jul  1 04:59:48 nas kernel: swpout_zero 0
Jul  1 04:59:48 nas kernel: zswpin 0
Jul  1 04:59:48 nas kernel: zswpout 0
Jul  1 04:59:48 nas kernel: zswpwb 0
Jul  1 04:59:48 nas kernel: thp_fault_alloc 1860625
Jul  1 04:59:48 nas kernel: thp_collapse_alloc 55271
Jul  1 04:59:48 nas kernel: thp_swpout 0
Jul  1 04:59:48 nas kernel: thp_swpout_fallback 0
Jul  1 04:59:48 nas kernel: numa_pages_migrated 0
Jul  1 04:59:48 nas kernel: numa_pte_updates 0
Jul  1 04:59:48 nas kernel: numa_hint_faults 0
Jul  1 04:59:48 nas kernel: Tasks state (memory values in pages):
Jul  1 04:59:48 nas kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Jul  1 04:59:48 nas kernel: [ 134793]     0 134793  1500493   386679   386651       28         0  4349952        0             0 stash
Jul  1 04:59:48 nas kernel: [3399367]     0 3399367   906372   517772   517675       97         0  4497408        0             0 ffmpeg
Jul  1 04:59:48 nas kernel: [3416849]     0 3416849   449833    44435    44314      121         0   737280        0             0 ffmpeg
Jul  1 04:59:48 nas kernel: [3417011]     0 3417011   568199    52070    51977       93         0   880640        0             0 ffmpeg
Jul  1 04:59:48 nas kernel: [3417182]     0 3417182   434466    35896    35852       44         0   671744        0             0 ffmpeg
Jul  1 04:59:48 nas kernel: [3417328]     0 3417328   116139     6648     6519      129         0   278528        0             0 ffmpeg
Jul  1 04:59:48 nas kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,mems_allowed=0,oom_memcg=/docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,task_memcg=/docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,task=ffmpeg,pid=3399367,uid=0
Jul  1 05:03:13 nas kernel: av:h264:df10 invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Jul  1 05:03:13 nas kernel: CPU: 14 UID: 0 PID: 3417841 Comm: av:h264:df10 Tainted: P           OE      6.12.15-production+truenas #1
Jul  1 05:03:13 nas kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Jul  1 05:03:13 nas kernel: Hardware name: Supermicro Super Server/X11SDV-8C+-TLN2F, BIOS 1.2 11/14/2019
Jul  1 05:03:13 nas kernel: Call Trace:
Jul  1 05:03:13 nas kernel:  <TASK>
Jul  1 05:03:13 nas kernel:  dump_stack_lvl+0x64/0x80
Jul  1 05:03:13 nas kernel:  dump_header+0x43/0x160
Jul  1 05:03:13 nas kernel:  oom_kill_process+0xfa/0x200
Jul  1 05:03:13 nas kernel:  out_of_memory+0x257/0x520
Jul  1 05:03:13 nas kernel:  mem_cgroup_out_of_memory+0x12a/0x140
Jul  1 05:03:13 nas kernel:  try_charge_memcg+0x491/0x620
Jul  1 05:03:13 nas kernel:  __mem_cgroup_charge+0x42/0xd0
Jul  1 05:03:13 nas kernel:  filemap_add_folio+0x47/0xe0
Jul  1 05:03:13 nas kernel:  __filemap_get_folio+0x17c/0x2e0
Jul  1 05:03:13 nas kernel:  filemap_fault+0x647/0xd40
Jul  1 05:03:13 nas kernel:  __do_fault+0x30/0x170
Jul  1 05:03:13 nas kernel:  do_fault+0x2a8/0x4d0
Jul  1 05:03:13 nas kernel:  __handle_mm_fault+0x7b8/0xfd0
Jul  1 05:03:13 nas kernel:  handle_mm_fault+0x180/0x2d0
Jul  1 05:03:13 nas kernel:  do_user_addr_fault+0x20c/0x670
Jul  1 05:03:13 nas kernel:  exc_page_fault+0x76/0x190
Jul  1 05:03:13 nas kernel:  asm_exc_page_fault+0x26/0x30
Jul  1 05:03:13 nas kernel: RIP: 0033:0x108ee23
Jul  1 05:03:13 nas kernel: Code: Unable to access opcode bytes at 0x108edf9.
Jul  1 05:03:13 nas kernel: RSP: 002b:00007f1812e58040 EFLAGS: 00010a07
Jul  1 05:03:13 nas kernel: RAX: 00000000000001ff RBX: 000000000da4f9c0 RCX: 0000000004111f60
Jul  1 05:03:13 nas kernel: RDX: 0000000000000029 RSI: 0000000000000f77 RDI: 00007f17d02fa842
Jul  1 05:03:13 nas kernel: RBP: 000000000d994780 R08: 000000000000001b R09: 000000000da56950
Jul  1 05:03:13 nas kernel: R10: 00007f17d02fa842 R11: 0000000000000000 R12: 00007f17d0000940
Jul  1 05:03:13 nas kernel: R13: 0000000000000003 R14: 000000000da55238 R15: 0000000000000001
Jul  1 05:03:13 nas kernel:  </TASK>
Jul  1 05:03:13 nas kernel: memory: usage 4194304kB, limit 4194304kB, failcnt 22264
Jul  1 05:03:13 nas kernel: swap: usage 0kB, limit 4194304kB, failcnt 0
Jul  1 05:03:13 nas kernel: Memory cgroup stats for /docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339:
Jul  1 05:03:13 nas kernel: anon 4276023296
Jul  1 05:03:13 nas kernel: file 16384
Jul  1 05:03:13 nas kernel: kernel 18874368
Jul  1 05:03:13 nas kernel: kernel_stack 4177920
Jul  1 05:03:13 nas kernel: pagetables 11608064
Jul  1 05:03:13 nas kernel: sec_pagetables 0
Jul  1 05:03:13 nas kernel: percpu 7848
Jul  1 05:03:13 nas kernel: sock 0
Jul  1 05:03:13 nas kernel: vmalloc 12288
Jul  1 05:03:13 nas kernel: shmem 0
Jul  1 05:03:13 nas kernel: zswap 0
Jul  1 05:03:13 nas kernel: zswapped 0
Jul  1 05:03:13 nas kernel: file_mapped 16384
Jul  1 05:03:13 nas kernel: file_dirty 0
Jul  1 05:03:13 nas kernel: file_writeback 0
Jul  1 05:03:13 nas kernel: swapcached 0
Jul  1 05:03:13 nas kernel: anon_thp 281018368
Jul  1 05:03:13 nas kernel: file_thp 0
Jul  1 05:03:13 nas kernel: shmem_thp 0
Jul  1 05:03:13 nas kernel: inactive_anon 4275998720
Jul  1 05:03:13 nas kernel: active_anon 24576
Jul  1 05:03:13 nas kernel: inactive_file 0
Jul  1 05:03:13 nas kernel: active_file 0
Jul  1 05:03:13 nas kernel: unevictable 0
Jul  1 05:03:13 nas kernel: slab_reclaimable 341744
Jul  1 05:03:13 nas kernel: slab_unreclaimable 2663432
Jul  1 05:03:13 nas kernel: slab 3005176
Jul  1 05:03:13 nas kernel: workingset_refault_anon 0
Jul  1 05:03:13 nas kernel: workingset_refault_file 15544
Jul  1 05:03:13 nas kernel: workingset_activate_anon 0
Jul  1 05:03:13 nas kernel: workingset_activate_file 4161
Jul  1 05:03:13 nas kernel: workingset_restore_anon 0
Jul  1 05:03:13 nas kernel: workingset_restore_file 3995
Jul  1 05:03:13 nas kernel: workingset_nodereclaim 0
Jul  1 05:03:13 nas kernel: pgdemote_kswapd 0
Jul  1 05:03:13 nas kernel: pgdemote_direct 0
Jul  1 05:03:13 nas kernel: pgdemote_khugepaged 0
Jul  1 05:03:13 nas kernel: pgpromote_success 0
Jul  1 05:03:13 nas kernel: pgscan 53099
Jul  1 05:03:13 nas kernel: pgsteal 27660
Jul  1 05:03:13 nas kernel: pgscan_kswapd 8656
Jul  1 05:03:13 nas kernel: pgscan_direct 43105
Jul  1 05:03:13 nas kernel: pgscan_khugepaged 1338
Jul  1 05:03:13 nas kernel: pgsteal_kswapd 1849
Jul  1 05:03:13 nas kernel: pgsteal_direct 24989
Jul  1 05:03:13 nas kernel: pgsteal_khugepaged 822
Jul  1 05:03:13 nas kernel: pgfault 14199970225
Jul  1 05:03:13 nas kernel: pgmajfault 62161
Jul  1 05:03:13 nas kernel: pgrefill 55724
Jul  1 05:03:13 nas kernel: pgactivate 32010
Jul  1 05:03:13 nas kernel: pgdeactivate 31322
Jul  1 05:03:13 nas kernel: pglazyfree 0
Jul  1 05:03:13 nas kernel: pglazyfreed 0
Jul  1 05:03:13 nas kernel: swpin_zero 0
Jul  1 05:03:13 nas kernel: swpout_zero 0
Jul  1 05:03:13 nas kernel: zswpin 0
Jul  1 05:03:13 nas kernel: zswpout 0
Jul  1 05:03:13 nas kernel: zswpwb 0
Jul  1 05:03:13 nas kernel: thp_fault_alloc 1861228
Jul  1 05:03:13 nas kernel: thp_collapse_alloc 55352
Jul  1 05:03:13 nas kernel: thp_swpout 0
Jul  1 05:03:13 nas kernel: thp_swpout_fallback 0
Jul  1 05:03:13 nas kernel: numa_pages_migrated 0
Jul  1 05:03:13 nas kernel: numa_pte_updates 0
Jul  1 05:03:13 nas kernel: numa_hint_faults 0
Jul  1 05:03:13 nas kernel: Tasks state (memory values in pages):
Jul  1 05:03:13 nas kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Jul  1 05:03:13 nas kernel: [ 134793]     0 134793  1500485   363902   363746      156         0  4243456        0             0 stash
Jul  1 05:03:13 nas kernel: [3417829]     0 3417829   925313   536752   536619      133         0  4648960        0             0 ffmpeg
Jul  1 05:03:13 nas kernel: [3442366]     0 3442366   447181    40025    39928       97         0   692224        0             0 ffmpeg
Jul  1 05:03:13 nas kernel: [3442642]     0 3442642   453024    35749    35672       77         0   692224        0             0 ffmpeg
Jul  1 05:03:13 nas kernel: [3442691]     0 3442691   434473    35970    35805      165         0   679936        0             0 ffmpeg
Jul  1 05:03:13 nas kernel: [3442878]     0 3442878   433428    31035    30902      133         0   651264        0             0 ffmpeg
Jul  1 05:03:13 nas kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,mems_allowed=0,oom_memcg=/docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,task_memcg=/docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,task=ffmpeg,pid=3417829,uid=0
Jul  1 05:03:13 nas kernel: ffmpeg invoked oom-killer: gfp_mask=0xcc0(GFP_KERNEL), order=0, oom_score_adj=0
Jul  1 05:03:13 nas kernel: CPU: 11 UID: 0 PID: 3442366 Comm: ffmpeg Tainted: P           OE      6.12.15-production+truenas #1
Jul  1 05:03:13 nas kernel: Tainted: [P]=PROPRIETARY_MODULE, [O]=OOT_MODULE, [E]=UNSIGNED_MODULE
Jul  1 05:03:13 nas kernel: Hardware name: Supermicro Super Server/X11SDV-8C+-TLN2F, BIOS 1.2 11/14/2019
Jul  1 05:03:13 nas kernel: Call Trace:
Jul  1 05:03:13 nas kernel:  <TASK>
Jul  1 05:03:13 nas kernel:  dump_stack_lvl+0x64/0x80
Jul  1 05:03:13 nas kernel:  dump_header+0x43/0x160
Jul  1 05:03:13 nas kernel:  oom_kill_process+0xfa/0x200
Jul  1 05:03:13 nas kernel:  out_of_memory+0x257/0x520
Jul  1 05:03:13 nas kernel:  mem_cgroup_out_of_memory+0x12a/0x140
Jul  1 05:03:13 nas kernel:  try_charge_memcg+0x491/0x620
Jul  1 05:03:13 nas kernel:  __mem_cgroup_charge+0x42/0xd0
Jul  1 05:03:13 nas kernel:  do_anonymous_page+0x2db/0x7d0
Jul  1 05:03:13 nas kernel:  ? __pte_offset_map+0x1b/0x180
Jul  1 05:03:13 nas kernel:  __handle_mm_fault+0xb6d/0xfd0
Jul  1 05:03:13 nas kernel:  handle_mm_fault+0x180/0x2d0
Jul  1 05:03:13 nas kernel:  do_user_addr_fault+0x20c/0x670
Jul  1 05:03:13 nas kernel:  exc_page_fault+0x76/0x190
Jul  1 05:03:13 nas kernel:  asm_exc_page_fault+0x26/0x30
Jul  1 05:03:13 nas kernel: RIP: 0033:0x2c99a08
Jul  1 05:03:13 nas kernel: Code: 24 08 48 f7 d8 0f 18 04 02 41 0f 18 04 00 48 83 c0 40 7c f1 8b 44 24 08 48 f7 d8 f3 0f 7e 04 02 f3 41 0f 7e 0c 00 66 0f 60 c1 <66> 0f e7 04 47 f3 0f 7e 44 02 08 f3 41 0f 7e 4c 00 08 66 0f 60 c1
Jul  1 05:03:13 nas kernel: RSP: 002b:00007fffea0749f8 EFLAGS: 00010282
Jul  1 05:03:13 nas kernel: RAX: ffffffffffffff40 RBX: 0000000021799a40 RCX: 0000000000000140
Jul  1 05:03:13 nas kernel: RDX: 000000001f5eb200 RSI: 0000000000000300 RDI: 00000000217bd180
Jul  1 05:03:13 nas kernel: RBP: 000000001e4c8a10 R08: 000000001f5f93c0 R09: 0000000000000140
Jul  1 05:03:13 nas kernel: R10: 0000000000000021 R11: 0000000000000040 R12: 0000000000000002
Jul  1 05:03:13 nas kernel: R13: 000000001f8b62c0 R14: 0000000000000000 R15: 0000000000000002
Jul  1 05:03:13 nas kernel:  </TASK>
Jul  1 05:03:13 nas kernel: memory: usage 4006380kB, limit 4194304kB, failcnt 23459
Jul  1 05:03:14 nas kernel: swap: usage 0kB, limit 4194304kB, failcnt 0
Jul  1 05:03:14 nas kernel: Memory cgroup stats for /docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339:
Jul  1 05:03:14 nas kernel: anon 3339898880
Jul  1 05:03:14 nas kernel: file 348160
Jul  1 05:03:14 nas kernel: kernel 17985536
Jul  1 05:03:14 nas kernel: kernel_stack 3637248
Jul  1 05:03:14 nas kernel: pagetables 11313152
Jul  1 05:03:14 nas kernel: sec_pagetables 0
Jul  1 05:03:14 nas kernel: percpu 7848
Jul  1 05:03:14 nas kernel: sock 0
Jul  1 05:03:14 nas kernel: vmalloc 12288
Jul  1 05:03:14 nas kernel: shmem 0
Jul  1 05:03:14 nas kernel: zswap 0
Jul  1 05:03:14 nas kernel: zswapped 0
Jul  1 05:03:14 nas kernel: file_mapped 348160
Jul  1 05:03:14 nas kernel: file_dirty 0
Jul  1 05:03:14 nas kernel: file_writeback 0
Jul  1 05:03:14 nas kernel: swapcached 0
Jul  1 05:03:14 nas kernel: anon_thp 127926272
Jul  1 05:03:14 nas kernel: file_thp 0
Jul  1 05:03:14 nas kernel: shmem_thp 0
Jul  1 05:03:14 nas kernel: inactive_anon 3370606592
Jul  1 05:03:14 nas kernel: active_anon 24576
Jul  1 05:03:14 nas kernel: inactive_file 143360
Jul  1 05:03:14 nas kernel: active_file 8192
Jul  1 05:03:14 nas kernel: unevictable 0
Jul  1 05:03:14 nas kernel: slab_reclaimable 339024
Jul  1 05:03:14 nas kernel: slab_unreclaimable 2612264
Jul  1 05:03:14 nas kernel: slab 2951288
Jul  1 05:03:14 nas kernel: workingset_refault_anon 0
Jul  1 05:03:14 nas kernel: workingset_refault_file 15792
Jul  1 05:03:14 nas kernel: workingset_activate_anon 0
Jul  1 05:03:14 nas kernel: workingset_activate_file 4161
Jul  1 05:03:14 nas kernel: workingset_restore_anon 0
Jul  1 05:03:14 nas kernel: workingset_restore_file 3995
Jul  1 05:03:14 nas kernel: workingset_nodereclaim 0
Jul  1 05:03:14 nas kernel: pgdemote_kswapd 0
Jul  1 05:03:14 nas kernel: pgdemote_direct 0
Jul  1 05:03:14 nas kernel: pgdemote_khugepaged 0
Jul  1 05:03:14 nas kernel: pgpromote_success 0
Jul  1 05:03:14 nas kernel: pgscan 53433
Jul  1 05:03:14 nas kernel: pgsteal 27827
Jul  1 05:03:14 nas kernel: pgscan_kswapd 8656
Jul  1 05:03:14 nas kernel: pgscan_direct 43439
Jul  1 05:03:14 nas kernel: pgscan_khugepaged 1338
Jul  1 05:03:14 nas kernel: pgsteal_kswapd 1849
Jul  1 05:03:14 nas kernel: pgsteal_direct 25156
Jul  1 05:03:14 nas kernel: pgsteal_khugepaged 822
Jul  1 05:03:14 nas kernel: pgfault 14199970640
Jul  1 05:03:14 nas kernel: pgmajfault 62406
Jul  1 05:03:14 nas kernel: pgrefill 55866
Jul  1 05:03:14 nas kernel: pgactivate 32147
Jul  1 05:03:14 nas kernel: pgdeactivate 31459
Jul  1 05:03:14 nas kernel: pglazyfree 0
Jul  1 05:03:14 nas kernel: pglazyfreed 0
Jul  1 05:03:14 nas kernel: swpin_zero 0
Jul  1 05:03:14 nas kernel: swpout_zero 0
Jul  1 05:03:14 nas kernel: zswpin 0
Jul  1 05:03:14 nas kernel: zswpout 0
Jul  1 05:03:14 nas kernel: zswpwb 0
Jul  1 05:03:14 nas kernel: thp_fault_alloc 1861228
Jul  1 05:03:14 nas kernel: thp_collapse_alloc 55352
Jul  1 05:03:14 nas kernel: thp_swpout 0
Jul  1 05:03:14 nas kernel: thp_swpout_fallback 0
Jul  1 05:03:14 nas kernel: numa_pages_migrated 0
Jul  1 05:03:14 nas kernel: numa_pte_updates 0
Jul  1 05:03:14 nas kernel: numa_hint_faults 0
Jul  1 05:03:14 nas kernel: Tasks state (memory values in pages):
Jul  1 05:03:14 nas kernel: [  pid  ]   uid  tgid total_vm      rss rss_anon rss_file rss_shmem pgtables_bytes swapents oom_score_adj name
Jul  1 05:03:14 nas kernel: [ 134793]     0 134793  1500485   363902   363746      156         0  4243456        0             0 stash
Jul  1 05:03:14 nas kernel: [3442366]     0 3442366   447444    40153    40024      129         0   696320        0             0 ffmpeg
Jul  1 05:03:14 nas kernel: [3442642]     0 3442642   453024    35785    35672      113         0   692224        0             0 ffmpeg
Jul  1 05:03:14 nas kernel: [3442691]     0 3442691   434473    35970    35805      165         0   679936        0             0 ffmpeg
Jul  1 05:03:14 nas kernel: [3442878]     0 3442878   433428    31067    30934      133         0   651264        0             0 ffmpeg
Jul  1 05:03:14 nas kernel: oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,mems_allowed=0,oom_memcg=/docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,task_memcg=/docker/1c128e278230de632a2a58d0b2fbfe172b7e9581ce24360127a8999e3c154339,task=stash,pid=134793,uid=0
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth2ceae76) entered disabled state
Jul  1 05:03:15 nas kernel: vethed729b5: renamed from eth0
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth2ceae76) entered disabled state
Jul  1 05:03:15 nas kernel: veth2ceae76 (unregistering): left allmulticast mode
Jul  1 05:03:15 nas kernel: veth2ceae76 (unregistering): left promiscuous mode
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth2ceae76) entered disabled state
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(vethc345088) entered disabled state
Jul  1 05:03:15 nas kernel: veth5ee1cc9: renamed from eth1
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(vethc345088) entered disabled state
Jul  1 05:03:15 nas kernel: vethc345088 (unregistering): left allmulticast mode
Jul  1 05:03:15 nas kernel: vethc345088 (unregistering): left promiscuous mode
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(vethc345088) entered disabled state
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth7379ea1) entered blocking state
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth7379ea1) entered disabled state
Jul  1 05:03:15 nas kernel: veth7379ea1: entered allmulticast mode
Jul  1 05:03:15 nas kernel: veth7379ea1: entered promiscuous mode
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(veth6f46295) entered blocking state
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(veth6f46295) entered disabled state
Jul  1 05:03:15 nas kernel: veth6f46295: entered allmulticast mode
Jul  1 05:03:15 nas kernel: veth6f46295: entered promiscuous mode
Jul  1 05:03:15 nas systemd-journald[849]: Data hash table of /var/log/journal/8a01af4724e94c6eb329633cbf2dd551/system.journal has a fill level at 75.0 (8535 of 11377 items, 6553600 file size, 767 bytes per hash table item), suggesting rotation.
Jul  1 05:03:15 nas systemd-journald[849]: /var/log/journal/8a01af4724e94c6eb329633cbf2dd551/system.journal: Journal header limits reached or header out-of-date, rotating.
Jul  1 05:03:15 nas kernel: eth0: renamed from vethffae6bf
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth7379ea1) entered blocking state
Jul  1 05:03:15 nas kernel: br-802150ac30fc: port 1(veth7379ea1) entered forwarding state
Jul  1 05:03:15 nas kernel: eth1: renamed from veth3193367
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(veth6f46295) entered blocking state
Jul  1 05:03:15 nas kernel: br-43f9e4610365: port 3(veth6f46295) entered forwarding state
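
Reading the dumps above: all three kills report constraint=CONSTRAINT_MEMCG, and the first two show "memory: usage 4194304kB, limit 4194304kB". That suggests the Stash container ran into its own 4 GiB cgroup limit rather than the host as a whole running out of memory. The rss column in the task tables is in pages; the sketch below converts it to MiB, assuming the usual 4 KiB page size on x86-64. The function name `pages_to_mib` is made up for illustration.

```shell
#!/bin/sh
# pages_to_mib PAGES: convert a page count to whole MiB,
# assuming 4 KiB pages (256 pages per MiB).
pages_to_mib() {
    echo $(( $1 / 256 ))
}

# rss values taken from the first task table (04:59:48)
pages_to_mib 386679   # stash
pages_to_mib 517772   # ffmpeg pid 3399367, the process that was killed
```

That is roughly 1.5 GiB for stash and 2 GiB for the largest ffmpeg. Together with the four smaller ffmpeg processes, the first table sums to about 4076 MiB, which is close to the 4096 MiB limit.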

Free memory was low, but not zero.


The replication tasks are still disabled.

I started cleaning up pool1 and free space is now 3 TB. Even though it is still at around 90% usage, I am not sure whether this is the cause of today's crash?

Thanks

Warning - the following is a workaround, not a fix:

echo SIZE_IN_BYTES > /sys/module/zfs/parameters/zfs_arc_max

(Replace SIZE_IN_BYTES with the limit in bytes.)

ARC should dynamically shrink to ensure everything else in the system has enough memory. This command sets the limit manually (try a limit of 50-75% of total RAM).

This is not a fix; you’d still need to find the leak (which might not necessarily be a memory leak). It would just give you a bit more free memory, since ARC needs some time to shrink, and that might be the difference between whatever you’re doing finishing successfully, so you can figure this out without crashing… or not.
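
As a rough sketch of how the byte value could be derived, the snippet below computes a percentage of total RAM from the MemTotal line in /proc/meminfo. The function name `arc_max_bytes` and the 50% figure are illustrative assumptions. It only prints the command rather than running it, since writing to the parameter needs root.

```shell
#!/bin/sh
# arc_max_bytes MEM_KIB PERCENT: PERCENT of MEM_KIB (given in KiB), in bytes.
arc_max_bytes() {
    echo $(( $1 * 1024 * $2 / 100 ))
}

if [ -r /proc/meminfo ]; then
    mem_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    # Print the command instead of executing it.
    echo "echo $(arc_max_bytes "$mem_kib" 50) > /sys/module/zfs/parameters/zfs_arc_max"
fi
```

For example, `arc_max_bytes 67108864 50` (a 64 GiB machine) yields 34359738368, i.e. 32 GiB.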

Edit:

I’d start keeping top -o %MEM open to try and catch anything unusual…

2nd Edit:

This setting won’t survive a reboot… I’ve also personally noticed that it doesn’t survive starting/stopping any VMs.
