TrueNAS locking up on large file transfers

I have 5 Dell PowerEdge R350 servers, each with a Xeon E-2314 CPU (2.80GHz), 8GB RAM, a pair of 256GB SSDs on the BOSS-S2 card for the OS, and a PERC H755 with 4x 2TB 6Gb SATA drives (Seagate ST2000NM012B-2TD) in JBOD (RAIDz1). Running TrueNAS Core 13.0-U6.7.

Whenever I run a large file transfer, the system slowly grinds to a halt and locks up. The screen shows a bunch of "plugin_dispatch_values: Low water mark reached. Dropping 100% of metrics." errors.

Once it’s dead, I have to do a warm reset, followed by hours of "zio_deadman(): zio_wait waiting for hung I/O to pool" messages. After that, the system comes back up as expected.

All 5 of my devices are doing this, and have done this with some regularity since the day I set them up.

What logs do I need to check, to see what the heck is going on?

Thanks!

One source indicated that the drive is an SMR type. If so, that would explain it. However, another source indicated that the drive is CMR.

So, I don’t know.

Have you checked the firmware versions?

Fixes:
Fix an issue that can cause format corrupt on 4160 sector size.
Fix issues that can cause assert.
Fix issues that can cause drive-hang.
Fix issues that can make firmware download unsuccessful.
Fix an issue that can cause drive failure reported.
Fix issues that can cause command timeout.
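
If it is the drive firmware you want to compare across the 5 boxes, here is a rough Python sketch that pulls the model and firmware revision out of `smartctl -i` output. Assumptions on my part: smartctl is on the PATH, the disks show up under names like /dev/da0 (yours may differ), and drives behind the PERC may need a different device path or device-type option. The helper names are made up for illustration.

```python
import subprocess


def parse_firmware(info_text):
    """Return (model, firmware) parsed from `smartctl -i` text output.

    SATA drives report "Device Model" / "Firmware Version";
    SAS/SCSI drives report "Product" / "Revision".
    Missing fields come back as None.
    """
    fields = {}
    for line in info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    model = fields.get("Device Model") or fields.get("Product")
    firmware = fields.get("Firmware Version") or fields.get("Revision")
    return model, firmware


def firmware_for(device):
    """Run `smartctl -i` on one device (e.g. "/dev/da0", an assumed name)."""
    result = subprocess.run(
        ["smartctl", "-i", device],
        capture_output=True,
        text=True,
    )
    return parse_firmware(result.stdout)
```

Calling firmware_for() once per disk on each server and comparing the results should show quickly whether any drive is on an older revision than the others.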

According to Seagate, they (Exos) are CMR.