TrueNAS failed to read SMART values and Self-Test Log Failed

System is as follows:
Dell PowerEdge R740 16x2.5" chassis
Xeon Gold 6150 x2
128 GB DDR4
Dell HBA330 12 Gbps Non-RAID controller
600 GB 10k 12 Gbps SAS x 12
Dell 200 GB SATA SSD boot drive

In TrueNAS I am getting SMART value errors and "self-test log failed" on what looks like 4 of the drives. After a clean installation I believe I got them on all drives; however, I ran sg_format --size=512 on several of them that were showing 0 GB in lsblk.
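For anyone following along, here is a rough sketch of how one might check which disks are not at a 512-byte logical block size before reformatting. It is an assumption on my part that a non-512 sector size (520 is common on drives pulled from storage arrays) is why lsblk showed 0 GB; the thread does not confirm that. The function names and the /dev/sd? glob are made up for illustration, and /dev/sdX is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -i` (or -a) output on stdin and
# print the logical block size in bytes, e.g. 512.
block_size_from_smartctl() {
    awk -F': *' '/^Logical block size:/ { split($2, a, " "); print a[1] }'
}

# Hypothetical helper: list every /dev/sdX whose reported logical block
# size is not 512. Run as root. The glob only covers sda..sdz.
report_non_512_disks() {
    for dev in /dev/sd?; do
        [ -b "$dev" ] || continue
        size=$(smartctl -i "$dev" | block_size_from_smartctl)
        [ "$size" = "512" ] || echo "$dev: logical block size ${size:-unknown}"
    done
}

# A quick cross-check of sizes and sector sizes:
#   lsblk -o NAME,SIZE,LOG-SEC,PHY-SEC
#
# Reformatting is DESTRUCTIVE and erases the whole disk. Shown for
# reference only, with /dev/sdX as a placeholder:
#   sudo sg_format --format --size=512 /dev/sdX
```

Nothing runs until you call report_non_512_disks yourself, so it is safe to paste into a shell first and look it over.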

Here is the output of smartctl -a /dev/sdl for one of them:

truenas_admin@truenas[~]$ sudo smartctl -a /dev/sdl
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Vendor:               TOSHIBA
Product:              AL15SEB060NY
Revision:             EF06
Compliance:           SPC-4
User Capacity:        600,127,266,816 bytes [600 GB]
Logical block size:   512 bytes
Rotation Rate:        10000 rpm
Form Factor:          2.5 inches
Logical Unit id:      0x5000039ba80159c1
Serial number:        5240A0MPFQWF
Device type:          disk
Transport protocol:   SAS (SPL-4)
Local Time is:        Tue Aug 19 07:14:07 2025 PDT
SMART support is:     Available - device has SMART capability.
SMART support is:     Enabled
Temperature Warning:  Disabled or Not Supported

=== START OF READ SMART DATA SECTION ===
SMART Health Status: OK

Current Drive Temperature:     25 C
Drive Trip Temperature:        65 C

Accumulated power on time, hours:minutes 14699:07
Manufactured in week 18 of year 2022
Specified cycle count over device lifetime:  50000
Accumulated start-stop cycles:  39
Specified load-unload count over device lifetime:  600000
Accumulated load-unload cycles:  349
Elements in grown defect list: 0

Error counter log:
           Errors Corrected by           Total   Correction     Gigabytes    Total
               ECC          rereads/    errors   algorithm      processed    uncorrected
           fast | delayed   rewrites  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         0          0     791187.061           0
write:         0        1         1         1          1      36644.179           0
verify:        0        0         0         0          0      52108.426           0

Non-medium error count:   405759

SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background short  Completed                   -   14578                 - [-   -    -]
# 2  Reserved(7)       Completed                  80       3                 - [-   -    -]
# 3  Background short  Completed                   -       1                 - [-   -    -]

Long (extended) Self-test duration: 3116 seconds [51.9 minutes]

truenas_admin@truenas[~]$

I have looked over several posts here and elsewhere, but since I am not very experienced in this type of troubleshooting, I need some assistance :)

Thanks!

When there are multiple drives with issues, it's worth considering whether the SAS/SATA expander is the cause.

The puzzling part of the report was:

Non-medium error count: 405759
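To see whether that counter is high on every drive or only on some, something like the sketch below could pull it out of the smartctl output for each disk. The function names and the /dev/sd? glob are hypothetical; the only thing taken from the thread is the "Non-medium error count:" line format shown in the output above.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -a` output on stdin and print the
# value from the "Non-medium error count:" line.
non_medium_count() {
    awk -F': *' '/^Non-medium error count:/ { print $2 }'
}

# Hypothetical helper: print the counter for each /dev/sdX so the drives
# can be compared side by side. Run as root.
report_non_medium_counts() {
    for dev in /dev/sd?; do
        [ -b "$dev" ] || continue
        count=$(smartctl -a "$dev" | non_medium_count)
        echo "$dev: ${count:-n/a}"
    done
}
```

Running report_non_medium_counts a few days apart would show whether the numbers are still climbing or have stopped since the reformat.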

Hey Captain_Morgan,
What does that error count usually point toward? Also, since I formatted them with a 512-byte block size, there have been 0 warnings or errors, and it has been 4 or 5 days.

Thanks!

No idea, but I thought it was an indicator of an underlying issue across multiple drives. It looks like the 512B block size format has solved the issue (it may have been drive firmware).

Well done! I’ll mark this as solved.

Ahh, ok! Yes, it does appear that has fixed the issue. I did have to format all the drives, not just those 4 :)
Ty!
Kevin

You may never know, but the HBA could have caused the issue and needed the drives to be 512B formatted.
