Monitoring NVMe health in TrueNAS (and possibly Netdata)

I’ve been running a 128GB NVMe drive in my TrueNAS (formerly FreeNAS) system since I built it in 2016. I’m looking to add two more 1TB NVMe drives as a mirrored vdev and migrate my TrueNAS apps from the slower hard drives to the faster NVMe drives, while keeping my existing pool of eight hard drives in a RAIDZ2 configuration. Since I’m running 25.10.0.1 now, this should be fairly simple.

One of my concerns is drive health. I ran smartctl -a /dev/nvme0, which gave me the following report:

smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.33-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Number:                       SAMSUNG MZVPV128HDGM-00000
Serial Number:                      S1XVNYAGC01633
Firmware Version:                   BXW7300Q
PCI Vendor/Subsystem ID:            0x144d
IEEE OUI Identifier:                0x002538
Controller ID:                      1
NVMe Version:                       <1.2
Number of Namespaces:               1
Namespace 1 Size/Capacity:          128,035,676,160 [128 GB]
Namespace 1 Utilization:            127,785,336,832 [127 GB]
Namespace 1 Formatted LBA Size:     512
Local Time is:                      Thu Nov 20 13:42:20 2025 CST
Firmware Updates (0x06):            3 Slots
Optional Admin Commands (0x0007):   Security Format Frmw_DL
Optional NVM Commands (0x001f):     Comp Wr_Unc DS_Mngmt Wr_Zero Sav/Sel_Feat
Log Page Attributes (0x01):         S/H_per_NS
Maximum Data Transfer Size:         32 Pages

Supported Power States
St Op     Max   Active     Idle   RL RT WL WT  Ent_Lat  Ex_Lat
 0 +     9.00W       -        -    0  0  0  0        5       5
 1 +     4.60W       -        -    1  1  1  1       30      30
 2 +     3.80W       -        -    2  2  2  2      100     100
 3 -   0.0700W       -        -    3  3  3  3      500    5000
 4 -   0.0050W       -        -    4  4  4  4     2000   22000

Supported LBA Sizes (NSID 0x1)
Id Fmt  Data  Metadt  Rel_Perf
 0 +     512       0         0

=== START OF SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

SMART/Health Information (NVMe Log 0x02)
Critical Warning:                   0x00
Temperature:                        37 Celsius
Available Spare:                    100%
Available Spare Threshold:          10%
Percentage Used:                    11%
Data Units Read:                    5,405,440 [2.76 TB]
Data Units Written:                 7,782,300 [3.98 TB]
Host Read Commands:                 64,843,322
Host Write Commands:                1,143,226,878
Controller Busy Time:               2,252
Power Cycles:                       193
Power On Hours:                     82,056
Unsafe Shutdowns:                   61
Media and Data Integrity Errors:    0
Error Information Log Entries:      199

Error Information (NVMe Log 0x01, 16 of 64 entries)
Num   ErrCount  SQId   CmdId  Status  PELoc          LBA  NSID    VS  Message
  0        199     0  0x4005  0x4004  0x000            0     1     -  Invalid Field in Command
  1        198     0  0x000a  0x4004  0x000            0     0     -  Invalid Field in Command
  2        197     0  0x0013  0x4016  0x000            0     1     -  Invalid Namespace or Format
  3        196     0  0x0006  0x4004  0x000            0     0     -  Invalid Field in Command
  4        195     0  0x0019  0x4004  0x000            0     1     -  Invalid Field in Command
  5        194     0  0x000e  0x4004  0x000            0     0     -  Invalid Field in Command
  6        193     0  0x0013  0x4016  0x000            0     1     -  Invalid Namespace or Format
  7        192     0  0x0006  0x4004  0x000            0     0     -  Invalid Field in Command
  8        191     0  0x0009  0x4004  0x000            0     1     -  Invalid Field in Command
  9        190     0  0x0006  0x4004  0x000            0     0     -  Invalid Field in Command
 10        189     0  0x0013  0x4016  0x000            0     1     -  Invalid Namespace or Format
 11        188     0  0x0006  0x4004  0x000            0     0     -  Invalid Field in Command
 12        187     0  0x0019  0x4004  0x000            0     1     -  Invalid Field in Command
 13        186     0  0x000e  0x4004  0x000            0     0     -  Invalid Field in Command
 14        185     0  0x0013  0x4016  0x000            0     1     -  Invalid Namespace or Format
 15        184     0  0x0006  0x4004  0x000            0     0     -  Invalid Field in Command
... (48 entries not read)

Self-tests not supported

So I’m guessing, based on the Percentage Used field, that I have 89% of the drive’s life left. Am I correct?
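For anyone who wants to check the arithmetic, here is a rough Python sketch of how the numbers could be pulled out of smartctl's JSON output (`smartctl --json -a /dev/nvme0`). Percentage Used is the drive's own estimate of consumed endurance, so 100 minus that value is only an estimate of remaining life, and the field can report more than 100 on a worn drive. The key names (`nvme_smart_health_information_log`, `percentage_used`, `available_spare`, `data_units_written`) are what I'd expect in that JSON but are an assumption here, so compare them with your own output first. The function names are hypothetical, and the sample values are copied from the report above.

```python
import json
import subprocess

# NVMe counts "data units" in thousands of 512-byte blocks.
DATA_UNIT_BYTES = 512_000


def summarize_nvme_health(report: dict) -> dict:
    """Derive a few headline numbers from a parsed smartctl JSON report."""
    # Assumed key names; verify against your own `smartctl --json -a` output.
    log = report["nvme_smart_health_information_log"]
    used = log["percentage_used"]
    return {
        "percentage_used": used,
        # The drive may report more than 100% used, so clamp at zero.
        "estimated_life_remaining": max(0, 100 - used),
        "available_spare": log["available_spare"],
        "terabytes_written": log["data_units_written"] * DATA_UNIT_BYTES / 1e12,
    }


def read_smartctl_json(device: str) -> dict:
    """Run smartctl and parse its JSON output (needs root and smartmontools 7+)."""
    # smartctl sets bits in its exit status for warnings, so don't treat
    # a non-zero return code as a hard failure here.
    result = subprocess.run(
        ["smartctl", "--json", "-a", device],
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Values copied from the report in this post, so no drive is needed to try it.
sample_report = {
    "nvme_smart_health_information_log": {
        "percentage_used": 11,
        "available_spare": 100,
        "data_units_written": 7_782_300,
    }
}
summary = summarize_nvme_health(sample_report)
print(f"Estimated life remaining: {summary['estimated_life_remaining']}%")
print(f"Written: {summary['terabytes_written']:.2f} TB")
```

On a live system you would pass `read_smartctl_json("/dev/nvme0")` to `summarize_nvme_health` instead of the sample dictionary.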

Since I have a Netdata license installed, is there a way to see this remaining-life percentage in the Netdata container? I checked System → Storage → Disks and see my drive nvme0n1, but the information there appears limited.

The “Scrutiny” app will do that for you. It provides a nice web UI where you can review the vital parameters of all your drives.

Ah, I must have missed that app. Installed. Thank you!