SMART testing configure & view in Goldeye

How can I view SMART testing in Goldeye, to configure test parameters such as HDD temperature alerts and also the results of SMART tests?

Thank you

This is explained in the release notes, please take some time perusing them:

I think they took all of the configuration away. I too would like the option to configure notification levels etc., but I don’t think that option exists any longer.

Thanks David, after searching the TrueNAS docs and finding nothing helpful I asked ChatGPT, which told me that all the SMART GUI stuff has gone in Goldeye and that an app called Scrutiny would do the same, or better, job. The problem is that installing Scrutiny requires pre-installation configuration, which is incomprehensible to a basic user. Maybe someone on here can provide a step-by-step (literally every step, nothing assumed) installation guide?

I can only post the commands that run the short and long tests. They got migrated to cron jobs, which seem to work the same as the old SMART GUI setup:

midclt call disk.smart_test LONG '["*"]'

midclt call disk.smart_test SHORT '["*"]'

They run as root and with a custom schedule…

With a lot of help from ChatGPT I managed to install Scrutiny, only to discover it only reads SMART data and nothing can be configured using it. So back to CRON jobs; I thought software was supposed to get easier to use i.e. more GUI less Command line. Anyway if you want to read the help from ChatGPT here it is: ChatGPT - TrueNAS update train explained

ChatGPT help to setup SMART testing

The following chat link includes lots of scripts and CRON jobs to do the full Monty!

The full Monty!

Ahhh - that’s good that they still work. I’ve been asking the same thing, since the standard output to my email was “null”.

Installing Scrutiny didn’t require any extra steps for me. I basically left the host paths at their default (iXvolume) because I don’t see the need to back up its data. Installation completed after a minute and all my drives were immediately visible.

There doesn’t seem to be a way to get notified by mail, but at least it’s fairly easy to check the status of the drives.

If SMART tests are not performed, Scrutiny will just show old data!

As I understand it, you need cronjobs + Scrutiny (Scrutiny looks good btw) to be able to monitor disks.

So, what is in the release notes (25.10 (Goldeye) Version Notes | TrueNAS Documentation Hub, disk management tab): “You can edit, disable, or delete these cronjobs. If you install Scrutiny or another monitoring solution, disable these migrated cronjobs to avoid duplicate test scheduling” is incorrect.

You need both the cronjobs and scrutiny. Scrutiny only reads the test outputs somehow.

Installing Scrutiny is indeed a piece of cake, look here by ServersatHome: Be Smart About SMART Monitoring With Scrutiny on TrueNAS!

Agreed on the offending sentence. In fact, the PR to fix it was open before I even read this :wink:

I don’t think that’s true. If you look at scrutiny/collector/pkg/config/config.go at master · AnalogJ/scrutiny · GitHub, it uses

	c.SetDefault("commands.metrics_smartctl_bin", "smartctl")
	c.SetDefault("commands.metrics_scan_args", "--scan --json")
	c.SetDefault("commands.metrics_info_args", "--info --json")
	c.SetDefault("commands.metrics_smart_args", "--xall --json")
	c.SetDefault("commands.metrics_smartctl_wait", 0)

So it’s doing passive monitoring by querying current SMART data from your disks and then it writes this to a database for historical trend monitoring. If you have SMART tests scheduled, I believe Scrutiny should gather these results as part of the smartctl query, meaning it can integrate this data into its metrics and trends, but tests aren’t required.
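
To see roughly what the collector sees, you can run the default arguments quoted above by hand. This is only a minimal sketch: it assumes smartctl is on the PATH and that /dev/sda is one of your disks. The function name and the SMARTCTL override are mine, not part of Scrutiny, and the real collector may well add further arguments (a device type, for instance).

```bash
#!/bin/bash
# Binary to call; set SMARTCTL=echo for a dry run that only prints the arguments.
SMARTCTL="${SMARTCTL:-smartctl}"

# Hypothetical helper: query one disk with the collector's default arguments.
# No self-test is started here; it only reads what the drive already reports.
query_like_collector() {
    local dev="$1"
    "$SMARTCTL" --info --json "$dev"   # identity: model, serial, firmware
    "$SMARTCTL" --xall --json "$dev"   # attributes, logs, self-test history
}

# Usage (as root): query_like_collector /dev/sda
```

Neither call starts a test, which fits the reading that Scrutiny is a passive collector.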

I am confused now. Scheduling SMART tests is what you do with cronjobs in Goldeye, right? So, you need cronjobs to get the SMART test results and Scrutiny processes these (I fail to understand the code mentioned).

I have to take back my previous message, because it seems to work exactly as you describe: depending on how often the Scrutiny collector runs, that data is retrieved and collected.
But I’m not sure why this is not true:

Scheduling SMART self-tests (via cron or smartd) is part of an active failure-prevention strategy. Scrutiny doesn’t perform these tests → it just collects data passively. Does that count as prevention? Maybe I’m missing something obvious :smiley:

Scrutiny doesn’t process or show test results. Scrutiny collects the SMART Attributes reported by the drives. You don’t need to schedule smart tests to have information about “Power on hours” or “Unreadable sectors”.

I have not seen any evidence that scheduling regular SMART tests does anything useful. Reading all (used) sectors is already done via a ZFS Scrub. What the tests do is also vendor specific and happening in a black box, which makes it difficult to gauge their usefulness. All research I’ve seen focuses on the reported SMART Attributes and not on scheduling any tests.
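
For anyone who wants to check this, attributes can be read at any time without running a test. A minimal sketch, assuming an ATA/SATA drive whose `smartctl -A` output uses the usual ten-column attribute table (SAS drives report differently, and attribute names vary by vendor). The helper name is made up for illustration.

```bash
#!/bin/bash
# Hypothetical helper: print the raw value of one ATA SMART attribute.
# Reads `smartctl -A` text on stdin; column 2 is the attribute name and
# column 10 is the first token of the raw value.
smart_attr_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}

# Usage (as root): smartctl -A /dev/sda | smart_attr_raw Power_On_Hours
```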

The following is my understanding of how automatic, self initiated, and requested SMART testing works; any corrections welcome.

1: Most hard drives do automatic, self initiated, SHORT SMART testing on a schedule determined by their firmware

If an externally requested LONG TEST is running, then the self-initiated firmware SHORT TEST does not run when the firmware timer hits ZERO
and is re-scheduled (60 minutes seems to be the default) but will not run until the LONG TEST has completed

2: Hard drives can be externally requested to do LONG SMART tests, which can take several hours for large SAS drives

3: View SMART test data using e.g. sudo smartctl -x /dev/sda
sda in my system is a 6Gb SAS drive
number of minutes until next internal SMART test = 29 suggests that this drive runs SHORT SMART tests every hour, which probably takes just seconds
Self-test execution status: 90% of test remaining, this number reduces very slowly suggesting one of my CRON jobs has started a LONG TEST

4: The fact that drives do automatic, self initiated, SHORT SMART tests suggests that CRON scheduled SHORT tests are pointless
but it’s possible that requested SHORT tests may do more testing than the self initiated SHORT testing
but even if this is not true, as the SHORT tests are very quick, it may be prudent to CRON schedule additional SHORT tests

5: CRON scheduling SHORT or LONG tests is most elegantly done using a shell script e.g. smart_test.sh
A useful tip I discovered after hours of trying to create shell scripts that should run from CRON is:
This command will run from a shell prompt: root smartctl --test=long /dev/sda
but if included in a shell script that runs from a CRON job it will fail, with a CRON job error message
the reason is that the output from the test must be re-directed somewhere e.g. root smartctl --test=long /dev/sda >> /dev/null

6: For newbies, like me, reading this - shell scripts can be created from a shell prompt using nano smart_test.sh using Ctrl O + Enter to save and Ctrl X to exit the nano text editor

7: The SCRUTINY app, which can be installed on TrueNAS, just collects SMART data from the drives, however derived (i.e. self-initiated or requested), and maintains a database using InfluxDB
Scrutiny is a simple but focused application, with a couple of core features, which do not suggest that the app initiates any SMART testing:

						Web UI Dashboard - focused on Critical metrics
						smartd integration (no re-inventing the wheel)
						Auto-detection of all connected hard-drives
						S.M.A.R.T metric tracking for historical trends
						Customized thresholds using real world failure rates
						Temperature tracking
						Provided as an all-in-one Docker image (but can be installed manually)
						Configurable Alerting/Notifications via Webhooks
						(Future) Hard Drive performance testing & tracking

8: After much work with ChatGPT, the following script auto-detects all TrueNAS drives, runs SMART tests (configurable as SHORT or LONG), waits until testing of one drive
has finished before starting the next, logs the test results to separate log files, and runs correctly from a CRON job:

#!/bin/bash
# ======================================================================
# SMART self-test automation for ATA and SCSI drives (TrueNAS compatible)
# Logs saved per drive; only latest log for each drive is kept
# ======================================================================

# User settings
LOGDIR="/mnt/tank/smartlogs"
TEST_TYPE="short"        # Change to "long" if desired
POLL_INTERVAL=15         # Seconds between status checks

# Create log directory
mkdir -p "$LOGDIR"

# Find all ATA and SCSI drives dynamically (skip partitions)
DRIVES=$(find /dev/disk/by-id/ -maxdepth 1 -type l \( -name "ata-*" -o -name "scsi-*" \) ! -name "*-part*" | sort)

# Delete previous logs for these drives
for DRIVE_PATH in $DRIVES; do
    DRIVE_NAME=$(basename "$DRIVE_PATH")
    # Keep the glob outside the quotes, otherwise it is never expanded
    rm -f "$LOGDIR/${DRIVE_NAME}"_*.log
done

# Loop through drives and test each
for DRIVE_PATH in $DRIVES; do
    DRIVE_NAME=$(basename "$DRIVE_PATH")
    LOGFILE="$LOGDIR/${DRIVE_NAME}_${TEST_TYPE}.log"

    {
        echo "======================================================================"
        echo "SMART ${TEST_TYPE} test started for ${DRIVE_PATH} at $(date '+%F %T')"
        echo "----------------------------------------------------------------------"
    } | tee -a "$LOGFILE"

    # Start the test
    smartctl -t "$TEST_TYPE" "$DRIVE_PATH" >>"$LOGFILE" 2>&1

    # Poll until test completes
    while true; do
        STATUS_LINE=$(smartctl -a "$DRIVE_PATH" | grep "Self-test execution status")
        STATUS_CODE=$(echo "$STATUS_LINE" | grep -oP '\(\s*\K[0-9]+')

        if [[ -z "$STATUS_CODE" ]]; then
            echo "  Unable to determine test status for $DRIVE_PATH" | tee -a "$LOGFILE"
            break
        fi

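        # 0 = finished without error; 240-255 = test still running.
        # Any other code means the test was aborted or failed, so stop polling.
        if [[ "$STATUS_CODE" -lt 240 ]]; then
            break
        fi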

        echo "  Test still in progress for $DRIVE_PATH... (status code $STATUS_CODE)" | tee -a "$LOGFILE"
        sleep "$POLL_INTERVAL"
    done

    # Append summary
    {
        echo
        echo "SMART summary for ${DRIVE_PATH}:"
        smartctl -a "$DRIVE_PATH" | grep -E "Model Family|Device Model|Serial Number|Temperature|overall-health|Self-test execution status" || true
        echo "----------------------------------------------------------------------"
        echo "SMART ${TEST_TYPE} test completed for ${DRIVE_PATH} at $(date '+%F %T')"
        echo "======================================================================"
        echo
    } | tee -a "$LOGFILE"
done

When all the tests have completed the logs can be listed and viewed:

root@truenas[~]# ls -l /mnt/tank/smartlogs/                                              
total 27
-rw-r--r-- 1 root root 6572 Nov  7 12:18 ata-ADATA_SU650_2I3820068112_short.log
-rw-r--r-- 1 root root 4855 Nov  7 12:20 ata-KINGSTON_SA400S37120G_50026B77740468FF_short.log
-rw-r--r-- 1 root root 1499 Nov  7 12:20 ata-TSSTcorp_CDDVDW_SH-224BB_R8WS68DCB01TCD_short.log
-rw-r--r-- 1 root root 1785 Nov  7 12:20 scsi-35000c500850b1adb_short.log
-rw-r--r-- 1 root root 1713 Nov  7 12:20 scsi-35000c500997d47ab_short.log
-rw-r--r-- 1 root root 1713 Nov  7 12:20 scsi-35000c5009981a91f_short.log
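
A note on the status code the script polls: as far as I can tell from smartctl's output on ATA drives, the number in parentheses is a status byte where 0 means finished without error, 240-255 means a test is running (the last digit times ten is the percentage remaining), and anything else means the test was aborted or failed. A minimal sketch of that reading; the helper name is hypothetical, and SAS drives do not print this line at all.

```bash
#!/bin/bash
# Hypothetical helper: classify the number in parentheses on smartctl's
# "Self-test execution status" line (ATA drives only).
selftest_state() {
    local code="$1"
    if [ "$code" -eq 0 ]; then
        echo "ok"
    elif [ "$code" -ge 240 ]; then
        # Low four bits hold the remaining work in steps of 10%.
        echo "running, $(( (code - 240) * 10 ))% remaining"
    else
        echo "stopped or failed"
    fi
}

# Usage: selftest_state 249
```

With this reading, status 249 corresponds to the "90% of test remaining" message quoted earlier in the thread.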

Don’t redirect job output to /etc/crontab, that is an important file. Redirect it to /dev/null instead (i.e. command > /dev/null). On TrueNAS, you don’t have to interact with crontab directly. Instead, go to System > Advanced in the Web-UI and create cron jobs there.
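
If you would rather keep the output than throw it away, a small wrapper can append it to a log file instead. A minimal sketch; the function name and both paths in the usage line are made up for illustration.

```bash
#!/bin/bash
# Hypothetical wrapper: run a command and append its stdout and stderr
# to a log file, so the cron job itself prints nothing.
run_logged() {
    local log="$1"
    shift
    "$@" >>"$log" 2>&1
}

# Usage: run_logged /mnt/tank/smartlogs/cron.log /root/smart_test.sh
```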

Yes thanks, e.g. ChatGPT used >>"$LOGFILE" 2>&1

I have modified my post.

A long SMART test should verify each and every sector of the drive, potentially detecting a defective sector even before it gets used.

A ZFS scrub that returns drive errors is telling you that the drives should have been replaced earlier. Good to know, but not good enough.
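
To check whether a long test actually found something, the drive's self-test log is the place to look. A minimal sketch, assuming an ATA/SATA drive and the usual `smartctl -l selftest` table, where failed runs are reported with a status such as "Completed: read failure". The helper name is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: print self-test log entries that ended in a failure.
# Reads `smartctl -l selftest` text (ATA format) on stdin.
failed_selftests() {
    # "|| true" keeps the exit status at 0 when nothing matches,
    # which is the healthy case.
    grep -E '^# *[0-9]+ .*(Completed: |Fatal or unknown error)' || true
}

# Usage (as root): smartctl -l selftest /dev/sda | failed_selftests
```

An empty result means no logged test ended in a failure. It does not by itself prove that a long test has ever been run.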