Last month I made the jump from Electric Eel to Goldeye following the upgrade guide in the documentation. It went buttery smooth. About a week later, I received a ‘failed a SMART selftest’ alert for one of my mirrored boot drives. Before I had even noticed the alert, it had cleared itself. As soon as I got home, I pulled the smartctl log and it had no record of any failures… odd, I thought. And went about my day. ~36 hours later, the same alert to the other mirrored boot drive… Both times, the alert cleared itself exactly 90 minutes later. This behavior has continued per the table below. What’s going on here?
Run this command smartctl -x /dev/nvme0 > nvme0_smartctl.txt
Run this command nvme self-test-log --output-format=json /dev/nvme0 > nvme0_stl.txt
Run this command echo "==================================" >> nvme0_stl.txt
Run this command nvme error-log /dev/nvme0 >> nvme_stl.txt
This will generate two files: nvme0_smartctl.txt which is what smartctl reports. And nvme0_stl.txt which is what the nvme command reports, both the self-test log and the error log.
Post both files here or feel free to send them to my email joeschmuck2023@hotmail.com.
These should shed some light on the issue, or possibly non-issue.
@joeschmuck I went through your Drive Troubleshooting guide again and still did not find any cause for concern. If you agree, I’ll raise this as a bug report for TrueNAS SCALE 25.10.