I had started receiving SMART warnings on one of my disks, so today I figured I would take care of that. I also figured I would do an upgrade first, in case any odd issues were present. I went from 13 to the latest 13… I am on 13.0-U6.8 now. On restart, my BIOS halted at an F1 prompt due to a bad disk. I hit F1 and got past that. TrueNAS shows all the disks online, but the errors for ATA2 still persisted. I have a spare disk to replace any failed ones.
So, like other times, I popped in to the disks to take the bad one offline. I hit offline… and it spins forever, well, at least an hour so far.
The log shows nothing about this, it seems, except for errors from that device:
Jul 26 13:02:55 freenas smartd[1553]: Device: /dev/ada2, FAILED SMART self-check. BACK UP DATA NOW!
Jul 26 13:02:55 freenas smartd[1553]: Device: /dev/ada2, Failed SMART usage Attribute: 5 Reallocated_Sector_Ct.
If this disk is so bad that I can’t even take it offline, can I power down, replace it, and bring the whole volume back up?
I am not sure at all what to do here…
As an aside, in case it makes a difference, after the upgrade I could not start my virtual machines either.
[EFAULT] VM will not start as DISK Device: /dev/zvol/Main_Volume/VMStorage/AlpineDock-g6xlgd device(s) are not available.
Any advice, or at least an “it’s going to be alright” would be great!
I think that without knowing a little more about your system (drive configuration and such), it’s hard to weigh in on what is likely happening or how to diagnose it.
I suggest that you add a detailed description of your configuration if you want people to dig in and help: disks, memory. What is your RAID config? How much redundancy?
Well, that is part of the issue… once I told it to take the disk offline, I lost access to any pool information, disk information, etc… nothing.
From memory though, I have 64GB of memory in the server. I have 6 disks, and I recall from designing it long ago I could lose two of them and still run… they are all SATA drives. I am sure they are striped, not mirrored.
If you are sure you had a RAID-Z2 configuration, you should be able to power down and physically replace the disk. Did you note the drive serial number of the bad disk? You need to remove the failing or failed disk and only that one.
Thanks. I got antsy and sent a restart. After a long time coming back up, I have access to the setup screens again. It is definitely RAID-Z2 and I do have the serial number already.
I want to get another backup of some of my data before trying this. I did find that my main volume was not mounted to /mnt for some reason, and even after I got it mounted, it is not being found by either the SMB shares or the virtual machine disk entries. Sigh.
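For anyone following along, here is a rough sketch of how the "only that one" check could be scripted before pulling a drive. It assumes `smartctl` is available (it ships with TrueNAS CORE) and relies on the `Serial Number:` line that `smartctl -i` prints. The function names `serial_from_smartctl` and `confirm_serial` are made up for illustration and are not part of any tool.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of TrueNAS or smartmontools.

# Read `smartctl -i` output on stdin and print just the serial number.
serial_from_smartctl() {
    awk -F: '/^Serial Number:/ { gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2 }'
}

# Succeed only if the device reports the serial number you wrote down.
# Usage: confirm_serial /dev/ada2 EXPECTED_SERIAL
confirm_serial() {
    actual=$(smartctl -i "$1" | serial_from_smartctl)
    [ -n "$actual" ] && [ "$actual" = "$2" ]
}
```

Something like `confirm_serial /dev/ada2 YOUR_SERIAL && echo "match"` would then confirm the device node and the label on the physical drive agree before shutting down.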
It took a long time, but they did appear after I manually mounted them.
That’s likely because the pool is not mounted to /mnt.
Would you care to share the outputs, formatted using the </> button, of the following?
zpool status
camcontrol devlist
smartctl -x /dev/ada# (for all the relevant numbers #)
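If it helps, the three outputs requested above could be gathered in one go with something like the sketch below. It assumes a FreeBSD-based TrueNAS CORE shell, where `sysctl -n kern.disks` lists the attached disks. The function names and the default output path `/tmp/diag.txt` are my own placeholders.

```shell
#!/bin/sh
# Sketch for FreeBSD-based TrueNAS CORE; function names are hypothetical.

# Keep only adaN names from a space-separated disk list on stdin.
list_ada_disks() {
    tr -s ' ' '\n' | grep -E '^ada[0-9]+$'
}

# Write the requested outputs to one file (default: /tmp/diag.txt).
collect_diagnostics() {
    out=${1:-/tmp/diag.txt}
    {
        echo "=== zpool status ==="
        zpool status
        echo "=== camcontrol devlist ==="
        camcontrol devlist
        # kern.disks is the FreeBSD sysctl that lists attached disks.
        for d in $(sysctl -n kern.disks | list_ada_disks); do
            echo "=== smartctl -x /dev/$d ==="
            smartctl -x "/dev/$d"
        done
    } > "$out" 2>&1
}
```

Running `collect_diagnostics` as root and pasting the resulting file with the </> button should cover everything asked for.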
Also indicate what’s your motherboard (how many SATA ports?), whether you’re using a HBA (or a “SATA card”), ECC RAM, etc.
I’ve replaced the bad disk and it is currently resilvering. I had a problem with the pool not mounting in /mnt, but I had gotten around that. The VMs were looking in /dev/zvol, which did not have the device in there for the pool.
But, now that the new drive is in, the array went from ONLINE to DEGRADED and is incorporating the new disk. It also placed the relevant volumes in the /dev/zvol directory.
I THINK I AM GOOD!
Thank you!
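A small sketch for keeping an eye on a resilver like the one described above, rather than refreshing the UI. It only looks for the `state:` line and the `resilver in progress` text that `zpool status` prints. The function names and the 300-second default interval are assumptions of mine.

```shell
#!/bin/sh
# Sketch only; helper names are hypothetical.

# Print the value of each "state:" line from `zpool status` text on stdin.
pool_state() {
    awk '$1 == "state:" { print $2 }'
}

# Succeed if the text on stdin reports a resilver in progress.
resilver_running() {
    grep -q 'resilver in progress'
}

# Poll one pool until the resilver message disappears, then print its state.
# Usage: wait_for_resilver Main_Volume [seconds_between_checks]
wait_for_resilver() {
    while zpool status "$1" | resilver_running; do
        sleep "${2:-300}"
    done
    zpool status "$1" | pool_state
}
```

Once `wait_for_resilver Main_Volume` returns, the printed state should read ONLINE if the replacement went cleanly. Anything else is worth posting the full `zpool status` for.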
ZPOOL STATUS
errors: No known data errors
root@freenas:~ # zpool status
  pool: Main_Volume
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Sat Jul 26 15:31:07 2025
        1.79T scanned at 6.63G/s, 261M issued at 966K/s, 11.9T total
        0B resilvered, 0.00% done, no estimated completion time
config:

        NAME                                              STATE     READ WRITE CKSUM
        Main_Volume                                       DEGRADED     0     0     0
          raidz2-0                                        DEGRADED     0     0     0
            gptid/b7650a52-483a-11e8-b678-ac1f6b83258e    ONLINE       0     0     0
            gptid/b8121849-483a-11e8-b678-ac1f6b83258e    ONLINE       0     0     0
            replacing-2                                   DEGRADED     0     0     0
              11565328451382339695                        UNAVAIL      0     0     0  was /dev/gptid/b8c51a1d-483a-11e8-b678-ac1f6b83258e
              gptid/de680275-6a56-11f0-9df7-ac1f6b83258e  ONLINE       0     0     0
            gptid/b9749884-483a-11e8-b678-ac1f6b83258e    ONLINE       0     0     0
            gptid/fbeb27ac-5189-11ec-aeba-ac1f6b83258e    ONLINE       0     0     0
            gptid/bad9a8ad-483a-11e8-b678-ac1f6b83258e    ONLINE       0     0     0

errors: No known data errors
  pool: freenas-boot
 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0B in 00:16:42 with 0 errors on Fri Jul 25 04:01:42 2025
config:

        NAME      STATE
          da1p2   ONLINE       0     0     0

errors: No known data errors
root@freenas:~ #
CAM CONTROL DEVLIST
root@freenas:~ # camcontrol devlist
<WDC WD60EFRX-68L0BN1 82.00A82> at scbus0 target 0 lun 0 (ada0,pass0)
<WDC WD60EFRX-68L0BN1 82.00A82> at scbus1 target 0 lun 0 (ada1,pass1)
<WDC WD60EFAX-68JH4N1 83.00A83> at scbus2 target 0 lun 0 (ada2,pass2)
<WDC WD60EFRX-68L0BN1 82.