To be honest, without seeing Wolfgang’s video (titled “The Perfect Home Server 2024 – 56TB, ECC, IPMI, Quiet & (kind of) Compact”), I probably would never have noticed that I have one SMR drive. I have also never noticed anything particularly wrong with my NAS. It develops some symptoms after being up for 4-6 months, but a reboot usually fixes that.
My write and read speeds hover around 80-113 MB/s depending on the file(s). My home network is 1 Gbps.
TrueNAS SCALE version: ElectricEel-24.10.0
My question is, should I be worried about having one SMR drive? Should I switch out to a non-SMR drive? Am I losing some features or performance with an SMR drive?
Features, no. Performance, possibly, on sustained writes and as your pool fills. The real issue is data safety: SMR makes for a very long resilver, so if you ever have a hardware issue you’ll be at risk for an extended time.
The symptoms that a reboot fixes probably have nothing to do with having an SMR drive.
Writes on an SMR drive happen in two stages: first the writes land in a small CMR cache area of the disk (typically around 30 GB, though the size is not in the disk specs), and then, when the drive is idle, they are destaged to the main shingled area of the drive, a much slower process. In other words, these drives rely on relatively low usage to do this destaging and reclaim the space in the CMR cache.
On sustained bulk writes, the CMR cache fills up completely and the drive slows to a crawl. In normal operation you would only see this if that drive was doing sustained writes of more than ~30 GB as its share of the whole pool, so on a 4-wide RAIDZ1 you can probably write up to about 90 GB of user data before this happens.
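To make that arithmetic concrete, here is a trivial sketch. The ~30 GB cache figure is an assumption (the real size varies by model and is undocumented), and a 4-wide RAIDZ1 spreads each stripe across 3 data disks plus 1 parity disk:

```shell
# Rough estimate of how much user data can be written before one SMR
# drive's CMR cache fills and throughput collapses. Figures are assumptions.
CACHE_GB=30    # assumed per-drive CMR cache size
DATA_DISKS=3   # 4-wide RAIDZ1: each stripe holds ~3 disks' worth of user data
echo "~$((CACHE_GB * DATA_DISKS)) GB of user data before the cache fills"
```

In practice the real threshold also depends on how much destaging the drive manages to do between bursts, so treat this as an order-of-magnitude figure.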
So, under normal low-load circumstances, you would never notice the difference. Read throughput and IOPS are about the same, and for short bursts of writes, throughput and IOPS are also about the same.
However, when expanding a RAIDZ vdev or resilvering a disk, ZFS does extended bulk writes, and what would take hours or days on a CMR drive ends up taking days or weeks on an SMR drive.
The worst time to find out about these performance issues is when you have a drive failure. My advice would be to swap the Barracuda drive out for another matching WD Red Plus drive as soon as you reasonably can.
Make a note of the sda/sdb/sdc/sdd naming of the disks and their serial numbers (the sdX names can change after a reboot), and identify both the new drive and the SMR drive. Double-check that you have identified them correctly. Write down the sdX name of each. Double-check that you wrote it down correctly.
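One way to get that name-to-serial mapping in one shot, assuming a Linux shell on the TrueNAS box (exact column availability can vary with the util-linux version):

```shell
# List every whole disk with its model and serial number, so the sdX name
# in the UI can be matched to the label printed on the physical drive.
lsblk -dno NAME,MODEL,SERIAL

# Cross-check against the stable by-id symlinks, which embed the serial
# in the link name (filtered to whole disks, not partitions).
ls -l /dev/disk/by-id/ 2>/dev/null | grep -v part || true
```

The by-id names are stable across reboots, which makes them a good thing to write down alongside the sdX names.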
Go back to the ‘Storage’ tab
Find your pool & click ‘Manage Devices’
Expand the vdev (It’ll say ‘RAIDZ1’ in your case)
Find the old drive in the list (referring back to your notes from earlier) and click on it
There will be a button labelled ‘Replace’; click it
Find the replacement disk in the window that pops up and select it
Click ‘Replace Disk’ (there shouldn’t be a need to check ‘Force’)
Wait for resilver to complete
Shut down the system
Remove old disk
*Optional: power the system back on
…It is a lot of steps when you really write them out like that, but it should take about five minutes if you’re taking it slow (not accounting for resilver time)
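If you want to keep an eye on the resilver from a shell rather than the UI, something like this works (the pool name `tank` is a placeholder; substitute your actual pool name):

```shell
# Show resilver progress for a pool. Guarded so it degrades gracefully
# on machines where the zfs tools aren't installed.
if command -v zpool >/dev/null 2>&1; then
    zpool status -v tank || true   # '|| true': don't abort if the pool name is wrong
    status_msg="checked"
else
    status_msg="zpool not found - run this on the TrueNAS host"
    echo "$status_msg"
fi
```

`zpool status` shows a percentage and an estimated completion time once the resilver is underway, which is handy for judging how long an SMR resilver is really taking.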
Arguably, Step 0 would be to burn in the replacement drive and make sure it doesn’t suck: a SMART long test, a pass of badblocks, and another SMART long test. There are guides on the forum better than what I’d quickly list off here.
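A minimal sketch of that burn-in sequence, with a deliberate guard so it refuses to run against the placeholder device (`smartctl` is from smartmontools; `badblocks -w` is destructive and must only ever be pointed at the new, empty drive):

```shell
DISK=/dev/sdX   # placeholder - set this to the NEW drive, and triple-check it

if [ "$DISK" = "/dev/sdX" ]; then
    # Safety guard: do nothing until a real device has been filled in.
    echo "Refusing to run: set DISK to the real device first."
else
    smartctl -t long "$DISK"          # 1st SMART extended self-test
    # ...wait for it to finish, then review results with: smartctl -a "$DISK"
    badblocks -b 4096 -wsv "$DISK"    # destructive four-pattern write/verify pass
    smartctl -t long "$DISK"          # 2nd SMART extended self-test
fi
```

A SMART long test can take many hours and a full badblocks write pass can take days on a large drive, so plan for the new disk to be out of service for a while before it goes into the pool.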
The most important thing is to directly tie the disk’s model and serial number to both the device name /dev/sdX and the partition UUID displayed in zpool status and the TrueNAS UI.
I hope this helps ensure that you replace the correct drive.