How to replace a 2TB SSD with a slightly smaller one?

Using RAIDZ1.
I have 3 Silicon Power 2TB drives that report their size as 1.86 TiB.
One of them keeps giving errors and going offline, so I ordered a Crucial 2TB drive as a replacement, but the Crucial reports as 1.82 TiB, and when I try to do the replace I get an error that the disk size is too small.

What are my options?
Is there a way to shrink the used space on the other disks to 1.80 TiB?
I'm only using ~35% of the pool capacity, so there should be plenty of free space to adjust things.

zpool status

pool: DupPool
state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: Message ID: ZFS-8000-9P — OpenZFS documentation
scan: scrub in progress since Wed Aug 14 08:43:46 2024
1.49T / 1.49T scanned, 121G / 1.49T issued at 295M/s
0B repaired, 7.91% done, 01:21:32 to go
config:

    NAME                                      STATE     READ WRITE CKSUM
    DupPool                                   ONLINE       0     0     0
      raidz1-0                                ONLINE       0     0     0
        c423d6c1-30ab-4f86-95ca-325eb9354cec  ONLINE       0     0     2
        1c62c2f8-20fd-4416-807f-03ef98e8328c  ONLINE       0     0     0
        911e8a3b-930c-444a-b80d-11eedbc21f0f  ONLINE       0     0     0

errors: No known data errors

pool: boot-pool
state: ONLINE
scan: scrub repaired 0B in 00:01:26 with 0 errors on Sat Aug 10 03:46:27 2024
config:

    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      sdb3      ONLINE       0     0     0

errors: No known data errors

It's unfortunately not possible to shrink a raidz vdev; you would need a pool made up of only mirror vdevs and then take advantage of vdev removal (which is a bit of a hack, tbh).

You will need to create a new, smaller pool, send+receive the data across, destroy the old pool, and then rename the new pool to fix any mounts that referenced the old one.
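A rough sketch of that send/receive dance, assuming a temporary pool called temppool and placeholder disk names (none of these identifiers come from your system, so adjust everything):

```shell
# Sketch only: temppool and sdx/sdy/sdz are hypothetical placeholders.
zfs snapshot -r DupPool@migrate                      # recursive snapshot of every dataset
zfs send -R DupPool@migrate | zfs recv -F temppool   # replicate the whole pool
zpool destroy DupPool                                # point of no return!
zpool create DupPool raidz1 sdx sdy sdz              # recreate, smaller disk included
zfs snapshot -r temppool@return
zfs send -R temppool@return | zfs recv -F DupPool    # copy everything back
zpool destroy temppool
```

With -R the snapshots, child datasets, zvols, and most properties come along for the ride, which is what keeps apps and VMs intact across the move.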

The temporary pool can be any set of disks in the meantime, even spinning disks if you have some lying around to connect (as long as it is big enough to receive the data).

:frowning: Man that’s not fun.
Is it possible to copy the whole thing to a single large drive and then rebuild the pool with the new drive included and migrate it all back?

Yes - you can do that.

The issue is that whilst you think they are the same size, if you examine the actual number of blocks, the new drive has slightly fewer.

But if you move the data off and destroy the pool, you can definitely recreate it with the 3x 2TB drives and it will use the smallest of the drives as the basis for the pool size.
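To put numbers on "slightly fewer blocks", here is a hedged illustration with made-up but typical byte counts (not read from the actual drives); two drives both sold as "2TB" can differ by gigabytes once you look at the exact capacity in bytes:

```shell
# Hypothetical usable sizes in bytes, chosen to match the TiB figures in this thread.
sp_bytes=2048408248320   # a "2TB" drive that shows up as 1.86 TiB
cr_bytes=2000398934016   # a "2TB" drive that shows up as 1.82 TiB
awk -v b="$sp_bytes" 'BEGIN { printf "%.2f TiB\n", b / 1024^4 }'   # prints 1.86 TiB
awk -v b="$cr_bytes" 'BEGIN { printf "%.2f TiB\n", b / 1024^4 }'   # prints 1.82 TiB
```

On a live system, `blockdev --getsize64 /dev/sdX` gives the real byte count, which is worth comparing before attempting a replace.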


No, I'm very much aware that the Crucial drive is slightly smaller; there is always a decent amount of variance with SSDs depending on over-provisioning, fault tolerance, and so on. I just never thought it would bite me in this particular way.
I don't really have 2TB of reliable storage right now to easily export/copy everything to.

Are there zpool terminal commands to do this whole process of:
Export to standalone disk.
Destroy pool and re-create with new smaller SSD?

You don't need that much; a 1TB drive will be more than enough (your data is around 700GB, right?).
The critical part is not losing data during the process (if another disk fails, you lose the entire pool).
You should obviously replicate the data before destroying the pool.

I have now had the time to look at the zpool status output. The following advice may be too late, but if I had received that output I would have:

  1. Been pretty happy that the scrub had found 2 checksum errors and (because of the redundant disk) safely corrected them.

  2. Let the scrub complete to see what the outcome was. If the data was all recovered OK then…

  3. Review SMART monitoring to see what state the drives are in from that perspective, and to try to work out what the underlying issue might be with the erroring disk.

  4. Run an extended SMART test on the drive which was showing errors.

If the extended SMART test ran successfully, I might then clear the ZFS errors and leave the existing disk to see what happens next.

If the SMART test failed, I would change the disk out ASAP.
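For reference, the extended test can be kicked off per drive with smartctl (the device name here is a placeholder):

```shell
# /dev/sda is a placeholder; repeat for each member of the pool.
smartctl -t long /dev/sda   # start the extended self-test (runs inside the drive)
smartctl -a /dev/sda        # afterwards: attributes plus the self-test log
```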

P.S. Implement @joeschmuck's Multi-Report script so that you get the diagnostic info and warnings sent to you at the earliest opportunity.


Documented proof that SSDs can have bit rot…

If this issue keeps on occurring, then yes, time to replace.

Not sure if these are SATA or M.2, etc., but I would consider swapping things around (cables, positions, etc.) to verify that the issue is the drive and not something else.

Unfortunately, as discussed, can’t shrink a VDev.

But you can make a degraded raidz1, or even manually create a degraded raidz1 with two partitions from your new drive.

This would get you 1TB of non-redundant storage.

Replicate to that, then replace one of the partitions with a disk from the original raidz, then the other partition, and finally replace the degraded member with the now-available new disk.
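One common way to build that degraded raidz1 is to use a sparse file as the stand-in third member (all names below are hypothetical; sdd1 and sdd2 would be the two ~1TB partitions on the new Crucial drive):

```shell
# Hypothetical names throughout; the sparse file never holds real data.
truncate -s 960G /tmp/fake-member.img                         # sparse stand-in member
zpool create tmppool raidz1 /dev/sdd1 /dev/sdd2 /tmp/fake-member.img
zpool offline tmppool /tmp/fake-member.img                    # vdev is now degraded
rm /tmp/fake-member.img
```

From there, the replace steps above walk the pool onto real disks, along the lines of `zpool replace tmppool /dev/sdd1 <old-disk>`.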

Another potential way to use the new Crucial 2TB drive is to see whether there is a Crucial utility that can reduce the amount of over-provisioning. (Such utilities normally exist to let you increase over-provisioning, so it seems unlikely they would let you reduce it below the standard level, but you never know.)

Thanks for all the help; I'm still pretty much a noob with Linux and especially ZFS, so I really appreciate it!

A few updates to try and cover it all.
Actual disk space in use is just over 1TB, 1.26 TB or something like that; I'm guessing the extra is from the VM and a few Docker-based apps that also take space in addition to the regular SMB share.
I have access to some 1TB SSDs at my office I could borrow for a few days, so my plan was to make a stripe that should give me 2TB to replicate to.
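If the borrowed SSDs work out, the temporary stripe is a one-liner (device names assumed); just remember a stripe has zero redundancy, so it's a staging copy, not a backup:

```shell
# Hypothetical device names for the two borrowed 1TB SSDs.
zpool create temppool /dev/sde /dev/sdf
zpool list temppool   # sanity-check the ~2TB of usable space
```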

Running the full SMART test now; I will know more later, but all drives reported no SMART errors from the previous short tests.
What I saw happening was lots of write errors, and then the disk was marked as offline. It also seems to not be just one disk; I saw similar behaviour on another disk, but simply unplugging and reconnecting it, marking it as online, and scrubbing gave me no issues.

I'll do some reading on how the migration steps work; it would really suck to have to rebuild everything from scratch, as I have several Docker apps running and a VM for a game server. Ideally, just moving the whole pool around to other disks should allow all these things to keep working without starting over?

@Protopia the idea of messing with the over-provisioning size is brilliant; I totally forgot that some drives let you change that! If none of my other ideas work, this might be the fastest fix of them all!

What has been most frustrating in this whole process is that the serial numbers on the outside of the drives do not match the serial numbers reported in software. Or rather, the giant barcode and number on the outside of the disks is not the serial. I had to power down the NAS, pull the drives one at a time, look at the numbers with CrystalDiskInfo on my Windows PC, and then make my own labels to stick onto the disks.
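For anyone hitting the same serial-number confusion: the mapping can usually be read without pulling drives, along these lines (the smartctl device name is a placeholder):

```shell
lsblk -o NAME,MODEL,SERIAL   # serials as the kernel reports them, per block device
smartctl -i /dev/sda         # placeholder device; prints the SMART-reported serial
```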

It needs to be the LONG SMART test. On an HDD, the short test checks just the basic operation of the drive, whilst the long test reads every sector to check that each and every one is readable and that any checksum is valid. On an SSD, the actual test will be different - you need to test the readability of each internal block and that checksums are valid, and maybe test that erased blocks read as zeros.

This is perhaps the most worrying aspect of the thread so far as it suggests to me that you might have cheap Chinese rip-offs rather than genuine drives.

I did the long/extended test on all 3 drives with no errors.

They were super cheap and the brand is kinda… not great. I figured for home use they would be fine, but I am regretting it now.
Do not buy Silicon Power drives; I bought some for work as scratch drives, and they are so slow that Windows will just lock up for 2-3 seconds on iowaits until they start to respond again.
Guessing I fell for the old "you get what you pay for", and now I need to order better ones.
