[EZFS_BADDEV] cannot replace: device is too small

One of my drives died, so I got a replacement: the exact same drive model. But when I try to replace it, ZFS says the new one is smaller.

[EZFS_BADDEV] cannot replace 8942596345900405576 with /dev/disk/by-partuuid/150dbce1-f5df-4f29-be09-e9e55b2ee1ac: device is too small

I checked, and their sizes are identical:

[screenshot: lsblk output showing the drives and their partition sizes]

sdd is the replacement drive. I tried using the force option, with no luck.


After wiping the drive with sudo wipefs -a /dev/sdd, lsblk shows this:

NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
sdd 8:48 0 4000787030016 0 disk

but after replacing in TrueNAS it creates this partition

sdd 8:48 0 4000787030016 0 disk
└─sdd1 8:49 0 1998252409344 0 part

That partition (1998252409344 bytes) is a lot smaller than the other pool partitions, which are 3998639460352 bytes.
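One way to see why the partition topped out near 2 TB is to check which partition-table label the disk ended up with. A diagnostic sketch; /dev/sdd is the replacement drive from this thread, and the PTTYPE column needs a reasonably recent util-linux:

```shell
# PTTYPE shows the partition-table label: "dos" means MBR, "gpt" means GPT.
lsblk -b -o NAME,SIZE,PTTYPE /dev/sdd

# parted reports the same thing as "Partition Table: msdos" or "gpt".
sudo parted /dev/sdd print
```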

I assume sdd is in the same slot as the failed drive?

How are you trying to replace it? Through the UI, or from the command line? If the latter, what command are you using?

Also, please post the output of zpool status -v.

Not sure. It might be in a different SATA port.
I’m trying to replace this drive, which is already removed


I’m doing it through the UI

zpool status -v

root@truenas[~]# zpool status -v
pool: MainBackup
state: ONLINE
scan: scrub repaired 0B in 05:27:34 with 0 errors on Sun Aug 18 05:27:46 2024
config:

    NAME        STATE     READ WRITE CKSUM
    MainBackup  ONLINE       0     0     0
      sdc2      ONLINE       0     0     0

errors: No known data errors

pool: MainPool
state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
scan: scrub repaired 1M in 10:02:21 with 0 errors on Wed Aug 21 11:30:17 2024
config:

    NAME                     STATE     READ WRITE CKSUM
    MainPool                 DEGRADED     0     0     0
      raidz1-0               DEGRADED     0     0     0
        8942596345900405576  OFFLINE      0     0     0  was /dev/sdc2
        sda2                 ONLINE       0     0     0
        sde2                 ONLINE       0     0     0

errors: No known data errors

pool: SSD
state: ONLINE
scan: scrub repaired 0B in 00:00:51 with 0 errors on Sun Jul 28 00:00:51 2024
config:

    NAME        STATE     READ WRITE CKSUM
    SSD         ONLINE       0     0     0
      sdb2      ONLINE       0     0     0

errors: No known data errors

pool: boot-pool
state: ONLINE
status: One or more features are enabled on the pool despite not being
        requested by the 'compatibility' property.
action: Consider setting 'compatibility' to an appropriate value, or
        adding needed features to the relevant file in
        /etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d.
scan: scrub repaired 0B in 00:00:11 with 0 errors on Wed Aug 28 03:45:13 2024
config:

    NAME         STATE     READ WRITE CKSUM
    boot-pool    ONLINE       0     0     0
      nvme0n1p2  ONLINE       0     0     0

errors: No known data errors

I think I got it working. I'm not sure exactly what did it, but I suspect TrueNAS was trying to create a partition on a drive that still had an MBR label, so the partition capped out at 2 TB. After running sudo fdisk /dev/sdd and creating a GPT label and partition (g, then n), I was able to make a 4 TB partition (sdd1). However, replacing through the UI did the same thing again, so I dropped to the CLI, which did the trick this time:

sudo zpool replace MainPool 8942596345900405576 /dev/sdd1

Now it’s resilvering.
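For what it's worth, the byte sizes posted above line up with the MBR theory: an MBR partition entry stores the size as a 32-bit count of 512-byte sectors, so a single partition tops out just under 2 TiB. A quick sanity check in bash:

```shell
# MBR partition entries hold a 32-bit sector count of 512-byte sectors,
# so the largest possible MBR partition is:
echo $((2**32 * 512))   # → 2199023255552 bytes, i.e. 2 TiB

# The partition TrueNAS created (1998252409344 B) fits under that cap...
[ 1998252409344 -lt $((2**32 * 512)) ] && echo "fits under MBR cap"

# ...while the partition size the pool needs (3998639460352 B) does not,
# which is why a GPT label was required before the replace could work.
[ 3998639460352 -gt $((2**32 * 512)) ] && echo "too big for MBR"
```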

I’ll echo your avatar with :thinking: and state that you shouldn’t need to have dropped to the CLI and manually created a GPT partition to fix it.

Can you use the “Send Feedback” smiley-face in the top right of the webUI to submit a bug for this, and leave the box checked to attach a “debug”? It should be able to capture the previous attempts by middleware and the webUI to do the disk wipe.

Yes, the UI should have been able to do this. I am not sure what would have happened without the 4 TB partition, i.e. with an empty GPT disk/partition table, but with an existing partition the UI should have warned you about existing data on the disk and asked whether "you wanted to continue, losing any existing data".

I had a power loss. Will I still be able to do this?

It should still be in the long-term middleware debug logs. If you have an approximate date and time of when you attempted to perform the expand/replace/wipe actions through the webUI it will let the team here search for them quickly.

@Askejm: created an account just to say thanks! I was stuck on this, but your solution (through the CLI) worked for me as well… it's busy resilvering now :slight_smile:


Just throwing this out there as a potential lead.

Myself, and others in the past, have experienced a strange issue where the partitions are ‘out of order’ and/or based on the size of a smaller disk. I will try to explain what happened based on a foggy memory.

In the past I upgraded from smaller disks to larger disks doing the 'ol swaparoo (replace, resilver, one at a time). Most of them went fine. One of them gave me the same ‘too small’ error. When I reviewed the partitions, I noticed all the other disks had swap as the first partition, and data as the second partition – except the disk that was giving an error. In that case, the first partition was the data partition that was sized ~(size of old, smaller disk), and the second partition was the swap, leaving a lot of space unallocated.

From the screenshot, it looks like sdd had the first partition as the ‘large’ one – very similar to the size of partition sdc2. Perhaps it’s just formatted as MBR, but I always assumed the ‘total drive size’ is used and the drive is reformatted/partitioned anyway.
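If anyone wants to check for the layout described above, comparing partitions across disks makes it obvious. A sketch; the device name is just an example:

```shell
# Byte-accurate sizes for every disk and partition; on an affected disk
# the data partition appears first and is sized like the old, smaller drive.
lsblk -b -o NAME,SIZE,TYPE

# Detailed layout of one suspect disk, in bytes:
sudo parted /dev/sdd unit B print
```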


Well, I'm not sure what this would've been, but after submitting the ticket, iXsystems said this:

Unsupported development tools capable of modifications to the base operating system have been enabled in this configuration. We are closing this ticket as it is outside the scope for effective issue investigation.

To focus investigation on finding and fixing issues with the default TrueNAS OS, please open a new ticket if you are able to reproduce the behavior on a TrueNAS install that has not installed OS development tools.