Ever since FreeNAS 8.0, pools have auto-expanded when you replace all the disks in a vdev with larger disks. There’s been nothing additional the user has needed to do (even though the manual, for a long time, had a section on how to check the autoexpand property on a pool and enable it if it was disabled). But with SCALE, this has stopped. Replacement disks, regardless of size (so long as they were larger than the disk they were replacing), are partitioned to the size of the disk they’re replacing, with the rest of the space going to waste. Many threads have been opened (e.g., TrueNAS Scale - Auto Expand ZFS Pool Issues, Expand mirror?, No Capacity Expansion After Disk Upgrade), many bugs have been reported and closed as resolved, and yet the problem persists in 25.04.1.
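For reference, the check that old manual section described is trivial from the shell; here’s a minimal sketch, with “tank” standing in for the pool name:

```python
# Minimal sketch (pool name "tank" is a placeholder): check the autoexpand
# property and turn it on if it isn't already, which is all the old manual
# section amounted to.
import subprocess

POOL = "tank"

value = subprocess.run(
    ["zpool", "get", "-H", "-o", "value", "autoexpand", POOL],
    capture_output=True, text=True, check=True,
).stdout.strip()

if value != "on":
    subprocess.run(["zpool", "set", "autoexpand=on", POOL], check=True)
    print(f"autoexpand enabled on {POOL}")
else:
    print(f"autoexpand was already on for {POOL}")
```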
Why is this? I can’t believe that with the release of SCALE, you’ve suddenly forgotten how to appropriately partition a replacement disk, so I assume this has to be a deliberate design decision, but why?
I noticed the same thing, and some of it is interrelated with the “disk not large enough” blunder that appeared once the 2 GB swap partition at the beginning of newly added drives was no longer being created.
I thought that as of 24.10.2 and 25.04 the issue of not using the entire drive capacity (minus 2 GB for buffer) had been fixed? I would guess this would have also resolved the autoexpand issue?
Perhaps you’re still seeing this because your pool was created before the release of 24.10.2, when replacement drives were given a partition equal in size to the old drive being replaced?
Let’s not conflate issues here. I don’t see a swap/reserved partition on the replacement disk, but that isn’t my concern. My concern is that an 18 TB disk has a 6 TB data partition on it (possibly 6 TB minus a bit), with 12 TB completely unused. And this has been the case for the last several SCALE releases.
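If anyone wants to confirm the same thing on their own system, something like this (the disk name is a placeholder) makes the gap between the disk size and the data partition obvious:

```python
# Rough sketch (disk name "sda" is a placeholder): compare each partition's size
# against the whole disk so an undersized ZFS data partition stands out.
import json
import subprocess

DISK = "sda"

out = subprocess.run(
    ["lsblk", "--json", "--bytes", "-o", "NAME,SIZE,FSTYPE", f"/dev/{DISK}"],
    capture_output=True, text=True, check=True,
)
tree = json.loads(out.stdout)

for disk in tree["blockdevices"]:
    print(f"{disk['name']}: {int(disk['size']) / 1e12:.2f} TB total")
    for part in disk.get("children", []):
        print(f"  {part['name']}: {int(part['size']) / 1e12:.2f} TB ({part.get('fstype')})")
```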
It’s easy enough to fix, of course; that’s why I wrote Manual disk replacement in TrueNAS SCALE | Dan's Wiki. But the issue has persisted for, I believe, every release of SCALE, despite multiple tickets being reported and closed as resolved, so I’m thinking it has to be deliberate, and I can’t fathom why iX would make such a boneheaded design decision.
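For anyone who doesn’t want to click through, the fix is roughly this shape (a sketch only, not the wiki’s exact steps; the disk, partition number, and pool name are placeholders): grow the data partition to the end of the disk, have the kernel re-read the partition table, then tell ZFS to use the new space.

```python
# Sketch of the general shape of the manual fix (NOT the wiki's exact procedure;
# disk, partition number, and pool name are placeholders).
import subprocess

DISK, PART_NUM, POOL = "/dev/sda", "1", "tank"
PART = f"{DISK}{PART_NUM}"  # e.g. /dev/sda1

def run(*cmd: str) -> None:
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

run("parted", "-s", DISK, "resizepart", PART_NUM, "100%")  # grow the data partition
run("partprobe", DISK)                                     # re-read the partition table
run("zpool", "online", "-e", POOL, PART)                   # let ZFS claim the new space
```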
My understanding was that this new method of creating vdevs and replacing disks would resolve any issues with partitioning, “drive too small” errors, and auto-expansion. The paradigm is completely redone.[1]
If a new vdev is created, then TrueNAS will partition each drive to the maximum of the drive’s capacity minus a 2 GB buffer at the end.
If a larger drive is replacing a smaller one, it will do the same thing with the new drive.
If a “same size” drive is replacing an existing drive, it will try to do the same, but if it cannot, it will make the partition slightly larger than that, eating into the space the existing 2 GB buffer allows.
In any of these cases, the pool should auto-expand once all drives are replaced with larger ones, since each new ZFS member is not restricted to the older partition sizes (see the sketch below).
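In other words (my own illustration of the rule as described above, not iX’s actual middleware code), the data partition size is supposed to depend only on the new drive’s capacity:

```python
# Illustration of the sizing rule described above (my own numbers, not iX's code):
# each member's data partition is sized from that drive's capacity minus a 2 GB
# buffer at the end, so a larger replacement is never tied to the old partition size.
TB = 10 ** 12
GB = 10 ** 9

def data_partition_bytes(disk_bytes: int, buffer_bytes: int = 2 * GB) -> int:
    """Data partition size for a drive of the given capacity."""
    return disk_bytes - buffer_bytes

print(f"6 TB member : {data_partition_bytes(6 * TB) / TB:.3f} TB data partition")
print(f"18 TB member: {data_partition_bytes(18 * TB) / TB:.3f} TB data partition")  # not 6 TB
```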
This demonstrably is not happening. This 18 TB disk (and five more like it) is replacing a 6 TB disk, and 6 TB is the size of the data partition that’s created on it.
How many releases will it take iX to fix this? It’s been so many already that I’m concluding this is deliberate, a conclusion @kris hasn’t addressed.
On the off chance that the middleware will do something magical once the resilvering finishes, I’ll wait for that. If not, I guess I’ll submit another ticket.
Or you could just try the “Expand Pool” option in the UI, which should trigger all this code to handle the resizing of partitions and whatnot on demand.
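A quick way to tell whether that actually did anything (pool name is a placeholder) is to compare the pool’s expandable space before and after:

```python
# Quick check (pool name "tank" is a placeholder): EXPANDSZ shows unexpanded
# space sitting under the pool; "-" means there is nothing left to claim.
import subprocess

POOL = "tank"

out = subprocess.run(
    ["zpool", "list", "-H", "-o", "name,size,expandsz", POOL],
    capture_output=True, text=True, check=True,
).stdout.strip()
name, size, expandsz = out.split("\t")
print(f"{name}: size={size}, expandable={expandsz}")
```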
Current behavior is intentional by design. It’s been a long while since I personally looked at the specific logic behind it, so I had to go ask internally. IIRC, there were issues with the whole “autoexpand” behavior previously that could lead to some esoteric support edge cases, and this was the preferred approach with the least risk of customers shooting their foot off.
Talk about folks with the “worst possible take”. We made the decision to change this behavior because of bugs we encountered on the BSD version with Enterprise customers. Sheesh guys, your bias is showing.