Is pool auto-expansion just broken by design now?

I have been using ZFS for longer than FreeNAS / TrueNAS has and I have never wanted auto-expansion enabled. When I am moving drives around to a new configuration, the last thing I want is a zpool expanding because I happened to use a set of slightly larger drives as a temporary place to house my data.

Taking the extra step to click expand once I have completed all the drive moves is a small price to pay to avoid possible unintended consequences. This default behavior is not a big deal.

Yes, the documentation needs to be fixed, but that also takes time.

5 Likes

In what way, other than the obvious that you don’t like it? Pool auto-expansion doesn’t work and hasn’t for at least the last four major releases of SCALE–thus, it’s broken. You’ve confirmed that the breakage is deliberate. How then should I describe it, if not “broken by design”?

…but somehow have never been motivated to fix in the BSD version–which continues, to this day, to auto-expand pools, leaving those Enterprise customers vulnerable to whatever those bugs are. Bugs in OS 1 lead to changed behavior in OS 2, but not in OS 1. I’m sorry, but this explanation makes no sense.

I’d wager you don’t represent one user in a hundred, and probably not one in a thousand. But something like zpool set autoexpand=off poolname would address this (as I’m sure you already know).
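
Spelled out, that’s just the per-pool ZFS property (poolname is a placeholder, and this only governs what ZFS itself does automatically, not what the middleware decides when it partitions a replacement disk):

zpool get autoexpand poolname       # check the current setting
zpool set autoexpand=off poolname   # never grow the pool automatically
zpool set autoexpand=on poolname    # allow automatic growth once larger devices are in place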

How long? I’ve observed this behavior in the last four major releases of the product. I suspect, but can’t confirm, that it’s been there since the first release of SCALE. Numerous tickets have been filed on this issue over the course of the past two years or so, any of which would make it obvious that the docs don’t match the product. What will it take for iX to fix their docs?

You’re right that it isn’t that big of a deal to hit the Expand button, but that isn’t the way the product has worked for 15 years. That’s a pretty significant change in behavior to go completely unmentioned in any sort of documentation.

1 Like

Possibly filing a bug against the documentation rather than against TrueNAS itself, as converting the latter into the former is obviously a very costly operation in Jira.

Bug is fixed in OS 1 by sidegrading to OS 2. I think we’ve seen this answer a few times already.

Yes, I’m biased towards irony.

It’s not a costly operation at all, but considering @Stux’s last two comments on that ticket were after bug clerk had already closed it, I suspect no one on staff actually read the request to reopen as a documentation issue.

Incidentally, I’ve opened a docs PR to correct the offending line this morning.

For anyone reading the thread who is not already aware of it (as I’m pretty sure @Stux and @dan are): you can always use the Feedback button to report a docs issue, or even use the Edit Page button to propose a correction yourself. We’re always happy to see community contributions :hugs:

3 Likes

Indeed. But so much for “we’ll support BSD for our paying customers for as long as they want it.”

You realize we do have to evaluate the risk of further breakages on changes like this, right? We found an edge case on the BSD side. We decided to change the default behavior in the new edition of the product to avoid this somewhat rare, but real, risk in the future.

You always have to do these risk vs. benefit calculations. In this case it was rare enough that we didn’t want to risk going back and screwing up the older product line by dropping a new behavior into a very stable code base, but annoying enough that we wanted to correct it in the new product line. Every serious business asks these questions all the time. Sometimes we take the risk and backport because the issue is serious enough; other times we do not.

1 Like

In this case, I’m rather happy you didn’t backport the “fix” to CORE… :shark:

1 Like

Confess I haven’t read the whole thread in detail, but I’ve dropped in some new disks recently, so it got me curious.

Original layout: 6x 12TB, set up on Core, most of the disks still look like this:

sde      8:64   0  10.9T  0 disk 
├─sde1   8:65   0     2G  0 part 
└─sde2   8:66   0  10.9T  0 part

One disk replaced more recently under SCALE (24.10, I think) looks like this; I guess this illustrates your point, @dan:

sdi      8:128  0  14.6T  0 disk 
├─sdi1   8:129  0     2G  0 part 
└─sdi2   8:130  0  10.9T  0 part

HOWEVER, one even more recent disk (also 24.10) looks like this:

sdc      8:32   0  16.4T  0 disk 
└─sdc1   8:33   0  16.4T  0 part

So I’m not sure what the actual expected behaviour is? It seems inconsistent.

As you may be able to tell, my own strategy is to replace disks as I go along with whatever I can get a decent deal on at the time. So the pool will always be limited by the smallest disk in it (still a bunch of 12TBs left). Or that was the idea anyway.

Had a quick look at the linked code; fairly scary stuff. The Python code writes zeroes directly onto the raw disk to “wipe potential conflicting ZFS label”. Better not get any of those offset calculations wrong… Just out of interest, couldn’t you call out to existing ZFS and partition-manipulation tools to do this?
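
By “existing tools” I mean something along these lines (a sketch of the alternative only, not a claim that it is equivalent to what the middleware does; /dev/sdX stands for whichever disk is being recycled, and the offsets matter because ZFS keeps four 256 KiB label copies, two at the start of the device and two at the end):

zpool labelclear -f /dev/sdX1   # clear the ZFS labels from a former pool member partition
wipefs -a /dev/sdX              # blank all known filesystem/RAID signatures on the disk
sgdisk --zap-all /dev/sdX       # wipe the GPT and MBR partition structures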

I have observed similar partition differences. The behavior regarding how partitions are used has changed over time. I am trying to hunt down documentation on it.

Note as well that in the most recent example (16TB drive added) it a) didn’t add a 2GB partition and b) used the whole drive, which goes directly against the discussion above about how this is supposed to work. So either the documentation and this forum discussion are both already outdated :wink: or, to complicate matters further, there are bugs in SCALE that influence what actually happens in practice, with a somewhat non-deterministic outcome.

1 Like

Just as a follow-up, here’s the partition information on one of those 18 TB drives after the resilvering finished and I clicked “Expand” for the pool:

(parted) print free
Model: ATA OOS18000G (scsi)
Disk /dev/sdp: 18.0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags:

Number  Start   End     Size    File system  Name  Flags
        17.4kB  1049kB  1031kB  Free Space
 1      1049kB  18.0TB  18.0TB  zfs
        18.0TB  18.0TB  2146MB  Free Space

So it doesn’t have a swap partition, but it does have just over 2 GB free after the data partition.
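
My working assumption (and it is an assumption, not the exact commands the middleware runs, with poolname standing in for the actual pool) is that the Expand click boils down to growing the data partition and then letting ZFS claim the space on that vdev member, roughly:

# 1. grow the zfs partition with sgdisk/parted, leaving the small tail buffer seen above
# 2. re-read the partition table and tell ZFS to use the new space
partprobe /dev/sdp
zpool online -e poolname sdp1   # -e = expand the device to use all available space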

1 Like

Ok, I did the same. And then realised/remembered it was an 18TB drive, not 16. :wink:

(parted) print free                                                       
Model: ATA TOSHIBA MG09ACA1 (scsi)
Disk /dev/sdc: 18.0TB
Sector size (logical/physical): 512B/4096B
Partition Table: gpt
Disk Flags: 

Number  Start   End     Size    File system  Name  Flags
        17.4kB  1049kB  1031kB  Free Space
 1      1049kB  18.0TB  18.0TB  zfs          data
        18.0TB  18.0TB  1032kB  Free Space

It follows a similar pattern to yours but not the same. Also it did this without me pressing “Expand” (hadn’t noticed such a button existed).

For me, any operation on a vdev like expansion should NOT be automatic. Adding a message or button to show that there is unused free space is the way to go. Of course, if everybody disagrees with me on this I will follow the herd. But there are situations when you don’t want to expand as soon as the drive is in place.
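
In the meantime, anyone who wants to see that unused space without a button can already get it from the shell, since ZFS reports the space an expansion would claim (tank is a placeholder pool name):

zpool list -v tank                    # the EXPANDSZ column shows space not yet claimed by the pool
zpool get expandsize,autoexpand tank  # the same figure as a property, plus the current autoexpand setting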

2 Likes