[Accepted] Create 2 GiB buffer space when adding a disk

The more I think about it, the more it upsets me.

Storage wants to use KB, MB, GB, and TB? Powers of 10. Fine.

Yet this same storage is formatted with sector sizes of either 512 or 4096 bytes.

STORAGE MANUFACTURERS: WHY NOT MAKE THE SECTOR SIZES 500 and 4000 BYTES?

What a bunch of weasels…


EDIT:

Before someone "corrects" me about this...

That’s the joke. Storage media’s smallest write unit adheres to binary measurements. Drive manufacturers know it only makes sense to size their devices in binary units, yet they still intentionally use decimal units for blatant marketing reasons.
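If you want to see the size of that marketing gap in numbers, here’s a quick back-of-the-envelope check (the 12 TB figure is just an example; Python used purely as a calculator):

```python
# Decimal ("marketing") units vs. binary units (what the OS reports).
marketed_bytes = 12 * 10**12            # a "12 TB" drive, decimal terabytes
tib = 2**40                             # one tebibyte (TiB)

print(marketed_bytes / tib)             # ~10.91 -- the "12 TB" drive in TiB
print(512 == 2**9, 4096 == 2**12)       # sector sizes are exact powers of two
```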

ChatGPT’s impression of the Seagate marketing department back in the day.

The one in the foreground will be in a ton of pain and will require hip surgery … great work, AI.

1 Like

What I dislike is when someone uses “Class” to describe something that falls short of the real thing. For example: a 65" Class TV (actual size 64.12"), a 7200 RPM “Class” hard drive (actual rotational speed could be 5400), or a 12 TB “Class” hard drive (actual capacity 11.1 TB). The marketing world found another way to short-change the consumer. Will it ever end?

Don’t kid around like that, someone may hear you.

1 Like

This is very unfair to mustelidae… (And we have a very helpful resident one here!)
Now, do you have any insight into the WD marketing department when it decided, first, to sneak SMR drives into the Red line, and then that “rpm” is not a technical parameter but just your impression?

(The prompt you used should be a good start, substituting “weasels” by “worshippers of Nyarlathotep, the Crawler in Chaos”.)

2 Likes

That is … quite specific. This is how ChatGPT imagines that.

2 Likes

This is the biggest issue. We have three groups of users:

  1. Old vdevs with 2 GiB buffer (“swap”) partitions
  2. Newer vdevs without buffer partitions.
  3. Brand new vdevs… partitions are TBD

I think any software change has to be OK with all of them.

(or @winnielinnie has to pay me much more than $30)

@yorick improved the request, making it more flexible than “always use a 2 GiB buffer”.

He explains it in his “edit” in the opening post. I illustrate it here.

Unfortunately, those who created pools/vdevs with versions of SCALE released after the change was made are up a creek without a paddle. :confused:


You drive a hard bargain.

Fine. I’ll up my offer: a handshake, two emojis of your choice, and a weekly “thank you” message emailed to your inbox.

I don’t know how I can offer more than this.

2 Likes

It’s unfortunate ZFS can’t shrink pools. That would be an easy solution otherwise.

1 Like

Ticket opened on Jul 3, 2017.

Yeah, it’s not happening in our lifetime.

1 Like

Well, if iX decided it’s needed, they could just throw an engineer at it. They are one of the ZFS contributors :slight_smile:
But I guess the original buffer solution is cheaper.

And it absolutely would be.

New vdev: Create partition 2 GiB smaller than max

Replacement drive or added drive on existing vdev: Ditto, but if that turns out smaller than the smallest member, increase the partition size to match, if possible.

Old vdev with 2 GiB buffer (“swap”) partition: This works. The replacement drive may not have a buffer partition, but it still has a buffer.

Old vdev without buffer: No change from today. Slightly smaller drives don’t fit. Nothing to be done about that; the UX damage created by the change that removed the buffer can’t just be un-created.

New vdev: Works. ZFS partitions are created 2 GiB smaller, creating a buffer.

Any kind of vdev and adding a larger drive: Works. The larger drive gets a partition with a 2 GiB buffer; gradual replacement and vdev expansion that way continues to work.

It’s a pretty small change to the existing logic and it absolutely works with any vdev ever created in FreeNAS or TrueNAS, any edition.
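To make the sizing rule above concrete, here’s a rough sketch in throwaway Python. The function and parameter names are made up for illustration and are not the actual middleware code:

```python
GIB = 1024**3
BUFFER = 2 * GIB  # desired amount of unpartitioned buffer space per disk

def data_partition_size(disk_size, smallest_member=None):
    """Illustrative sketch only: size (bytes) of the ZFS data partition.

    disk_size: usable capacity of the new disk, in bytes.
    smallest_member: size of the smallest existing data partition in the
        vdev, in bytes, or None when creating a brand-new vdev.
    """
    size = disk_size - BUFFER  # new vdev: 2 GiB smaller than max

    if smallest_member is not None and size < smallest_member:
        # Replacement or added disk: grow the partition back up to match
        # the smallest member, shrinking the buffer as far as needed.
        size = min(disk_size, smallest_member)

    if size <= 0 or (smallest_member is not None and size < smallest_member):
        raise ValueError("disk too small for this vdev")

    return size
```

Old vdevs with the legacy swap partition fall out of this naturally: their data partitions are already about 2 GiB smaller than the disk, so a same-size replacement sized this way will fit.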

4 Likes

What’s the etiquette around evolving feature requests? This started as “buffer partition” and then with discussion evolved to “buffer space”.

I’ve left the original subject line and post and added an Edit: because people may have voted for the original. While the refined design is the same in spirit, it’s not identical.

On the other hand, I can see the argument for accurately reflecting what the ask is: it’s not to create a buffer partition, it’s to create a buffer.

Change subject line and original ask, or nay?

Well, people can also choose to remove their votes as a result of the discussion evolving. Can’t imagine that happening much.

I think it’d be better to have the subject be accurate.

1 Like

Done. Also created explicit user stories for the types of existing or new vdevs and use cases I can think of.

What would you change further?

2 Likes

I think it’s perfect and it gets the message across.

It’s an important feature request with little cost (boo hoo, you “lose” 2 GiB from your 12 TB drive), yet it spares users from surprises in the future when they need to replace a drive or expand a vdev.

A rudimentary form of this already existed with FreeNAS, TrueNAS CORE, and early versions of SCALE.
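To put “little cost” into perspective (12 TB used as an example size again):

```python
buffer_bytes = 2 * 1024**3                  # the proposed 2 GiB buffer
disk_bytes = 12 * 10**12                    # a "12 TB" drive
print(f"{buffer_bytes / disk_bytes:.4%}")   # ~0.0179% of the disk
```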

1 Like

I looked it up on my existing vdevs: TrueNAS CORE used 4194304 512-byte sectors, exactly 2 GiB; TrueNAS SCALE used 4194305 sectors, 2 GiB plus 512 bytes.

Adjusted the feature request to ask for 2 GiB and reference 2 GiB legacy “swap” partitions.
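For reference, the arithmetic behind those sector counts:

```python
SECTOR = 512
print(4194304 * SECTOR)                   # 2147483648 bytes
print(4194304 * SECTOR == 2 * 1024**3)    # True: exactly 2 GiB (CORE)
print(4194305 * SECTOR - 2 * 1024**3)     # 512: SCALE's extra half-KiB
```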

3 Likes

God yes. I want to suggest a minor adjustment, something easy that takes minimal engineering and testing time (especially since a version of this feature existed until 24.04.1).

Not a raidz-expansion-sized feature change to ZFS.

But it’s quite sad. I created my pools on SCALE, so now I don’t have a buffer, and I will be forever condemned to live in fear that my eventual replacement disk will be 1 MB too small and the replacement will fail. :sob:

Yeah, in your case there are no great options. Put a sticky note somewhere to buy “one size up” when a drive fails, I guess :sweat_smile: .

If you replace a 12 TB drive with a 14 TB drive, you’re golden, no matter what.

Once all vdev members have been replaced with larger drives, the new capacity becomes available.
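Under the hood that relies on ZFS autoexpand, which TrueNAS normally enables for you; on a hand-rolled pool you’d set it yourself. A minimal sketch, with a placeholder pool name, wrapping the stock zpool commands:

```python
import subprocess

pool = "tank"  # placeholder pool name

# Let the pool grow automatically once every vdev member has been
# replaced with a larger drive.
subprocess.run(["zpool", "set", "autoexpand=on", pool], check=True)

# Or expand a specific, already-replaced device by hand:
# subprocess.run(["zpool", "online", "-e", pool, "<device>"], check=True)
```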

You could also, though this ain’t fun:

  • Move all data off the pool and verify it’s fine
  • Destroy the pool
  • Boot into 23.10 from temp media
  • Create a fresh pool (this one will have swap partitions, your buffer partitions)
  • Export the pool
  • Remove temp media and boot back into production SCALE
  • Import the pool
  • Restore data to it
  • Depending on your backup/restore process, recreate permissions and ACLs … better, though, to choose a method that retains ZFS metadata (dataset properties, ACLs, etc.), such as ZFS send/receive (see the sketch below)
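If you go the ZFS send/receive route, the replication step could look roughly like this; pool and snapshot names below are placeholders (the source being wherever you parked the data):

```python
import subprocess

src = "backuppool"       # placeholder: pool currently holding your data
dst = "newpool"          # placeholder: freshly created pool with buffer
snap = f"{src}@migrate"  # placeholder snapshot name

# Recursive snapshot of everything on the source pool.
subprocess.run(["zfs", "snapshot", "-r", snap], check=True)

# Replicate recursively: -R carries datasets, snapshots, and properties
# (so ACLs and dataset settings survive); -F lets the destination accept
# the stream as-is.
send = subprocess.Popen(["zfs", "send", "-R", snap], stdout=subprocess.PIPE)
subprocess.run(["zfs", "receive", "-F", dst], stdin=send.stdout, check=True)
send.stdout.close()
send.wait()
```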
1 Like

You could also, and this is fun:

  • Build a time machine
  • Travel back to May 2024
  • Spill coffee on the iX dev’s keyboard before they can commit the change
1 Like