RAIDZ expansion speedup

I am expanding a RAIDZ2 pool from 4 to 5 disks on TrueNAS 24.10.0.2.

The expansion speed was ~55 MB/s, but I got it to ~110 MB/s by increasing the raidz_expand_max_copy_bytes parameter:

admin@truenas ~ $ cat /sys/module/zfs/parameters/raidz_expand_max_copy_bytes
167772160
admin@truenas ~ $ echo $((10 * 167772160)) | sudo tee /sys/module/zfs/parameters/raidz_expand_max_copy_bytes
1677721600
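For what it's worth, that echo only changes the running module. To persist it across reboots on a generic Linux box, the usual route is a modprobe options file (a sketch; on a TrueNAS appliance the supported way may be a post-init command in the UI instead, so treat this path as an assumption):

# /etc/modprobe.d/zfs.conf -- sets the tunable at module load time
options zfs raidz_expand_max_copy_bytes=1677721600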

/*
 * Maximum amount of copy io's outstanding at once.
 */
static unsigned long raidz_expand_max_copy_bytes = 10 * SPA_MAXBLOCKSIZE;

Effectively 10 × 16 MiB → 100 × 16 MiB.
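A quick sanity check of those numbers (SPA_MAXBLOCKSIZE is 16 MiB, i.e. 16777216 bytes):

admin@truenas ~ $ echo $((10 * 16777216)) $((100 * 16777216))
167772160 1677721600

which matches the default and the tuned value above.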

I’d be interested if you see the same gain just by doubling.

It doesn't seem so… I briefly tested ×2, ×4, and ×8.
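For anyone who wants to repeat that test, here is a hypothetical sketch (run as root; the pool name "tank" and the settle time are assumptions):

#!/bin/sh
# Hypothetical sketch: step the tunable through multiples of its
# default and sample the copy rate reported by `zpool status`
# during the expansion. Pool name "tank" is an assumption.
DEFAULT=167772160   # 10 * SPA_MAXBLOCKSIZE
for mult in 2 4 8; do
    echo $((mult * DEFAULT)) > /sys/module/zfs/parameters/raidz_expand_max_copy_bytes
    sleep 300   # give the new setting a few minutes to show up in the rate
    printf 'x%s: %s\n' "$mult" "$(zpool status tank | grep 'copied at')"
done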


This is how it should be: Optimize RAIDZ expansion by amotin · Pull Request #16819 · openzfs/zfs · GitHub.


@mav do you know if this will be in Fangtooth? Or even earlier?

It will probably depend on OpenZFS… we prefer to use a standard version if possible.


I notice it's been pushed to master now.


Hello. So the value I have currently is:

root@truenas[~]# cat /sys/module/zfs/parameters/raidz_expand_max_copy_bytes
167772160

which is the same as OP's value: 160 MiB.

OP increased it to 1.6 GB and says the process got faster.

Can it be done mid-expansion?

If not, can the expansion be paused, the setting adjusted and applied, and expansion then resumed with the new increased parameter size?

It was explained to me thus:

It’s not exactly that ZFS polls for config updates; it’s more that you’re directly adjusting one of its tunable settings by poking the kernel module.

(/sys is a magical and auto-generated interface to system stuff pretending to be files and folders.)

So the ZFS kernel parameters can be adjusted on the fly without interrupting the expansion process. It took a minute or two to see any changes, and then disk speeds increased noticeably with the 1.6 GB setting.
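A simple way to watch the effect live (a sketch; the grep pattern matches the status line format quoted further down the thread):

watch -n 60 "zpool status | grep 'copied at'"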

Now the process should be complete tomorrow instead of next Monday.


It did work for about 5 hours, but then the expansion slowed back down to the original speed. I tried ×16 instead of ×10: faster for a few seconds, then back to slow. Since I have 256 GB of RAM, I tried ×384, and that worked. The datasets in the vdev are media files, some with large video files, some with lots of small images. Could that explain the speed drop?
Screenshot of netdata:
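For scale, even ×384 is modest next to 256 GB of RAM (a quick check, using the 16 MiB SPA_MAXBLOCKSIZE from above):

$ echo $((384 * 16777216))
6442450944

That is 6 GiB, roughly 2% of the system's memory.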

So I thought this didn't do anything and put it in place mid-expansion (with three drives queued up after it).

That mid-expansion drive finished… and holy cow, it's blowing through the other three.

expand: expansion of raidz2-0 in progress since Mon Jan  6 17:17:07 2025
        40.2T / 50.6T copied at 510M/s, 79.49% done, 05:55:37 to go

It’s nearly done with three 8 TB drives in less time than it was taking to do one.