tl;dr: I’m seeing 50-100x longer (a)syncq_wait read times for one drive in a 4-drive raidz2 vdev than for the other 3 drives. It’s always just one drive, but it’s not the same drive all the time.
I have a raidz2 pool, r2d2, set up on 4 identical drives (Exos x16 14 TB). I ran a recursive hashing task (hashdeep) on the pool and saw pretty poor performance: about 40MB/s from each drive. Given that the task is largely sequential reads, I would have expected better throughput. I replicated the pool onto a single-disk pool (also an Exos x16 14TB drive) and with the same task saw approximately 150MB/s. Digging deeper, I ran zpool iostat -vyl 30 1:
                                            capacity     operations    bandwidth     total_wait    disk_wait     syncq_wait   asyncq_wait   scrub   trim  rebuild
pool                                      alloc   free   read  write   read  write   read  write   read  write   read  write   read  write   wait   wait     wait
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -------
r2d2                                      18.9T  32.0T  2.64K      0   167M      0  311ms      -    4ms      -     1s      -  255ms      -      -      -        -
  raidz2-0                                18.9T  32.0T  2.64K      0   167M      0  311ms      -    4ms      -     1s      -  255ms      -      -      -        -
    xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxbe1a      -      -    691      0  41.7M      0   26ms      -    2ms      -      -      -   23ms      -      -      -        -
    xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxe276      -      -    699      0  41.8M      0   24ms      -    2ms      -      -      -   21ms      -      -      -        -
    xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx32f5      -      -    701      0  41.8M      0   27ms      -    2ms      -      -      -   24ms      -      -      -        -
    xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx2b14      -      -    612      0  42.0M      0     1s      -    9ms      -     1s      -     1s      -      -      -        -
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -------
r2d2-clone                                9.17T  3.55T    169      0   162M      0     2s      -   81ms      -     2s      -     3s      -      -      -        -
  xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxad21    9.17T  3.55T    169      0   162M      0     2s      -   81ms      -     2s      -     3s      -      -      -        -
----------------------------------------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -------
As you can see above, one member of the raidz2-0 vdev has (a)syncq_wait times that are roughly 50x longer than the other 3 drives (about 1s of asyncq_wait versus 21-24ms). My first thought was that maybe the drive was failing. But when I checked zpool iostat again some time later, a different member was exhibiting the longer wait times. I then monitored the output and saw that there is always one member of the vdev exhibiting these extended (a)syncq_wait times, and that member changes over time.
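To make the "which member is slow right now" check less manual, here is a rough Python sketch that takes the leaf-device lines from the zpool iostat -vyl output above and flags any drive whose asyncq_wait read time is far above the median of its siblings. The function names, the 10x threshold, and the field index (13, counted from the layout shown above) are my own assumptions for illustration, not anything zpool provides.

```python
import re
import statistics

# Unit suffixes as they appear in the human-readable latency columns above.
_UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}

def parse_wait(field):
    """Convert a field such as '23ms' or '1s' to seconds; '-' means no data."""
    if field == "-":
        return None
    m = re.fullmatch(r"([0-9.]+)(ns|us|ms|s)", field)
    if m is None:
        raise ValueError(f"unrecognised latency field: {field!r}")
    return float(m.group(1)) * _UNITS[m.group(2)]

def slow_members(rows, column=13, factor=10.0):
    """Return (name, seconds) for devices whose wait exceeds factor x median.

    rows   : whitespace-separated data lines for the leaf devices only
    column : 0-based field index; 13 is asyncq_wait read in the layout above
             (assumption: 17 value columns after the device name)
    factor : hypothetical outlier threshold, tune to taste
    """
    waits = []
    for line in rows:
        fields = line.split()
        value = parse_wait(fields[column])
        if value is not None:
            waits.append((fields[0], value))
    if not waits:
        return []
    median = statistics.median(w for _, w in waits)
    return [(name, w) for name, w in waits if w > factor * median]
```

Feeding it the four member lines from the output above should single out only the 2b14 drive; running it on each new 30 second sample would show the slow member moving around.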
Below is the zpool iostat -vyw 30 1 output. It tells a similar story: 3 members have q_wait times between 1 and 500ms, while one member sits between 500ms and 2s. This was run a few minutes after the output above, and you can see that it’s now a different drive (32f5 instead of 2b14) that is exhibiting the longer wait times.
r2d2 total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 104 0 0 0 0
511ns 0 0 0 0 0 0 483 0 0 0 0
1us 0 0 0 0 0 0 132 0 0 0 0
2us 0 0 0 0 0 0 2 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 1 0 0 0 0
32us 0 0 0 0 0 0 1 0 0 0 0
65us 2 0 2 0 0 0 3 0 0 0 0
131us 134 0 137 0 0 0 6 0 0 0 0
262us 406 0 469 0 0 0 9 0 0 0 0
524us 61 0 352 0 0 0 16 0 0 0 0
1ms 28 0 190 0 0 0 25 0 0 0 0
2ms 38 0 1.05K 0 0 0 46 0 0 0 0
4ms 81 0 324 0 0 0 78 0 0 0 0
8ms 151 0 68 0 0 0 140 0 0 0 0
16ms 275 0 196 0 0 0 232 0 0 0 0
33ms 346 0 131 0 0 0 295 0 0 0 0
67ms 329 0 33 0 0 0 301 0 0 0 0
134ms 242 0 5 0 0 0 225 0 0 0 0
268ms 191 0 1 0 0 0 187 0 0 0 0
536ms 139 0 0 0 4 0 134 0 0 0 0
1s 252 0 0 0 63 0 194 0 0 0 0
2s 308 0 0 0 72 0 229 0 0 0 0
4s 0 0 0 0 0 0 0 0 0 0 0
8s 0 0 0 0 0 0 0 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
raidz2-0 total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 104 0 0 0 0
511ns 0 0 0 0 0 0 483 0 0 0 0
1us 0 0 0 0 0 0 132 0 0 0 0
2us 0 0 0 0 0 0 2 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 1 0 0 0 0
32us 0 0 0 0 0 0 1 0 0 0 0
65us 2 0 2 0 0 0 3 0 0 0 0
131us 134 0 137 0 0 0 6 0 0 0 0
262us 406 0 469 0 0 0 9 0 0 0 0
524us 61 0 352 0 0 0 16 0 0 0 0
1ms 28 0 190 0 0 0 25 0 0 0 0
2ms 38 0 1.05K 0 0 0 46 0 0 0 0
4ms 81 0 324 0 0 0 78 0 0 0 0
8ms 151 0 68 0 0 0 140 0 0 0 0
16ms 275 0 196 0 0 0 232 0 0 0 0
33ms 346 0 131 0 0 0 295 0 0 0 0
67ms 329 0 33 0 0 0 301 0 0 0 0
134ms 242 0 5 0 0 0 225 0 0 0 0
268ms 191 0 1 0 0 0 187 0 0 0 0
536ms 139 0 0 0 4 0 134 0 0 0 0
1s 252 0 0 0 63 0 194 0 0 0 0
2s 308 0 0 0 72 0 229 0 0 0 0
4s 0 0 0 0 0 0 0 0 0 0 0
8s 0 0 0 0 0 0 0 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxbe1a total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 44 0 0 0 0
511ns 0 0 0 0 0 0 160 0 0 0 0
1us 0 0 0 0 0 0 47 0 0 0 0
2us 0 0 0 0 0 0 0 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 0 0 0 0 0
32us 0 0 0 0 0 0 0 0 0 0 0
65us 0 0 1 0 0 0 1 0 0 0 0
131us 47 0 47 0 0 0 2 0 0 0 0
262us 142 0 150 0 0 0 3 0 0 0 0
524us 22 0 102 0 0 0 5 0 0 0 0
1ms 9 0 46 0 0 0 8 0 0 0 0
2ms 13 0 263 0 0 0 14 0 0 0 0
4ms 26 0 79 0 0 0 27 0 0 0 0
8ms 51 0 15 0 0 0 46 0 0 0 0
16ms 92 0 44 0 0 0 75 0 0 0 0
33ms 110 0 25 0 0 0 94 0 0 0 0
67ms 106 0 1 0 0 0 99 0 0 0 0
134ms 82 0 0 0 0 0 76 0 0 0 0
268ms 54 0 0 0 0 0 53 0 0 0 0
536ms 17 0 0 0 0 0 16 0 0 0 0
1s 0 0 0 0 0 0 0 0 0 0 0
2s 0 0 0 0 0 0 0 0 0 0 0
4s 0 0 0 0 0 0 0 0 0 0 0
8s 0 0 0 0 0 0 0 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxe276 total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 9 0 0 0 0
511ns 0 0 0 0 0 0 195 0 0 0 0
1us 0 0 0 0 0 0 50 0 0 0 0
2us 0 0 0 0 0 0 1 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 0 0 0 0 0
32us 0 0 0 0 0 0 0 0 0 0 0
65us 1 0 1 0 0 0 1 0 0 0 0
131us 48 0 49 0 0 0 2 0 0 0 0
262us 143 0 153 0 0 0 3 0 0 0 0
524us 22 0 103 0 0 0 5 0 0 0 0
1ms 9 0 50 0 0 0 8 0 0 0 0
2ms 12 0 259 0 0 0 15 0 0 0 0
4ms 29 0 76 0 0 0 25 0 0 0 0
8ms 50 0 15 0 0 0 46 0 0 0 0
16ms 93 0 44 0 0 0 80 0 0 0 0
33ms 116 0 26 0 0 0 97 0 0 0 0
67ms 102 0 1 0 0 0 92 0 0 0 0
134ms 75 0 0 0 0 0 69 0 0 0 0
268ms 58 0 0 0 0 0 57 0 0 0 0
536ms 18 0 0 0 0 0 17 0 0 0 0
1s 0 0 0 0 0 0 0 0 0 0 0
2s 0 0 0 0 0 0 0 0 0 0 0
4s 0 0 0 0 0 0 0 0 0 0 0
8s 0 0 0 0 0 0 0 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx32f5 total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 0 0 0 0 0
511ns 0 0 0 0 0 0 0 0 0 0 0
1us 0 0 0 0 0 0 0 0 0 0 0
2us 0 0 0 0 0 0 0 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 0 0 0 0 0
32us 0 0 0 0 0 0 0 0 0 0 0
65us 0 0 0 0 0 0 0 0 0 0 0
131us 0 0 0 0 0 0 0 0 0 0 0
262us 0 0 36 0 0 0 0 0 0 0 0
524us 0 0 45 0 0 0 0 0 0 0 0
1ms 0 0 47 0 0 0 0 0 0 0 0
2ms 0 0 287 0 0 0 0 0 0 0 0
4ms 0 0 85 0 0 0 0 0 0 0 0
8ms 0 0 22 0 0 0 0 0 0 0 0
16ms 0 0 65 0 0 0 0 0 0 0 0
33ms 1 0 50 0 0 0 0 0 0 0 0
67ms 2 0 27 0 0 0 2 0 0 0 0
134ms 5 0 5 0 0 0 5 0 0 0 0
268ms 23 0 1 0 0 0 23 0 0 0 0
536ms 81 0 0 0 4 0 78 0 0 0 0
1s 252 0 0 0 63 0 194 0 0 0 0
2s 308 0 0 0 72 0 229 0 0 0 0
4s 0 0 0 0 0 0 0 0 0 0 0
8s 0 0 0 0 0 0 0 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxx2b14 total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 51 0 0 0 0
511ns 0 0 0 0 0 0 127 0 0 0 0
1us 0 0 0 0 0 0 34 0 0 0 0
2us 0 0 0 0 0 0 0 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 0 0 0 0 0
32us 0 0 0 0 0 0 0 0 0 0 0
65us 0 0 0 0 0 0 0 0 0 0 0
131us 39 0 39 0 0 0 1 0 0 0 0
262us 120 0 129 0 0 0 3 0 0 0 0
524us 17 0 100 0 0 0 5 0 0 0 0
1ms 9 0 45 0 0 0 8 0 0 0 0
2ms 11 0 268 0 0 0 15 0 0 0 0
4ms 25 0 83 0 0 0 25 0 0 0 0
8ms 48 0 15 0 0 0 46 0 0 0 0
16ms 88 0 41 0 0 0 76 0 0 0 0
33ms 117 0 29 0 0 0 102 0 0 0 0
67ms 118 0 1 0 0 0 106 0 0 0 0
134ms 78 0 0 0 0 0 73 0 0 0 0
268ms 55 0 0 0 0 0 53 0 0 0 0
536ms 22 0 0 0 0 0 22 0 0 0 0
1s 0 0 0 0 0 0 0 0 0 0 0
2s 0 0 0 0 0 0 0 0 0 0 0
4s 0 0 0 0 0 0 0 0 0 0 0
8s 0 0 0 0 0 0 0 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
r2d2-clone total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 0 0 0 0 0
511ns 0 0 0 0 0 0 0 0 0 0 0
1us 0 0 0 0 0 0 0 0 0 0 0
2us 0 0 0 0 0 0 0 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 0 0 0 0 0
32us 0 0 0 0 0 0 0 0 0 0 0
65us 0 0 0 0 0 0 0 0 0 0 0
131us 0 0 0 0 0 0 0 0 0 0 0
262us 0 0 0 0 0 0 0 0 0 0 0
524us 0 0 0 0 0 0 0 0 0 0 0
1ms 0 0 0 0 0 0 0 0 0 0 0
2ms 0 0 22 0 0 0 0 0 0 0 0
4ms 0 0 9 0 0 0 0 0 0 0 0
8ms 0 0 11 0 0 0 0 0 0 0 0
16ms 0 0 38 0 0 0 0 0 0 0 0
33ms 0 0 8 0 0 0 0 0 0 0 0
67ms 0 0 14 0 0 0 0 0 0 0 0
134ms 0 0 19 0 0 0 0 0 0 0 0
268ms 1 0 29 0 1 0 0 0 0 0 0
536ms 4 0 9 0 3 0 0 0 0 0 0
1s 6 0 0 0 7 0 0 0 0 0 0
2s 15 0 0 0 15 0 0 0 0 0 0
4s 120 0 0 0 65 0 56 0 0 0 0
8s 14 0 0 0 3 0 7 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
xxxxxxxx-xxxx-xxxx-xxxxxxxxxxxxxad21 total_wait disk_wait syncq_wait asyncq_wait
latency read write read write read write read write scrub trim rebuild
---------------------------------------- ----- ----- ----- ----- ----- ----- ----- ----- ----- ----- -----
1ns 0 0 0 0 0 0 0 0 0 0 0
3ns 0 0 0 0 0 0 0 0 0 0 0
7ns 0 0 0 0 0 0 0 0 0 0 0
15ns 0 0 0 0 0 0 0 0 0 0 0
31ns 0 0 0 0 0 0 0 0 0 0 0
63ns 0 0 0 0 0 0 0 0 0 0 0
127ns 0 0 0 0 0 0 0 0 0 0 0
255ns 0 0 0 0 0 0 0 0 0 0 0
511ns 0 0 0 0 0 0 0 0 0 0 0
1us 0 0 0 0 0 0 0 0 0 0 0
2us 0 0 0 0 0 0 0 0 0 0 0
4us 0 0 0 0 0 0 0 0 0 0 0
8us 0 0 0 0 0 0 0 0 0 0 0
16us 0 0 0 0 0 0 0 0 0 0 0
32us 0 0 0 0 0 0 0 0 0 0 0
65us 0 0 0 0 0 0 0 0 0 0 0
131us 0 0 0 0 0 0 0 0 0 0 0
262us 0 0 0 0 0 0 0 0 0 0 0
524us 0 0 0 0 0 0 0 0 0 0 0
1ms 0 0 0 0 0 0 0 0 0 0 0
2ms 0 0 22 0 0 0 0 0 0 0 0
4ms 0 0 9 0 0 0 0 0 0 0 0
8ms 0 0 11 0 0 0 0 0 0 0 0
16ms 0 0 38 0 0 0 0 0 0 0 0
33ms 0 0 8 0 0 0 0 0 0 0 0
67ms 0 0 14 0 0 0 0 0 0 0 0
134ms 0 0 19 0 0 0 0 0 0 0 0
268ms 1 0 29 0 1 0 0 0 0 0 0
536ms 4 0 9 0 3 0 0 0 0 0 0
1s 6 0 0 0 7 0 0 0 0 0 0
2s 15 0 0 0 15 0 0 0 0 0 0
4s 120 0 0 0 65 0 56 0 0 0 0
8s 14 0 0 0 3 0 7 0 0 0 0
17s 0 0 0 0 0 0 0 0 0 0 0
34s 0 0 0 0 0 0 0 0 0 0 0
68s 0 0 0 0 0 0 0 0 0 0 0
137s 0 0 0 0 0 0 0 0 0 0 0
---------------------------------------------------------------------------------------------------------------------
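For anyone who wants to put a number on the histograms above, here is a rough Python sketch that computes what fraction of requests in one column landed in buckets labelled 500ms or higher. The names are hypothetical, column 7 is asyncq_wait read counted from the row layout above, and the K/M multipliers for abbreviated counts such as 1.05K are an assumption (those counts are rounded in the output anyway, so the result is approximate).

```python
import re

# Latency units used in the bucket labels of the histogram rows above.
_UNITS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}
# Assumption: abbreviated counts use powers of 1024.
_SUFFIX = {"": 1, "K": 1024, "M": 1024 ** 2}

def label_seconds(label):
    """Convert a bucket label such as '536ms' or '2s' to seconds."""
    m = re.fullmatch(r"(\d+)(ns|us|ms|s)", label)
    if m is None:
        raise ValueError(f"unrecognised bucket label: {label!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]

def count(field):
    """Convert a count such as '308' or '1.05K' to a (possibly rounded) number."""
    m = re.fullmatch(r"([0-9.]+)([KM]?)", field)
    if m is None:
        raise ValueError(f"unrecognised count: {field!r}")
    return float(m.group(1)) * _SUFFIX[m.group(2)]

def tail_fraction(rows, column=7, threshold_s=0.5):
    """Fraction of requests in buckets whose label is >= threshold_s seconds.

    rows   : histogram data lines for one device (bucket label, then 11 counts)
    column : 0-based field index; 7 is asyncq_wait read in the layout above
    """
    total = tail = 0.0
    for line in rows:
        fields = line.split()
        n = count(fields[column])
        total += n
        if label_seconds(fields[0]) >= threshold_s:
            tail += n
    return tail / total if total else 0.0
```

Applied to the tables above, this should give roughly 94% (501 of 531) for the 32f5 drive and about 2% (16 of 770) for be1a, which matches the eyeball impression that one member is doing nearly all of its waiting in the 500ms-2s range.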