RAIDZ1 pool, 3x 20TB, only 18 TB available, wrong capacities

I’m a complete beginner when it comes to building a NAS and TrueNAS.

  • I created a RAIDZ1 pool (2x 20 TB drives) via the CLI and imported it into the UI (roughly the command sequence sketched after this list).
  • Then I extended it with another 20 TB drive.
  • After the expansion the pool was degraded and my third drive was shown as available, so I used it to replace the unavailable drive in the pool.
  • Afterwards the usable capacity was 18.06 TB and the usage was almost 85%.
  • I ran the rebalancing script and ended up with a used capacity of 11.6 TB and 6.46 TB available (64.2% usage).
  • The size of the data is actually shown as 15.5 TB (NFS share).
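
Roughly, the commands involved were something like this (the device paths are placeholders rather than my actual disks; the expansion and replacement were done through the UI, which issues the equivalent ZFS operations):

  # create the original 2-wide RAIDZ1 pool from two whole disks (CLI)
  zpool create Media raidz1 /dev/disk/by-id/ata-DISK1 /dev/disk/by-id/ata-DISK2

  # RAIDZ expansion: attach a third 20 TB disk to the existing raidz1 vdev
  zpool attach Media raidz1-0 /dev/disk/by-id/ata-DISK3

  # replace the pool member that showed as unavailable after the expansion
  zpool replace Media <unavailable-member> <replacement-device>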

I’m at a bit of a loss as to what to do now. Is my pool set up correctly, with the capacities just being reported incorrectly?
Can I “fix” it in some way?

Some advice would be appreciated.

Some info:

zpool status

  pool: Media
 state: ONLINE
  scan: resilvered 10.5T in 13:11:58 with 0 errors on Mon Nov 11 02:18:33 2024
expand: expanded raidz1-0 copied 31.4T in 5 days 23:41:18, on Fri Nov  8 22:53:02 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	Media                                     ONLINE       0     0     0
	  raidz1-0                                ONLINE       0     0     0
	    ata-TOSHIBA_MG10ACA20TE_5430A1JXF4MJ  ONLINE       0     0     0
	    ata-TOSHIBA_MG10ACA20TE_5430A1JMF4MJ  ONLINE       0     0     0
	    a5dcef47-d63c-4462-92cf-af0de65dc1c7  ONLINE       0     0     0

errors: No known data errors

zpool list

NAME        SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
Media      36.4T  23.2T  13.2T        -     18.2T     0%    63%  1.00x    ONLINE  /mnt

lsblk (one disk has no PTTYPE? sdb1 exists?)

NAME        MODEL                  PTTYPE TYPE    START           SIZE PARTTYPENAME             PARTUUID
sda         TOSHIBA MG10ACA20TE           disk          20000588955648                          
sdb         TOSHIBA MG10ACA20TE    gpt    disk          20000588955648                          
└─sdb1                             gpt    part     2048 20000586858496 Solaris /usr & Apple ZFS a5dcef47-d63c-4462-92cf-af0de65dc1c7
sde         TOSHIBA MG10ACA20TE           disk          20000588955648   

Storage (screenshot)

Datasets (screenshot)

NFS Share size (screenshot)

You have 3x 20TB disks in a RAIDZ1, so the total usable space should be c. 2x 20TB.

A 20TB drive is c. 20 x 10^12 bytes, but zpool list shows sizes in TiB (i.e. multiples of 2^40 bytes), and 20 x 10^12 bytes is c. 18.2 x 2^40 bytes, i.e. c. 18.2 TiB.

So 2x 20TB = 2 x 18.2 TiB = 36.4 TiB, and indeed this is exactly what zpool list is showing. So the ZFS pool statistics, which are based on actual block counts, seem right.
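
You can check the conversion yourself, for example with bc:

  # 20 TB (decimal) expressed in TiB: 20 * 10^12 bytes / 2^40
  echo "scale=2; 20*10^12 / 2^40" | bc      # prints 18.18

  # two data disks' worth (usable space of a 3-wide RAIDZ1, ignoring overheads)
  echo "scale=2; 2 * 20*10^12 / 2^40" | bc  # prints 36.37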

The TrueNAS UI is showing dataset statistics, and it seems to be established that after an expansion these are shown incorrectly - and the iX Jira ticket referred to by @stux suggests that this is actually a ZFS mis-calculation and not a TrueNAS one. To be fair, because of compression and de-dup/block cloning, calculating actual space usage by ZFS file systems is much more difficult than calculating it by blocks. But I am really not sure that the available space needs to be based on what Datasets reports; it could instead be based on what the pool reports, modified by any quotas that are set. My suggestions for iX would be as follows:
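
If you want to compare the two sets of numbers yourself, the pool-level (block) figures and the dataset-level figures come from different commands; -p gives exact byte counts:

  # pool-level accounting: raw blocks, including parity
  zpool list -p Media

  # dataset-level accounting, which is what the UI widgets are built from
  zfs list -p -o name,used,avail,refer -r Media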

  1. On the Storage summary screen, which shows space usage by pool, use the zpool statistics for the Usage widget rather than the ZFS dataset statistics.

  2. Getting things right on the Dataset summary page is much more difficult. You can look up the actual space remaining in the pool through zpool, and in the absence of any dataset quotas this should also be the available space shown. If you are trying to take quotas into account when showing the available space, then you need to calculate it by subtracting the actual space used by the dataset from the quota, and taking the minimum of this and the available blocks in the pool (a rough sketch of this calculation follows below). But I think it is doable, and it would result in a more accurate (but still potentially inaccurate) estimate.
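
A minimal sketch of that calculation, assuming a hypothetical dataset Media/somedataset with a quota set (and ignoring that zpool’s FREE counts raw blocks including parity, so a real implementation would need to scale it):

  # dataset usage, its quota and the pool's free blocks, all in exact bytes
  used=$(zfs get -Hp -o value used Media/somedataset)
  quota=$(zfs get -Hp -o value quota Media/somedataset)   # 0 means no quota is set
  poolfree=$(zpool get -Hp -o value free Media)

  if [ "$quota" -gt 0 ] && [ $((quota - used)) -lt "$poolfree" ]; then
      echo $((quota - used))   # the quota is the tighter limit
  else
      echo "$poolfree"         # otherwise the pool's free space is the limit
  fi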

Edit: Jira ticket NAS-132559 created.

Edit: The zpool status and lsblk outputs also seem right for a pool created using the CLI and expanded using the UI. As I understand it, the UI always creates partitions rather than using whole disks, and always uses UUIDs, whilst the CLI does neither of these by default. I don’t think that the current labels or whole-disk usage should create any problems, but others may have better knowledge than me on this.
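
If you want to double-check how each member is laid out, something like this shows whole-disk vs partition members (device names taken from the lsblk output above):

  # show the pool members with full device paths
  zpool status -P Media

  # show partition table type and partition UUIDs for the three disks
  lsblk -o NAME,PTTYPE,PARTTYPENAME,PARTUUID /dev/sda /dev/sdb /dev/sde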
