Creating a pool without formatting one drive (ZFS-formatted)

Greetings, I’m brand new not only to the forum, but to NAS in general.

The final parts for my first NAS are on their way, and while I wait I’m trying to figure out how to start once everything arrives.

My current situation:
So far I’ve stuck to data drives in my desktop PC. At the moment my media is on a 20TB drive (~15TB in use, ZFS).

The goal:
Two additional 20TB drives are on the way, and I want to go for RAIDZ1, resulting in 40TB of usable storage with one parity drive.

The question:
How can I achieve my goal?

  • Create a pool with 2 drives, copy the data, and add the 3rd drive?
  • In my search someone mentioned the idea of creating a degraded pool and adding the third drive with data to it? (TrueNAS Forum post)

As I mentioned, I’m very new to NAS, TrueNAS, and ZFS, and would appreciate some help.

Thanks for reading!

You can’t do the first of these - the minimum number of drives (or files) for a RAIDZ1 is 3.

The concept in the second is right - you create a pool using 2 physical drives and a sparse file acting as a 3rd pseudo-drive, then you delete the sparse file, making the pool degraded. If you run zpool status you will see that the pool is a degraded RAIDZ1. That means that when you have a 3rd drive available (i.e. once your data has been copied off the old drive), you can replace the broken pseudo-drive with a real physical drive, let it resilver, and then you will have a proper RAIDZ1 pool.

I have NOT done this myself, so wait until someone else can verify these instructions, but I think you need to:

  1. sudo -i
  2. cd /tmp
  3. truncate -s 19T sparsefile
  4. zpool create new-pool -o ashift=12 raidz1 /dev/sda /dev/sdb /tmp/sparsefile (substituting your two new drives’ actual device names)
  5. zpool offline new-pool /tmp/sparsefile
  6. rm /tmp/sparsefile
  7. zpool status -v
  8. zpool export new-pool
  9. exit

Now you can go into the UI and:

  1. Import the new-pool
  2. Use replication to copy the old-pool to the new-pool (do not set read-only); a rough CLI equivalent is sketched after this list.
  3. Once you are happy that all the data is on the new pool, you can switch over your SMB shares etc. to the new location.
  4. When you are completely happy that your data is in the new-pool and that you have access to it, destroy the old-pool.
  5. Use the now spare disk to Replace the offline pseudo-disk in the new-pool and resilver the redundancy data to it.
  6. Once the resilver is complete you should be done.
  7. I believe that pools are created with auto-expand enabled, so once the resilver has completed the pool should auto-expand to fill the drives; if not, we can help you achieve this through the UI or from a shell - see the sketch below.
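
To make steps 2, 5 and 7 a bit more concrete, here are rough CLI equivalents - again untested on my side, and the snapshot name, dataset name and partuuid below are just placeholders; on TrueNAS you would normally do the replication and the replacement through the UI:

# Step 2 - replicate everything from the old pool to the new pool
zfs snapshot -r old-pool@migrate
zfs send -R old-pool@migrate | zfs recv -F new-pool/old-data

# Step 5 - after destroying old-pool, swap the deleted sparse file for the old drive
# (assuming the old drive has been given a GPT ZFS partition)
zpool replace new-pool /tmp/sparsefile /dev/disk/by-partuuid/<uuid-of-old-drive-partition>
zpool status -v new-pool                   # watch the resilver progress

# Step 7 - check/enable auto-expand and grow the vdev if needed
zpool get autoexpand new-pool
zpool set autoexpand=on new-pool
zpool online -e new-pool /dev/disk/by-partuuid/<uuid-of-old-drive-partition>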

As I said, this is my guess based on research and experience - I have not done this myself, so get a 2nd opinion.

We wouldn’t generally recommend this; the risk of data loss during a resilver is significant. But if you want to do it, this is currently the only way.

Alternatively, wait for the release of 24.10, which will include both RAIDZ expansion and the ability to create a two-disk RAIDZ1 vdev.
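
If you do go that route, the expansion itself (once 24.10 / OpenZFS 2.3 is out) is expected to be a single attach of the new disk to the existing RAIDZ vdev - roughly like the sketch below, with the vdev name and partuuid as placeholders; I haven’t run this myself:

zpool status new-pool                      # note the RAIDZ vdev name, e.g. raidz1-0
zpool attach new-pool raidz1-0 /dev/disk/by-partuuid/<uuid-of-third-drive>
zpool status -v new-pool                   # the expansion runs in the background, much like a resilver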

Partition the disks and use the partuuids rather than /dev/sda and such. Other than that, I think you’ve got it.

This is a good point. During a resilver, ALL the drives in the vDev are put under heavy load, which increases the risk of another drive failing during the process - and the longer the resilver takes, the greater that risk. How long a resilver takes depends on the number of drives in the vDev and their size (or the amount of data in the pool). Even for 3-wide vDevs, if you have large drives (and 20TB drives are LARGE), RAIDZ2 is recommended.

Why? (Genuine question…)

First, for consistency with the way the TrueNAS middleware creates pools. Second, because particularly in Linux, drive identifiers change a fair bit, and if sdc is listed as a pool member, and it becomes sdd, that can confuse ZFS. Partition UUIDs are the Linux flavor of gptids and avoid that problem, because they’re static and unique for a given partition.

I agree with the principle to use UUIDs.

Can you use disk UUIDs which you can get from ls -l /dev/disk/by-partuuid?

What you get there are partition UUIDs, and that’s what you should use.

Disk UUIDs are likely going to have the same effect, but partition UUIDs are what TrueNAS has been using since pretty much forever.

D’oh. These are the partition ids.

And I can’t get disk UUIDs using e.g. lsblk or ls -l /dev/disk/by-x.

So yes, you need to create GPT ZFS partitions on the new drives as @Dan says, and when one of us works out how to do this, we can repost the above instructions amended as appropriate.

I don’t have a test environment handy at the moment, but adapting from Manual disk replacement in TrueNAS SCALE | Dan's Wiki, I believe it would be parted /dev/sda mklabel gpt followed by parted /dev/sda mkpart zfs 1 100%. You can then get the partition UUID by running lsblk -o NAME,SIZE,PARTUUID.
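
Putting that together with the earlier instructions - still untested, and with the device names and UUIDs as placeholders - the amended steps might look something like:

# Partition the two new drives (assumed here to be sda and sdb - check yours first)
parted -s /dev/sda mklabel gpt
parted -s /dev/sda mkpart zfs 1MiB 100%
parted -s /dev/sdb mklabel gpt
parted -s /dev/sdb mkpart zfs 1MiB 100%

# Note the PARTUUID of each new zfs partition
lsblk -o NAME,SIZE,PARTUUID /dev/sda /dev/sdb

# Create the degraded pool from the partition UUIDs plus the sparse file
truncate -s 19T /tmp/sparsefile
zpool create new-pool -o ashift=12 raidz1 \
    /dev/disk/by-partuuid/<uuid-of-sda1> \
    /dev/disk/by-partuuid/<uuid-of-sdb1> \
    /tmp/sparsefile
zpool offline new-pool /tmp/sparsefile
rm /tmp/sparsefile
zpool export new-pool                      # then import it from the UI as before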

Or create a pool with three sparse files and then replace two with real disks from the GUI, letting the middleware deal with partitions and labels?

What happens when you have a partition of 20TB and a sparse file which is smaller? Can ZFS create a file system that uses only part of a partition? (I can’t see any technical reason why not. In Linux, to grow a file system you first grow the partition - the existing file system runs just fine - and then you grow the file system as a separate step.)

Actually, this might indeed be a pretty good idea. It does seem best to use the UI to create vDevs exactly as you might wish, and this would be the way to do so.

I might have to create a VM with TN and try it.

RAIDZ vdevs are based on the size of the smallest device, so this isn’t a problem. It’d be the same as if you created a vdev with two 6 TB disks and one 4 TB disk–you’d get 8 TB capacity (less overhead etc.), and if you replace the 4 TB disk with a 6 TB one, it expands to 12 TB.
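
If you want to see that behaviour without real disks, a quick throw-away test with sparse files (sizes picked arbitrarily) should show the same thing:

truncate -s 6T /tmp/d1 /tmp/d2
truncate -s 4T /tmp/d3
# altroot so the test pool mounts under /mnt rather than on the read-only root
zpool create -o altroot=/mnt demo raidz1 /tmp/d1 /tmp/d2 /tmp/d3
zpool list demo            # SIZE is roughly 12T raw, i.e. 3 x the 4T smallest member
zfs list demo              # AVAIL is roughly 8T usable, matching the arithmetic above
zpool destroy demo
rm /tmp/d1 /tmp/d2 /tmp/d3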

Interesting thought. Would the GUI even let you import such a pool, though? Let’s try. I can create the pool:

  pool: testpool
 state: ONLINE
config:

	NAME                           STATE     READ WRITE CKSUM
	testpool                       ONLINE       0     0     0
	  raidz1-0                     ONLINE       0     0     0
	    /mnt/software/sparsefile1  ONLINE       0     0     0
	    /mnt/software/sparsefile2  ONLINE       0     0     0
	    /mnt/software/sparsefile3  ONLINE       0     0     0

errors: No known data errors

But the pool doesn’t show in the GUI.

If I export it, it doesn’t come up if I run zpool import, and it doesn’t come up if I click “Import Pool” in the GUI either.

So what happened to it when you exported it?

Does it still exist at all?

I’ve since deleted the sparsefiles, but I’d expect I could still have imported it from the CLI using the -d flag to tell it to look for the pool devices in the directory where I created the sparsefiles. Of course, the GUI doesn’t have any way to deal with that (nor should it, really).

Edit: yep:

root@nas[/mnt/software]# truncate -s 4T sparsefile1
root@nas[/mnt/software]# truncate -s 4T sparsefile2
root@nas[/mnt/software]# truncate -s 4T sparsefile3
root@nas[/mnt/software]# zpool create -o altroot=/mnt testpool raidz1 /mnt/software/sparsefile1 /mnt/software/sparsefile2 /mnt/software/sparsefile3
root@nas[/mnt/software]# zpool export testpool
root@nas[/mnt/software]# zpool import -d /mnt/software testpool
cannot mount '/testpool': failed to create mountpoint: Read-only file system
Import was successful, but unable to mount some datasets
root@nas[/mnt/software]# zpool export testpool
root@nas[/mnt/software]# zpool import -d /mnt/software -o altroot=/mnt testpool
root@nas[/mnt/software]# zpool status testpool
  pool: testpool
 state: ONLINE
config:

	NAME                           STATE     READ WRITE CKSUM
	testpool                       ONLINE       0     0     0
	  raidz1-0                     ONLINE       0     0     0
	    /mnt/software/sparsefile1  ONLINE       0     0     0
	    /mnt/software/sparsefile2  ONLINE       0     0     0
	    /mnt/software/sparsefile3  ONLINE       0     0     0

errors: No known data errors