I use TrueNAS Scale ElectricEel-24.10.2.1. My machine has 8 drive bays, and currently only 4 of them are in use, in a RAIDZ1 setup.
Recently the pool's used space reached 80%. My original plan was to use the RAIDZ expansion feature to add one more drive and move to a 5-drive RAIDZ1.
However, since half of the drive bays are still empty, I figured this might be a chance to upgrade to RAIDZ2.
I know there is no way to simply “upgrade” from RAIDZ1 to RAIDZ2, so I have to migrate the data manually. But I don’t know the best way to do it.
My thought is to offline one drive from the RAIDZ1 and back up the data, just in case anything goes wrong. Then create a RAIDZ2 setup with that drive and 4 newly bought drives.
After that I’ll copy the data from the degraded RAIDZ1 pool to the new RAIDZ2 pool. Maybe zfs send can achieve this? (Roughly as sketched at the end of this post.)
When the data migration is done, destroy the old RAIDZ1 pool and add its 3 remaining drives to the RAIDZ2 using RAIDZ expansion.
Most of my data is redownloadable media files. Important personal files have backups elsewhere. But if I lost the media it might be very difficult to get it again, because most of the download links are dead. So I consider this migration worth the effort, even though it might take a week or more given that there is about 20TB of data.
I don’t have other storage that is large enough to use as temporary space to make this migration smoother. So I am asking for your advice: are there any better ways to do this?
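In case it helps, here is roughly what I imagine the copy step would look like (pool names are placeholders; I have not run any of this):

    zfs snapshot -r oldpool@migrate
    zfs send -R oldpool@migrate | zfs recv -F newpool    # -R sends all child datasets, snapshots and properties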
Afaik, and as you already pointed out, you need to recreate the pool.
Full-filesystem, recursive replication is what I used when I performed this operation (going from a 3x raidz1 to a 6x raidz2). Also worth mentioning: you want the new pool to take the name of the old one so that it inherits every possible setting, so take care not to delete the saved settings when exporting the old pool.
To avoid using a degraded pool, and because I literally had every SATA port occupied, I
replicated the data to a spare disk over USB that could hold everything,
exported and recreated the pool,
then replicated everything again into the new pool.
This was possible because my data was under 2TB in total. But to avoid USB, and if you need more space, using another machine is a good option too: remote replication between two systems on the same LAN is not much harder to set up than a local replica (rough commands below), and if you have more spare disks you can also build a more reliable temporary backup pool than the striped one I used.
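As a rough idea of the remote variant (the host name, pool and dataset names here are just placeholders, and the GUI replication tasks can do the same job):

    # push everything to a temporary pool on another box over SSH
    zfs snapshot -r tank@stage1
    zfs send -R tank@stage1 | ssh root@backupbox zfs recv -Fu temppool/tank
    # ...destroy and recreate the local pool as raidz2, reusing the old name "tank" so settings still match...
    # then pull the data back
    ssh root@backupbox zfs send -R temppool/tank@stage1 | zfs recv -Fu tank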
Yes, but the easiest way is to set up a replication task in the GUI and run it manually.
As described, you do not have enough storage for a full backup or for a direct move to the new pool in its final geometry, so some hacking and/or risk-taking will be involved.
Your current plan involves a degraded raidz1 with no redundancy; any incident at that point could lose some data. I would suggest creating a 5-wide raidz2 from the 4 new drives plus a sparse file, offlining the sparse file, replicating from the old raidz1 to the degraded raidz2 (single redundancy all along), then destroying the old pool and reusing its drives for one resilver and three expansions to end up with an 8-wide raidz2. All local, but it involves the CLI - a rough sketch follows.
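In rough CLI terms the sequence could look like this - every name below is a placeholder, and raw device names are used only to keep it readable (on TrueNAS you would normally mimic its partition layout instead, as in the partition-based sketch further down):

    truncate -s 8T /root/sparse.img                    # sparse placeholder at least as big as a real member; almost nothing is ever written to it
    zpool create newpool raidz2 sde sdf sdg sdh /root/sparse.img
    zpool offline newpool /root/sparse.img             # now a degraded 5-wide raidz2 with single redundancy
    zfs snapshot -r oldpool@move
    zfs send -R oldpool@move | zfs recv -Fu newpool    # both pools keep single redundancy throughout
    zpool destroy oldpool                              # only after verifying the copy
    zpool replace newpool /root/sparse.img sda         # resilver one old drive in place of the sparse file (use the name/guid shown by zpool status)
    zpool attach newpool raidz2-0 sdb                  # then one raidz expansion per remaining old drive, waiting for each to finish
    rm /root/sparse.img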
Degrade your existing RAIDZ1 pool - my recommendation is NOT to move to an interim non-redundant configuration, because you risk complete data loss if a drive fails during the migration, and you also risk individual files being uncorrectable if you hit a checksum error.
Get your stored data down to c. 13TB by removing checkpoints and snapshots and, if necessary, offloading or deleting re-downloadable data, so that it will fit onto a new 4x RAIDZ2 pool.
Alternatively, create a degraded 5x RAIDZ2 pool using the 4 new disks (which effectively has the same single redundancy as your existing 4x RAIDZ1) and migrate to that - both the old and new pools keep single redundancy. Then, when you destroy your old pool to add its disks to the new pool, the first action is to restore the second level of redundancy by resilvering.
The disadvantage is that you will need to create the new pool manually from a command shell, using the 4 new drives plus a sparse file (and then degrade it by offlining and deleting the sparse file), and you will want to do this in a way that mimics how TrueNAS would do it, i.e. with partitions, buffer space and partuuids. That means working out your partition sizes, manually creating the partitions and partuuids, creating a sparse file on the existing pool, and then manually creating the pool from the 4 partuuids (/dev/disk/by-partuuid/XXXXXX) plus the sparse file - see the sketch below. Assuming you don’t screw up the existing pool, getting this wrong and trying it multiple times shouldn’t be an issue.
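A hedged sketch of what that could look like - the partition commands, type code and sizes below are illustrative only, so inspect an existing pool member first and copy whatever layout your TrueNAS version actually uses:

    sgdisk -p /dev/sda                                 # note the partition layout on an existing pool member
    for d in sde sdf sdg sdh; do
        sgdisk --zap-all /dev/$d
        sgdisk -n 1:0:0 -t 1:bf01 /dev/$d              # one ZFS data partition per new disk; adjust to match the real layout
    done
    ls -l /dev/disk/by-partuuid/                       # collect the four new partuuids
    truncate -s 8T /mnt/oldpool/sparse.img             # sparse file at least as large as the data partitions
    zpool create -o ashift=12 newpool raidz2 \
        /dev/disk/by-partuuid/UUID1 /dev/disk/by-partuuid/UUID2 \
        /dev/disk/by-partuuid/UUID3 /dev/disk/by-partuuid/UUID4 \
        /mnt/oldpool/sparse.img
    zpool offline newpool /mnt/oldpool/sparse.img      # degrade it by taking the file vdev offline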
If you decide to go down this route, then deciding what your commands should be and running them past the community here might be a good way to get some QA.
Aside: Wouldn’t it be great if someone wrote a script which did this degraded RAIDZ pool creation in a completely safe way?
Migration
Since you are going to migrate your data anyway, remove any checkpoints and snapshots on your current pool in order to reduce the space you are using.
Before you migrate your data, you should manually run a scrub to verify the checksums. This is especially important if you still decide to move to a degraded RAIDZ1, as this will be your last chance to correct any errors before you destroy the redundancy and, with it, the ability to do so.
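For example (the pool and snapshot names here are hypothetical):

    zpool checkpoint -d oldpool                 # discard the pool checkpoint, if one exists
    zfs list -t snapshot -r oldpool             # see which snapshots are holding space
    zfs destroy -r oldpool@auto-2024-01-01      # remove the ones you no longer need
    zpool scrub oldpool
    zpool status -v oldpool                     # wait until the scrub completes with no errors before replicating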
As @oxyde has said, TrueNAS replication (which is zfs send under the hood) is the best means of copying data from one pool to another. Depending on whether you degrade the new pool or delete data, you will either be writing 20TB to a 5x RAIDZ2 or c. 13TB to a 4x RAIDZ2, and either of these should take c. 10 hours to replicate internally. So not an excessive amount of time.
Expansion
Since you will be doing several expansions, you should try to be on Fangtooth, which has the full ZFS 2.3 release and therefore includes the expansion speed fix. The Fangtooth full release is not far away, so it is probably worth waiting for it before starting this - but first full releases often have some bugs, especially around newer functionality (e.g. containerisation or VMs), so evaluate that risk for your own situation.
Each expansion will then take a decreasing amount of time. If you go the reduced-data route, the first expansion from 4x RAIDZ2 to 5x RAIDZ2 with 13TB of data should need to write c. 4.5TB to the new drive, so that should take c. 7 hours (though this could be a low estimate). Assuming that subsequent expansions are on the full 20TB of data on a 5x RAIDZ2, the remaining expansions might take c. 7 hours, 6 hours and 5 hours respectively.
At the end of this you will have all your data written with either 2 or 3 data blocks per 2 parity blocks. Assuming you go the degraded-RAIDZ2 route, 6 full records will contain 18 data blocks and 12 parity blocks, i.e. 30 blocks to store 18 blocks of data. If you rewrite the data with a rebalancing script, 3 records of 6 data blocks and 2 parity blocks will use 24 blocks to store the same 18 blocks of data, so you save 6 out of every 30 blocks and reduce the actual disk space used by your data by 20%. It will mean rewriting c. 3.3TB per drive, which should take c. 5 hours.
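A very rough sketch of the rewrite-in-place idea (the community rebalancing scripts do this more carefully) - the path is a placeholder, this naive loop breaks hard links, should not touch files that are in use, and the old copies stay referenced by any snapshots:

    find /mnt/newpool/media -type f -print0 | while IFS= read -r -d '' f; do
        cp -a "$f" "$f.rebalance" && mv "$f.rebalance" "$f"    # rewriting the file lays it out on the new 8-wide geometry
    done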
Note: All these time estimates are extremely rough, based solely on the new drive accepting 200MB/s sustained writes and ZFS actually writing at that speed, with no allowance for seek times or metadata / TXG overheads. You should probably allow double this - so you can likely achieve two replications/resilvers/expansions/rebalances per day, for a total of (say) 3 days to complete the whole process.
Creating the pool manually from the shell and setting up all those parameters might not be the ideal way for me - at least not now, with my limited knowledge of the CLI.
Maybe I will add one drive to the RAIDZ1 pool for now and upgrade to a new storage pool with bigger drives later. That way the data migration will be easier.