TrueNAS / ZFS: USB HDD connection not stable

I have a small Odroid-H3 based TrueNAS system with a boot SSD and one 4 TB HDD, which I extended with a 6 TB HDD from a friend. Its primary use is as a media system. The setup was working very well, so I did not bother to change or improve it.
The 6 TB HDD recently started raising errors, so I decided to buy two new 6 TB HDDs. Now I am trying to create a robust setup by changing the disks and creating one striped pool. My approach is to replace the faulty HDD using a USB hard-drive docking station. Once this succeeds I can swap the faulty drive occupying the SATA slot with the drive on USB.

After connecting the USB hard-drive docking station with the 6 TB HDD to my Odroid-H3, I notice that a disk.sync_all job runs every few seconds.

The output of lsblk -o NAME,SIZE,FSTYPE,TYPE,MOUNTPOINT:
NAME           SIZE  FSTYPE      TYPE   MOUNTPOINT
sda            5.5T              disk
└─sda1         5.5T  zfs_member  part
sdb            3.6T              disk
└─sdb1         3.6T  zfs_member  part
sdc            5.5T              disk
zd0             32G              disk
nvme0n1      931.5G              disk
├─nvme0n1p1      1M              part
├─nvme0n1p2    512M  vfat        part
├─nvme0n1p3    915G  zfs_member  part
└─nvme0n1p4     16G              part
  └─nvme0n1p4   16G  swap        crypt

From this list, sda is the drive giving errors and sdc is the new 6 TB HDD.
The sdc connection is not stable; it seems to go on and off. Maybe this is related to disk.sync_all?
I tried to give the HDD a GPT partition table with fdisk (first g, then n):

root@truenas[~]# fdisk /dev/sdc

Welcome to fdisk (util-linux 2.38.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.

fdisk: cannot open /dev/sdc: No such file or directory

Occasionally this succeeds, but I don’t understand what is making the connection unstable. Sometimes ‘Replace’ (via Storage → Manage Devices) shows a selectable ‘sdc’, but when I select it and click on ‘Replace Disk’ I get errors like:
Error: [EFAULT] Failed to wipe disk sdc: [Errno 2] No such file or directory: ‘/dev/sdc’
Error: [EFAULT] Unable to retrieve disk details for ‘sdc’
Error: [ENOENT] options.disk: Disk not found.

Probably the instability of the sdc connection is the cause of all this, but how can I fix it?
The USB hard-drive docking station has its own power supply, so that should not be the issue.

I have been trying to fix this for most of last week. The problem occurred on TrueNAS SCALE Cobia. Hoping to resolve the issue, I upgraded the system to Dragonfish-24.04.2.3, but that did not solve it. So I’m out of ideas now…
Help very much appreciated!

Kind Regards, Herman

Most people here would probably not call a striped pool a robust solution, but rather a risky one. 🙂

A setup with USB drives is generally not recommended, because the connection does not tend to be reliable. Your experience seems to support that.

The best way is probably to destroy the pool, create the new one, and restore from a backup. If you do not have a backup, then you should seriously consider making one, given that you plan to use a stripe.

What type of setup do you have now, a mirror? How did you add the USB drive?

Until now I only have an rsync-ed backup on a Synology DS, so no mirror. A mirrored pool would be more robust, but it also uses a lot more storage space. That’s why I prefer striping over mirroring.

I added the USB drive and rebooted the Odroid/TrueNAS to be sure it is not hot-plugged.

Firstly, I should state that there is not really sufficient information here to be certain that the advice given is correct, but I will do my best with the information provided.

As prex02 says, USB drives are neither recommended nor supported, because a USB connection can be unreliable. As you are seeing, your USB connection actually is unreliable, and this is NOT a basis on which you can resolve the issues with your system.
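
If you want to confirm that the link itself is dropping (assuming you can get a shell on SCALE), you can watch the kernel log while the dock is connected:

root@truenas[~]# dmesg -w | grep -Ei 'usb|sd[a-z]'

Repeated “USB disconnect” / “Attached SCSI disk” messages for sdc would show the device resetting, which matches the ‘No such file or directory’ errors you are getting.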

As prex02 also says, striped but unmirrored drives are not a good solution for long-term storage of your precious data. Lose a single drive and you will lose literally ALL your data in one go. Additionally, ZFS will not be able to correct any minor errors in the data caused by bit-rot or a failing drive. If you value your data, then you need to be prepared to invest extra in redundant drives.

So if a single 6 TB drive is not big enough to hold all of your data, then I would suggest that (if you still can) you return both 6 TB drives and purchase two bigger drives that will each hold all your data (plus expected growth) and which you can use to create a mirrored pool.

My advice would then be to find a temporary computer with 5+ SATA ports into which you can put your boot drive, the two old data disks and the two new data disks, and then use that system to create a new pool and migrate your data before moving the boot drive and new disks back into the original system. At a pinch you could use a system with 4 SATA ports and just one new drive, and then mirror it after you switch back to your original system.

Alternatively, if you can find another system with enough disk space, you can copy all or some of your data off in order to safely remove one or both of the existing drives.

Here is as much general information as I can provide that might help you think of a way to resolve this with only your existing system:

A 2-wide “striped” pool should (I think) consist of 2x single-drive vDevs. (sudo zpool status -v will give detailed information.)
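
For illustration only (the pool name tank and the device names here are invented), a 2-wide stripe shows both drives sitting directly under the pool, with no mirror-N grouping:

  pool: tank
 state: ONLINE
config:

        NAME      STATE     READ WRITE CKSUM
        tank      ONLINE       0     0     0
          sda1    ONLINE       0     0     0
          sdb1    ONLINE       0     0     0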

Here are the things that I think you can potentially do non-destructively on your existing system with your existing pool (but I may be wrong):

  1. Add a mirror to each of the vDevs. So if you have a spare port you can add a mirror to the 6TB drive and then remove the old failing drive (see the command sketch after this list). However this will NOT help you create a redundant mirrored pool - because with only 3x SATA ports you would need to remove the 4TB drive in order to add the second 6TB drive and mirror it.

  2. If there is enough space on one of the two vDevs to hold all the data, maybe tell TrueNAS to remove the other vDev by migrating data across (also sketched after this list). Obviously you don’t want to be moving data to the old 6TB drive, so unless you can move enough data off to another system to make it all fit on the 4TB drive, this won’t help.

  3. You could export your system configuration file and then temporarily replace your existing boot drive with a 6TB drive, plugging in 2 USB devices: one with the TrueNAS installer on it, the other to act as a temporary boot drive. (A USB stick might be useable temporarily as a boot drive.) Reinstall TrueNAS on the USB drive and import your system configuration - then you have the ability to mirror the 6TB vDev and then remove the failing drive.
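
For reference, the raw ZFS commands behind options 1 and 2 would look roughly like this - the pool name tank and the device names are placeholders, and on SCALE you should prefer the UI so the middleware stays in sync:

# Option 1: attach a new drive to a single-drive vDev, turning it into a mirror,
# then detach the failing drive once the resilver has completed
zpool attach tank sda1 sdc1
zpool detach tank sda1

# Option 2: evacuate a vDev's data onto the remaining vDev and remove it
zpool remove tank sda1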

Sorry I can’t think of any more things that might help.


Thanks for your excellent overview of my options!
Reviewing them I would like to use option 2: I can offload data down to 3.5 TB, so it should just fit on the smaller drive/vDev sdb.

How do I “tell TrueNAS to remove the other vDev by migrating data across”?
I only see 3 options: Extend / Remove / Offline
And when I click ‘Remove’ it does not offer me any options to move the data…

There is no option. ZFS will just do it.


I may look pretty stupid, and it sounds magical, but I have never done this before: do I need to choose “Remove” and acknowledge, or choose “Offline” and acknowledge, to make ZFS migrate the data from the failing vDev to the other vDev (since it is not mirrored)?

“Remove” cleanly removes the device (applies only to mirror pools, of which stripes are a degenerate case).
“Offline” takes the device off… and would fault the pool since there’s no redundancy.
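
At the CLI, I believe ZFS itself simply refuses the Offline in this situation rather than letting you break the pool - something like (tank is a placeholder pool name):

root@truenas[~]# zpool offline tank sda1
cannot offline sda1: no valid replicas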

And to be clear, I believe “fault the pool” means break it irrevocably and lose all your data (but I may be wrong and you can online the drive again and recover).

The drive has errors, but so far scrub works fine, so the pool’s topology is still OK.
So since I don’t have a mirrored setup, removing the vDev will fault the pool, so ZFS will not do it and I will have to find another way?

As @etorix said, you can do a Remove but not an Offline.

You have a mirror layout: think of a single-drive vdev as a “1-way mirror”.

You can remove any vdev in your stripe, provided that there’s enough space on the remaining drives to move the data from the drive being removed. Of course, not having redundancy means that any incident will result in data loss.
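
You can check how much space is allocated and free on each vdev before you start (pool name is a placeholder):

root@truenas[~]# zpool list -v tank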

With only 2 SATA ports, an unreliable USB connection and a disk that is already spitting errors, I think your best option is to follow @Protopia 's advice: Find any computer with enough SATA ports to plug all the drives and do the transfer there.

Just to show that it works…

Thank you all for directing me and showing me what TrueNAS SCALE can do.
I understand @Protopia 's advice is the way to go, but I don’t have a computer at hand with 5 SATA ports.
So… is it also possible to do this step by step:

  1. Power off TrueNAS SCALE server
  2. Unplug both vDevs/HDDs
  3. Copy/clone both drives (to the new HDDs) using the USB hard-drive docking station
  4. Plugin the newly copied disks
  5. Power on TrueNAS SCALE server

I am unclear what sort of drive your current boot disk is. Is it a small SATA drive or an on-board eMMC drive? In other words do you have 2 or 3 SATA ports on your MB?

As for your proposed method, presumably using some sort of non-TrueNAS disk-cloning software: as I understand it, ZFS is quite particular about e.g. disk serial numbers. I am not expert enough to know whether you can import the copies after a sector-by-sector clone, but even if you could, you would still end up with a striped pool rather than a redundant one.

IMO you really need to create a new pool and replicate your data to it.
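
At the CLI that replication is essentially a recursive snapshot plus send/receive - for example (pool and dataset names invented):

zfs snapshot -r oldpool/media@migrate
zfs send -R oldpool/media@migrate | zfs receive -u newpool/media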

@Protopia I have a small TrueNAS system with a ZFS boot SSD and 2 drives: 4 TB and 6 TB.
So the boot SSD is indeed an on-board eMMC drive.

If tricking ZFS into accepting the new drives does not work I will start off with a clean mirrored pool and rsync the data from my Synology backup.
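
For the restore I would use something along these lines - the Synology address and paths here are made up:

rsync -avh --progress admin@synology.local:/volume1/backup/media/ /mnt/newpool/media/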

So you only have 2 SATA ports on your MB (because the eMMC is presumably soldered onto the board).

Because this means that you cannot free up a SATA port by moving the boot drive to USB, and assuming that you don’t want to squeeze all the data onto the failing 6TB drive because of the risk of it failing, if you cannot squeeze all the data onto the 4TB vDev and remove the 6TB vDev then the data migration is definitely challenging to solve - indeed, finding another system with 4-5 SATA ports (or an eMMC and 3-4 SATA ports) might be the only real solution.

I can squeeze the data down to 3.5 TB, but how do I move it to the 4TB vDev?

If I can do this, then I can replace the faulty 6 TB drive with a new HDD, move the data to that new HDD, replace the 4 TB HDD with the second new 6 TB HDD, and add this last HDD to the pool - then I would have a mirrored pool?

That’s 4 SATA ports and whatever it takes to boot: a cheap SATA drive on a 5th port, a cheap NVMe in M.2, or a USB thumb drive. This is only temporary, so we do not care about the lack of long-term reliability. Your eMMC is soldered on, so that would mean a fresh TrueNAS install anyway, plus loading a configuration file.

As @etorix previously said, Remove will move the data for you - you just need to wait while it happens.
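
You can watch it from a shell; while the evacuation runs, zpool status reports its progress with a line something like this (illustrative, your numbers will differ):

root@truenas[~]# zpool status tank
remove: Evacuation of /dev/sda1 in progress since ...
    1.2T copied out of 3.5T at 120M/s, 34.3% done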

Or eMMC for boot.

But you can do it with 3 SATA ports - one for a new drive, two for the old drives. Migrate the data, move back to the original system and add a mirror only then. Since you still have the original data on the original drives, you can live without redundancy until the mirror has resilvered.