What's the safest way to transition from CORE to SCALE without losing data stored on pools?

I use a USB drive for quick backups:

  • attach drive
  • import pool
  • start replication
  • export pool
  • disconnect drive

It's easy and fast, and I have done it many times… But once I hit an error during the process (probably due to a random USB disconnection): the destination pool became degraded/corrupted, and I was unable to do anything (neither export the pool nor disconnect the disk) until I rebooted the server…
In my case I was transferring less than 100 GB; I don't want to think about transferring TBs of data and how many things can go wrong.
This is (mainly) why I set up another cheap server (I spent literally less than 100€), and I use it for weekly replication of data.
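
For anyone who wants to see the attach/import/replicate/export cycle above as commands, here is a rough sketch. It assumes plain zpool/zfs commands rather than a TrueNAS replication task in the UI, and the pool and snapshot names (tank, backup, manual-1) are hypothetical. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch only: tank, backup and manual-1 are hypothetical names.
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

usb_backup() {
    src="$1"; dst="$2"; snap="$3"
    # Import the pool that lives on the freshly attached USB drive.
    run zpool import "$dst" || return 1
    # Take a recursive snapshot of the source pool.
    run zfs snapshot -r "${src}@${snap}" || return 1
    # Full replication stream into a dataset on the USB pool.
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ zfs send -R ${src}@${snap} | zfs receive -F ${dst}/${src}"
    else
        zfs send -R "${src}@${snap}" | zfs receive -F "${dst}/${src}" || return 1
    fi
    # Export before unplugging so the pool is cleanly closed.
    run zpool export "$dst"
}
```

Calling `usb_backup tank backup manual-1` prints the four steps in order. Setting DRY_RUN=0 and running as root would execute them for real, so check the names first.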

If the destination pool got corrupted, then it would not come back after a reboot.

I have a feeling that what is happening is that the default failmode property for a zpool is wait. You can get the current value using sudo zpool get failmode pool-name. According to man zpoolprops:

failmode=wait|continue|panic - Controls the system behaviour in the event of catastrophic pool failure. This condition is typically a result of a loss of connectivity to the underlying storage device(s) or a failure of all devices within the pool. The behavior of such an event is determined as follows:

  • wait - Blocks all I/O access until the device connectivity is recovered and the errors are cleared with zpool clear. This is the default behaviour.

  • continue - Returns EIO to any new write I/O requests but allows reads to any of the remaining healthy devices. Any write requests that have yet to be committed to disk would be blocked.

  • panic - Prints out a message to the console and generates a system crash dump.

So next time this happens, try issuing a sudo zpool clear pool-name and see if it recovers the pool.
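
As a sketch of that check-then-clear sequence (the pool name is a hypothetical argument, and by default this only prints the commands; run it as root or prefix the real commands with sudo):

```shell
#!/bin/sh
# Sketch only: the pool name passed in is hypothetical.
# DRY_RUN=1 (the default here) prints commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

try_clear() {
    pool="$1"
    # Show how the pool reacts to catastrophic failure (wait|continue|panic).
    run zpool get failmode "$pool"
    # Show the pool's health before attempting anything.
    run zpool status -x "$pool"
    # Clear the error state once the device is reconnected.
    run zpool clear "$pool"
    # Check the health again to see whether I/O has resumed.
    run zpool status -x "$pool"
}
```

With failmode=wait, `zpool clear` can only help once the USB device is visible to the system again, so reconnect the drive before trying it.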

However, your description of why you stopped using USB devices for backups is exactly the kind of behaviour to expect from USB drives, and hence why there are risks in using a USB SSD as a boot device.

I guess I’m living on the edge.

I use USB enclosed drives (WD “white label” and Seagate Exos) for simple offsite, cold storage backups.

They don’t run continuously, and are only powered on when I physically plug them into the server to run a replication backup.

What makes them “USB” or “external” is the enclosure, whereas the drives themselves would be considered proper for a NAS server.

You know these are excellent because they are “ultra-fastt”.

WD “white label” drives are quite often SMR (or at least the 4TB or smaller ones are).

And although the drives themselves may be “enterprise” drives, the “enclosure” contains e.g. a USB to SATA bridge, which is a whole extra layer of technology that may or may not be “enterprise” quality, by which I mean highly reliable and fully functional.

I am NOT saying that USB enclosed drives should not be used for backup as single-drive zfs pools (or indeed ext4 formatted disks), just that there are very good reasons to avoid using them for live data storage, and to try to avoid using one for a boot drive.

All of mine are 8TB+.


Once the initial full replication is complete, subsequent incremental replications barely add to the drive’s “poweron” hours, and do not put the USB controller under much stress. So once you get past the first full replication, it’s fairly smooth sailing from then on for the occasional incremental backups.
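
To illustrate why the incrementals are so light: only the blocks changed between two snapshots get sent. A minimal sketch, assuming plain zfs commands and hypothetical names (tank, backup, weekly-01, weekly-02); it prints the pipeline rather than running it.

```shell
#!/bin/sh
# Prints (does not execute) an incremental replication pipeline.
# All four arguments are hypothetical names.
incremental_cmd() {
    src="$1"; dst="$2"; prev="$3"; new="$4"
    # -I sends every snapshot between prev and new; -R keeps child datasets.
    printf 'zfs send -R -I %s@%s %s@%s | zfs receive -F %s/%s\n' \
        "$src" "$prev" "$src" "$new" "$dst" "$src"
}

incremental_cmd tank backup weekly-01 weekly-02
# prints: zfs send -R -I tank@weekly-01 tank@weekly-02 | zfs receive -F backup/tank
```

Note that `zfs receive -F` rolls the destination back to its latest snapshot before receiving, so anything written directly to the backup pool since then is lost.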

Yes - this is an excellent use of ZFS replication. :smiley:

What can probably give more stability, at the cost of speed, is using USB 2.0 instead of 3+… Does that make sense?

USB2 instead of USB3 would make sense, however I haven’t worked out how to do it. So if anyone knows how, I would love to know…

Why is USB 2.0 more stable? Isn’t the problem disconnects due to the connectors, not anything related to the protocol?

It is the polling protocol. Connectors are fine.

Plug into a physical USB 2.0 port whenever possible. Most motherboards still have a pair of USB 2.0 ports for keyboard+mouse and/or a 9-pin header.
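
To confirm which speed the enclosure actually negotiated after moving it to another port, something like the following may help. It is a sketch that assumes CORE (FreeBSD), where `usbconfig` lists each device with a `spd=` field; the helper name usb_speeds is made up for this example.

```shell
#!/bin/sh
# usb_speeds is a hypothetical helper: it reads `usbconfig` output on stdin
# and prints "<ugen id> <speed class> <rate> <device description>" per device.
usb_speeds() {
    sed -n 's/^\(ugen[0-9.]*\): <\(.*\)> at .* spd=\([A-Z]*\) (\([^)]*\)).*/\1 \3 \4 \2/p'
}

# Usage on a FreeBSD/CORE box (as root):
#   usbconfig | usb_speeds
# A device on a USB 2.0 port should report HIGH (480Mbps) rather than SUPER.
```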

I just noticed another problem: I did a replication FROM USB to my pool (in this case I couldn't do anything different, there are no more SATA ports available); I exported the pool without problems, but the disk has still been showing up there for days:

but this disk is not present in camcontrol devlist

<WDC WD10EFRX-68FYTN0 82.00A82> at scbus0 target 0 lun 0 (pass0,ada0)
<WDC WD10EFRX-68FYTN0 82.00A82> at scbus1 target 0 lun 0 (pass1,ada1)
<WDC WD10EFRX-68FYTN0 82.00A82> at scbus2 target 0 lun 0 (pass2,ada2)
<WDC WD10EFRX-68FYTN0 82.00A82> at scbus3 target 0 lun 0 (pass3,ada3)
<WDC WD10EFRX-68FYTN0 82.00A82> at scbus4 target 0 lun 0 (pass4,ada4)
<WDC WD10EFRX-68FYTN0 82.00A82> at scbus5 target 0 lun 0 (pass5,ada5)
<AHCI SGPIO Enclosure 2.00 0001> at scbus6 target 0 lun 0 (pass6,ses0)
<Min Yi U YZWY_TECH 0> at scbus7 target 0 lun 0 (da0,pass7)

I didn’t try (and honestly, want to avoid) a reboot. Is there something else I can do?
I tried connecting another disk, but it is not recognized (the system keeps proposing to create a pool with that ghost disk).