I need help with how to backup TrueNAS ZFS-pool data to multiple USB drives

Thanks, all good replies.

In the past I have tried several things:

  • Splitting the data in the pool into directory segments (so the data can be split manually to fit multiple drives of different sizes) and copying each segment separately to a different USB drive.
    – When using NTFS, it works great even with SMR drives and allows using compression at the same time.
    – When doing the copy with TeraCopy I can also use checksums to verify that the copy is OK.
    – Drives can be disconnected when not in use and stored safely.
    → But it has a lot of drawbacks; I would really like to move on to a better way of backing up.

  • If using Cobian Backup / Cobian Reflector I can also take 7-Zip compressed full + incremental backups and automate what to copy, where to copy it and how many copies to keep, issue commands after the copy (sleep drives), etc.
    – It's free, but not open source.
    – Heading to EOL if it does not find programmers to continue the project :confused:
    → Maybe someone would be interested in porting it to Linux/FreeBSD and making it part of TrueNAS? :slight_smile:
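
The manual splitting described in the first bullet can be planned ahead of time. Below is a minimal sketch, assuming the directory sizes and usable drive capacities are already known; the directory names, drive labels and sizes are made up for illustration. It uses a simple first-fit decreasing strategy, which is only one of several ways to do this and is not guaranteed to find the tightest packing.

```python
def pack_directories(dir_sizes, drive_capacities):
    """Assign directories to drives using first-fit decreasing.

    dir_sizes: {directory name: size in bytes}
    drive_capacities: {drive label: usable capacity in bytes}
    Returns ({drive label: [directory names]}, [directories that fit nowhere]).
    """
    free = dict(drive_capacities)
    plan = {label: [] for label in drive_capacities}
    unplaced = []
    # Largest directories first, so big items claim space before small ones.
    for name, size in sorted(dir_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        for label in free:
            if size <= free[label]:
                plan[label].append(name)
                free[label] -= size
                break
        else:
            # No drive has room; this directory must be split further.
            unplaced.append(name)
    return plan, unplaced


TB = 10**12  # hypothetical sizes below, in decimal terabytes
plan, unplaced = pack_directories(
    {"photos": 3 * TB, "video": 7 * TB, "docs": 1 * TB, "iso": 5 * TB},
    {"usb-8tb": 8 * TB, "usb-4tb": 4 * TB},
)
print(plan)      # which directories go to which drive
print(unplaced)  # directories that need another drive or further splitting
```

Anything that ends up in the unplaced list is a hint that either another drive is needed or that the directory should be split into smaller segments.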

I have some ideas for a script…

  • With a list of the drives used for backup, their “64-bit World Wide ID” and which backup group/plan each drive is part of.
    – This makes it possible to detect what to do with each backup drive when it is connected or needs to be mounted, etc.
    – In the first iteration, go for manual operation before planning anything automatic to run when a USB drive is connected.
  • Then use compression to split the data across multiple drives (like we used compression in the past to split data across multiple floppy disks :slight_smile:). Compression also adds a CRC so the data can be verified.
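
The drive list from the first bullet could start out as something like the sketch below. The World Wide IDs, labels and group names are invented for illustration, and how the IDs of attached drives are discovered is left out on purpose (on Linux they may show up under /dev/disk/by-id/, but some USB bridges hide them, so treat that as an assumption to check).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupDrive:
    wwid: str   # World Wide ID as reported by the OS (hypothetical values below)
    label: str  # human-readable sticker on the drive
    group: str  # backup group/plan the drive belongs to


# Hypothetical registry; in practice this might live in a small config file.
REGISTRY = [
    BackupDrive("0x5000c500a1b2c3d4", "usb-8tb-01", "set-a"),
    BackupDrive("0x5000c500a1b2c3d5", "usb-8tb-02", "set-a"),
    BackupDrive("0x50014ee2b3c4d5e6", "usb-4tb-01", "set-b"),
]


def classify_attached(attached_wwids, registry=REGISTRY):
    """Split attached IDs into known backup drives and unknown devices."""
    by_wwid = {drive.wwid.lower(): drive for drive in registry}
    known, unknown = [], []
    for wwid in attached_wwids:
        drive = by_wwid.get(wwid.lower())
        if drive is not None:
            known.append(drive)
        else:
            unknown.append(wwid)
    return known, unknown


known, unknown = classify_attached(["0x5000C500A1B2C3D5", "0xdeadbeef00000000"])
print([d.label for d in known], unknown)
```

Unknown devices are reported rather than touched, which fits the "manual first" approach of the first iteration.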

I am not sure yet whether it is a good idea to try to use something like

  • mergerfs to “combine” the disks.
    – users are requesting mergerfs be made part of TrueNAS

  • SnapRAID also looks promising, especially the comparison

  • “spanning drives” method to just add drives together; they would still need to be able to work by themselves.
    – Any RAID-0 type is out of the question.

Also thinking that maybe a file list + checksums would be good to take and use as a log/DB against which to compare changes, so incremental backups could easily be taken without needing to read all the backup disks involved.

  • This would really help, since you would not need to connect all the USB drives just to back up the changes to a new USB drive, etc.
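
A rough sketch of the file list + checksum idea, assuming the manifest from the previous backup is kept on the NAS itself (for example as a saved dictionary) so that none of the backup drives have to be connected to work out what changed. SHA-256 is an arbitrary choice here, and the function names are made up for this example.

```python
import hashlib
import os


def build_manifest(root):
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            digest = hashlib.sha256()
            with open(full, "rb") as handle:
                # Read in 1 MiB chunks so large files do not fill memory.
                for chunk in iter(lambda: handle.read(1024 * 1024), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(full, root)] = digest.hexdigest()
    return manifest


def diff_manifests(previous, current):
    """Return (new or changed paths, deleted paths) between two manifests."""
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    deleted = sorted(p for p in previous if p not in current)
    return changed, deleted


# Tiny illustration with hand-written manifests instead of real files.
old = {"a.txt": "1", "b.txt": "2", "d.txt": "9"}
new = {"a.txt": "1", "b.txt": "3", "c.txt": "4"}
print(diff_manifests(old, new))
```

Only the paths in the first list would need to go onto the next USB drive; the second list is useful for knowing which files in older backups are stale.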

Still in the alpha stage of just thinking about it, and still wondering whether someone has already thought about this in the past and solved it far better than I can even imagine…

Somewhat related …

A few years back, still in the FreeNAS days, I had a Perl program that would:

  1. take a ZFS snapshot
  2. prompt the user to mount a filesystem at the target location, for which I used a single disk. IIRC I used, at different times, ext4, zfs, and zfs with copies=2. These were SATA spinning-rust disks either plugged into eSATA ports, or using hotswap 3.5" bays
  3. use tar(1) to create a multi-volume archive, optionally with fast compression. This would write the tarball in multiple parts to the target filesystem
  4. after each part, check to see how much disk space is remaining; if it was less than the “part size”, then it would send an email to the operator that another disk was needed, and leave the tar pipeline paused
  5. the operator would be expected to unmount the disk, mount a new one in its place, and then run a command that tells the script it can continue
  6. the script then continues to write out parts to the next disk, rinse and repeat until done
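
The free-space check in step 4 is the part that decides when to pause. This is not the original Perl program, just a minimal sketch of that one decision, with a made-up function name and an example part size.

```python
import shutil


def needs_next_disk(mount_point, part_size_bytes):
    """True when the target filesystem cannot hold one more archive part."""
    return shutil.disk_usage(mount_point).free < part_size_bytes


# Example: check the current directory against a hypothetical 4 GiB part size.
PART_SIZE = 4 * 1024**3
if needs_next_disk(".", PART_SIZE):
    print("Target is nearly full: notify the operator and pause.")
else:
    print("Room for at least one more part.")
```

Sending the email and pausing the pipeline would hang off the first branch.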

Note that there is no requirement that the removable disks are the same size, or necessarily of the same filesystem type; only that there is enough space on the set of removable disks to hold the entire tarball.

If this sounds like the traditional way to use a standalone tape drive, it’s not a coincidence.

The down side is that it was only good for a full backup and was time consuming (both writing the data and doing the disk swaps); I was using it only for DR copies. Trying to restore a subset of files using this method would be quite painful.

I’ve also used Bacula to do traditional file-based backups for FreeNAS systems at a Former Employer, again where the amount of data was much greater than the size of a single drive or tape. Luckily, that was using a tape library.

I’m now back in the situation where, like the OP, this is now a problem for me again so I may have to dust things off and figure out the best current approach. I’m not currently in a position where a tape library is in the budget, so it’ll probably be back to the disk swapping approach. I suspect that cloud storage for backups would be too expensive/slow if a full recovery is needed; pulling 80TB from AWS would be upwards of USD 7500 in network fees alone.

Maybe I’ll look at adapting the removable disk strategy to be a virtual library for bacula, possibly using consolidated backups if I’m crossing a slower network link. More thought is required.

I have a large volume shared via SMB on TrueNAS SCALE and mounted on my Linux desktop. I also have a hot-pluggable SAS disk adapter plugged into USB on the desktop. To back up the SMB share on TrueNAS I run the following:

sudo tar -v --create --tape-length=2790G --file=/dev/sda /media/shared/Media/

/dev/sda is the hot pluggable 3TB (see tape-length) disk and /media/shared is the SMB share.

When the first disk is full, it will prompt you to swap the disk. Once the next disk is ready, press Enter to continue at the prompt.

Note that this uses the entire target disk as a block device. It won’t create any partition nor file system on the target disk, so don’t partition or format the backed up disks.
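
For picking a --tape-length value for a different disk size, here is a small worked calculation. It assumes that GNU tar's G suffix means 1024³ bytes (as its manual's size-suffix table describes) while drive vendors count 10¹² bytes per TB; the function name and the reserve parameter are made up for this example.

```python
GIB = 1024**3  # assumption: tar's "G" suffix counts in units of 1024^3 bytes


def tape_length_gib(capacity_bytes, reserve_gib=0):
    """Largest whole number of GiB that fits on the disk, minus a reserve."""
    return capacity_bytes // GIB - reserve_gib


nominal_3tb = 3 * 10**12  # "3TB" as printed on the label
print(tape_length_gib(nominal_3tb))     # 2793
print(tape_length_gib(nominal_3tb, 3))  # 2790
```

So 2790G leaves a few GiB of slack on a nominal 3TB disk. The real capacity of a given drive can be checked first (for example with lsblk -b) rather than trusting the label.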

To RESTORE/extract the files, connect the first backup disk and use

sudo tar xvf /dev/sda -C /

It will prompt you to swap the disk similar to the back up process.

This may not be a perfect solution for your use case but maybe you can modify it to make it work.

Been backing up to USB drives for years. One can automate it: a script can detect what is attached and zpool import the drives, then run the backup, then zpool export them. You can even use multiple sets, and based on the name of what is attached the script can do the appropriate thing. I store my backup sets at a bank. It’s not my only backup method, it’s one of them.
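
A minimal sketch of the "based on the name, do the appropriate thing" part, assuming each USB backup set is its own ZFS pool. Pools are imported and exported with zpool import and zpool export; the pool names and dataset names below are invented, and the backup step itself is deliberately left as a placeholder because the post does not say which tool is used.

```python
# Hypothetical mapping: backup pool name -> datasets that belong on that set.
BACKUP_SETS = {
    "usbset-a": ["tank/documents", "tank/photos"],
    "usbset-b": ["tank/media"],
}


def plan_for_pool(pool_name, sets=BACKUP_SETS):
    """Return the ordered steps for one attached backup pool.

    ("run", argv) steps are commands to execute; ("backup", dataset) steps
    are placeholders for whatever backup tool is actually used.
    """
    datasets = sets.get(pool_name)
    if datasets is None:
        return []  # unknown pool: leave it alone
    plan = [("run", ["zpool", "import", pool_name])]
    plan += [("backup", dataset) for dataset in datasets]
    plan.append(("run", ["zpool", "export", pool_name]))
    return plan


for step in plan_for_pool("usbset-a"):
    print(step)
```

Exporting at the end matters, since it is what makes the drives safe to unplug and carry off-site.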

I’m sick and tired of everyone spouting off “YOU MUST BACKUP!!!”, and in the same breath, “BUT ONLY TO ANOTHER ZFS SERVER!!!”. We have been backing up to “tape” for generations now, and it’s worked fine in many, if not most, properly configured scenarios. Replace “tape” with “USB HDD” and I see no reason that this wouldn’t be a perfectly acceptable backup solution. Instead of screaming at people to “DON’T DO IT!!! ONLY BACKUP TO JSTOR/ZFS-SERVER!!!” how about getting constructive with the answer?

I have been working on just such an answer for several years now, and I’m apparently not smart enough to solve the problem. I figured out a way to use unix ‘tar’ to span volumes, but I never did figure out how to store a local file to catalog that tar backup for doing subsequent incremental backups instead of a complete new tar of everything. I still think that tar has the tools necessary to do this, and if you keep two identical backup sets, then a single drive failure still won’t kill your data, and doing it this way would allow for one set to be local, and one off-site, JUST LIKE ALL THE SCREAMERS KEEP SCREAMING: “3-2-1!!!” Which is, after all, what we are TRYING to do here. :wink:
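
One thing that may be worth trying for the local catalog is GNU tar's --listed-incremental option, which keeps a snapshot file describing what was archived so that a later run only picks up changes. The sketch below only builds the command line. The paths are hypothetical, it assumes GNU tar (not the BSD tar found on some systems), and whether incremental mode and multi-volume mode behave well together on a given tar version is something to test before relying on it.

```python
def tar_incremental_argv(source_dir, archive, snapshot_file, tape_length=None):
    """Build a GNU tar command line for a (possibly multi-volume) incremental run.

    If snapshot_file does not exist yet, tar performs a full (level 0) backup
    and creates it; later runs with the same file archive only the changes.
    """
    argv = ["tar", "--create", f"--listed-incremental={snapshot_file}"]
    if tape_length is not None:
        argv += ["--multi-volume", f"--tape-length={tape_length}"]
    argv += [f"--file={archive}", source_dir]
    return argv


# Hypothetical paths: the snapshot file stays on the NAS, not on the USB disk.
print(" ".join(tar_incremental_argv(
    "/mnt/tank/data", "/dev/sda", "/root/catalog/data.snar", "2790G")))
```

Because tar updates the snapshot file on every run, keeping a copy of the file from the full backup makes it possible to take each incremental relative to that full backup instead of relative to the previous incremental.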

Sorry if this sounds angry and frustrated; it’s probably not aimed at you, unless you’re one of the many who keep telling us that we cannot have backups without another investment of $15,000+ to build an identical server, and that we are IDIOTS for trying to make backups using tools we have on hand, e.g. 100x 8TB USB HDDs. Isn’t A BACKUP, ANY BACKUP, better than NO BACKUP? ¯_(ツ)_/¯

So, if anyone knows how to configure a pile of USB HDDs (or a pile of SATA HDDs that can be cycled through a USB-SATA enclosure) and the proper syntax for using tar (or if there’s another tool to better do this) please let us know. THAT is the answer I suspect OP is looking for, and definitely the answer I’ve been looking for for 20+ years.

FWIW, there used to be a tool on Macintosh called something like Retrospect that would do exactly this, albeit I only used it with tapes, but I’m pretty sure it would have accepted a bog-standard USB-HDD as “a tape”. I have also looked into Bacula or Amanda, but never could figure out how to implement them properly, although I still think Amanda might be The Solution®. I should probably start on page 1 of Amanda’s documentation, and read all the way through…then come here and answer the question. :-p

@captain
I don’t see that as being the answer for the OP. They wanted automatic and unattended. That rules this solution out, at least based on the original question. However, I would agree that tape systems generally work great. I used them when I was working.

I do see how some other NAS would be a proper solution but not what the OP asked.

I don’t have a tested solution myself. Currently all the data that I must have still fits on a 4TB drive. But once I cross that boundary, I too will be looking for a fully automated setup.

Let me propose this as a possible way to do it…
Set a quota for a dataset that is less than the drive capacity. Then build a script to tar each dataset and place on the removable drive. One dataset per drive. But if you have lots of video content, do not tar that content. If it is video content, just manually make a backup of this content. How often does it change? Refresh the backup periodically as desired.
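
To make the quota idea concrete, here is a small sketch that derives a quota from the drive size and flags datasets that have already outgrown it. The 5% headroom, the dataset names and the sizes are all assumptions for illustration; the "used" figures would come from the NAS itself.

```python
def suggested_quota(drive_capacity_bytes, headroom_percent=5):
    """Quota that leaves some headroom below the drive's capacity."""
    return drive_capacity_bytes * (100 - headroom_percent) // 100


def datasets_over_quota(dataset_used, drive_capacity_bytes, headroom_percent=5):
    """Names of datasets whose used space no longer fits on one drive."""
    quota = suggested_quota(drive_capacity_bytes, headroom_percent)
    return sorted(name for name, used in dataset_used.items() if used > quota)


TB = 10**12  # hypothetical figures below
used = {"tank/docs": 2 * TB, "tank/photos": 77 * TB // 10, "tank/video": 9 * TB}
print(suggested_quota(8 * TB))            # quota to set for an 8TB drive
print(datasets_over_quota(used, 8 * TB))  # datasets that need splitting
```

Any dataset that shows up in the second list would have to be split into smaller datasets or moved to a larger drive before the one-dataset-per-drive scheme works.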

I guess unless making video content is your job, where it may change daily. Just some ideas.

Weird. I think I typed that up months ago. I don’t remember posting it though. I guess when the computer rebooted it posted.

Anyway, yes, just backing up subsets of data to smaller media is one solution. I am pretty sure there is a way to do “Retrospect type” backups without paying for “enterprise grade” solutions. Onward! Through the fog! :wink:

Honestly, I cannot recall anybody recommending this on this site specifically.
What others and I will recommend is that you can use any media you prefer!
You can use BD discs, you can use tape, you can use tape banks, you can use a Windows filesystem, a Linux filesystem, FreeBSD, or ZFS.
What we all tell you is that each of the above comes with its own practicalities!
For example, tape drives would be great, but the problem is, if you don't already have access to one, you have to decide which finger you want to bite:

  • Buy a modern drive with TBs or even tens of TBs of storage capacity per cartridge (but paying USD 4,000-5,000+ just for the drive and fiddling with changing the cartridges when they are full)
  • Buy a modern tape bank for USD 20,000 and don't worry any more
  • Buy an older system, trading a lower initial expense against less storage space per tape cartridge

The first two solutions are not in my budget range, and any time I calculated the full cost of such a system, I ALWAYS came to the conclusion that for the price of the tape drive alone I could buy at least 2 sets of HDDs in addition to my original data set.
So the reason I have skipped tape so far is financial.

Regarding BD discs:

  • They are relatively expensive.
  • Their storage capacity per disc is too low compared to even a cheap HDD.
  • DVD-RW and BD-RW STARTED their manufacturing quality where CDs ENDED. (If you have a CD from the '90s, it is likely that you can still read it successfully. Try this with a 3-year-old DVD or BD disc.) → This means you need to write at least two sets of the data, which doubles your cost and the time needed to run the backup.

External drives do not play well with ZFS (believe me, I tried it). Of course, you can also try it, but it will only frustrate you.

  • Using a USB thumb drive is even worse, since those are really unreliable storage devices (especially if you buy a cheap one). They can simply die just sitting on a shelf or being plugged into a PC.
  • Of course, if you have ANY filesystem that is more tolerant of the two above-mentioned failure types, feel free to use it for your backup media.
  • We just told the OP that for his/her purposes it is not the most suitable solution.
  • This has always been my point, and I still stand by it.

Yes, you can dumpster dive a brand-name office PC with a 4th-7th gen Intel Core CPU, install Windows 11 on it and attach your USB drives to it. (If you are lucky, you can even get the PC for free with its HDD still in it, and if you are luckier, there is already an activated Win10 (Pro) on it and you don't have to buy any licences, because you can install Win11 with a valid Win10 key (and, if I am right, with a Win7 key too).)
It will work as a backup target. (I would say that it will even work great.)
(My daily driver is still an i7 4770K, after some modernisations.)

I think you need to use the --multi-volume flag.

This is exactly what I used to do a ~20 HDD backup of the entire 100+TB NAS, and it seemed to work. The problem is then doing incremental backups… Which I think can be accomplished by doing a snapshot of the dataset, tar’ring that onto the pile of HDDs, then doing snapshot differentials and tar’ring those to subsequent HDDs. It’s not very clean though. :-/
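
For the "snapshot differentials" step, the list of changed files between two snapshots can be obtained from zfs diff and handed to tar. The sketch below parses the tab-separated output of zfs diff -H; the sample text is hand-written, and the handling of special characters in file names (which zfs diff escapes) is ignored here, so treat it as a starting point rather than a finished tool.

```python
def files_to_archive(zfs_diff_output):
    """Pick the paths that need to go into the next incremental archive.

    Expects lines in the tab-separated form produced by `zfs diff -H`:
    change type, path, and for renames a second path (the new name).
    """
    paths = []
    for line in zfs_diff_output.splitlines():
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # blank or unexpected line
        change = fields[0]
        if change in ("M", "+"):   # modified or newly created
            paths.append(fields[1])
        elif change == "R":        # renamed: archive it under the new name
            paths.append(fields[-1])
        # "-" (removed) needs no archiving
    return paths


# Hand-written sample in the assumed output format.
sample = (
    "M\t/mnt/tank/docs/a.txt\n"
    "+\t/mnt/tank/docs/b.txt\n"
    "-\t/mnt/tank/docs/old.txt\n"
    "R\t/mnt/tank/docs/c.txt\t/mnt/tank/docs/d.txt\n"
)
print(files_to_archive(sample))
```

The resulting list could be written to a file and passed to tar with --files-from. Directories also show up as modified, so --no-recursion is probably wanted to avoid re-archiving their whole contents.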

There used to be a standard Unix tool for that. It was called dump, and it knew how to span tapes, write catalogs of what was on the tapes, and do incremental backups (of 2 different types). It had a companion tool named restore.

Each FS type needed a different dump; here is the man page for the Linux ext2/3 dump: dump(8), “ext2/3 filesystem backup” (Linux man page).

TrueNAS does not include a dump command, I suspect because it was never ported for ZFS.

But… why the insistence on USB attached? If you have a small pile of unused SATA drives, then why not get a cheap SATA enclosure and eSATA attach them? Yes, you will probably be fighting a SATA Port Multiplier, but that is better than USB attach for lots of devices.

Reply #1 wrote exactly that! :-p

Every few years I revisit this topic to see if anyone has devised a GOOD solution to using gear that we already have on hand to accomplish a functional, if not ideal, BACKUP SOLUTION for quickly approaching petabyte datasets. Almost every time, the initial, if not only, responses are “Build a second/third NAS or use someone else’s computers.”

While I do see the value in having an identical dataset on a functional standby computer, even if I do implement such a thing I still want an offline, offsite, complete backup of the dataset.

So, since I have about 800TB of USB HDDs that contain all the original files that are now organized on one 400TB server, I’d like to repurpose those HDDs for something useful… I have considered shucking them all, and using an external SATA hot-swap caddy to use just the drives, and then put them all in a secure firesafe somewhere. But there’s something nice about having a bog standard USB HDD that can just be plugged into anything in 40 years and read.

Then again, in 40 years no one will care about any of my data! :stuck_out_tongue_closed_eyes:

Well, that is a lot. My full storage space is way below 100TB.
It is split across my main server, my local backup and my georedundant backup in another country.
I don't know how many HDDs you have or what size they are, but this solution might help you:

It is a JBOD case that is not that expensive.
For this many HDDs it might be a suitable solution.
But it will be an energy hog for sure, and you can expect your HDDs to fail one at a time.
Maybe you should sell all your HDDs and buy just a few 20TB recertified drives instead (you need about 6 to build a 100TB RAIDZ2 vdev, and you can expand that system every time you have another 6 drives together) and build a lower-footprint system.
And you can put all those drives into a 12-bay, 3U rack server.
That would be able to house about 200 TB of data, and each vdev would be able to lose any 2 drives at the same time.

Another option is to remove all those drives from their external cases and use them as internal drives. I use my main data storage like that.

(And my previous recommendation could work with a lot of HDDs; however, I doubt that Windows will be able to handle it, and you will still have issues with failing drives.)

FWIW, I took 10-14 of the 8TB USB HDDs and connected them to a Raspberry Pi. I formatted them as a btrfs volume (no RAID at all, as btrfs is apparently not doing RAID correctly yet). This has been running just fine for a few years as essentially a media server. Cheap, uses those stacks of old 8TB USB HDDs without any extra work (other than buying a couple of functional 7-port USB3 hubs). You do have to disable some subset of the USB stack to keep it stable, but no biggie. It’s still plenty fast enough to saturate the 1Gbps Ethernet and serve untouched TS video streams.

Just to try to clarify:

My reason for wanting to use these USB drives is that:

  1. I already have scores of them
  2. they should be able to behave as fairly stable “tapes”
  3. they do not require ANY $xxxx.xx proprietary “tape drives”, which would have to be bought in triplicate to maintain any certainty of future availability.
  4. they do not require a whole new pile of newfangled media, like a giant stack of LTO tapes.

Ideally, tar would write flat files to them, so that if one drive failed, any and all others could just be plugged into ANY bog standard computer to recover ANY given file.

Aside: My IDEAL would be that rsync would be able to do volume spanning, so I could just rsync every day and add a new HDD when the previous one became full!

For the moment, I actually have several 100TB+ NAS boxes of differing flavours. I’ll keep my “critical” data (photos, documents, etc.) backed up to each and all of them. But part of the point of building a 400TB+ TrueNAS SCALE box was to aggregate ALL of my historical data into ONE place, so I could (maybe/I hope/finally) organize everything, and maybe even (gods forbid! ;-)) CULL the chaff, and leave a decent legacy for whoever comes after me.

One thing that concerns me is HDD failure. I don’t think I’ve had a single Seagate 8TB BackupPlusHUB, out of well into the scores of units, fail. I have already had several Seagate Exos 24TB SATA HDDs fail just in the last year! Two in my TrueNAS (thank gods I run RAIDZ3!), and one in my RAID0 48TB DAS enclosure on the Mac is making horrible ticking noises while I aggregate data off of 5+ other random USB HDDs. I have had every Apple RAID array corrupt and fail, taking critical data and many tears with it. So, IMHO, NEVER USE APPLE RAID!!! ;-p We shall see how the OWC RAID0 enclosure fares. I’m not depending upon it yet.

That brings me to this semi-conclusion. With HDD sizes now approaching 30TB per HDD, all of this may be moot. I could buy ten of those and have more space than all of these USB HDDs… I may just make multiple btrfs mirror sets, each containing four of the 8TB USB HDDs duct taped together into a four-deep-mirrored 8TB volume, put “important” stuff on them and put them into cold storage, labeled well. And move on to the future with multiple $20k petabyte boxes that mirror each other for redundancy and failsafe volumes. :-p

PS: I don’t think that shucking is a good idea. They’re probably cheap SHINGLED HDDs and with my luck would suddenly start failing if mucked with too much. :wink:

If you mean the Apple X-Serve RAID hardware box, I used one or two of them. No better or worse than any other budget h/w RAID box of their time ( 2003-2008 ). They were very good when combined with macOS machines in a FC X-SAN.

If you mean the software mirroring built into macOS Disk Utility, I actually had Apple Support (yes, I had a support agreement) tell me that RAID functionality was not officially supported.

This thread has wandered far from the OP.

I want to jump in with a couple observations.

  1. When you ask for advice here on the Forums, you will get lots of different advice and opinions. You are free to choose which to follow and which to ignore.
  2. Using USB devices for ZFS is not recommended, for a whole bunch of technical and a few operational reasons.
  3. I have used a Seagate 5TB USB3 “portable HDD” as a backup device on my NAS (I think it was still FreeNAS at the time, although it may have been raw FreeBSD).

As long as you understand the limitations of what you are doing, and accept them, and the consequences of them, go right ahead and do what you like.

The “backup” solution I settled on for a bunch of reasons, including cost, ease of management, and reliability, is to have two separate TrueNAS systems replicate my data. I have only implemented the first of the two backup systems; it is local in my house on the same network as my primary TrueNAS. The second is in the works and will be located on the opposite coast at a friend’s house. Note that I had the backup servers already; they had been used for other projects, but they were no longer in use. I scavenged HDDs from various servers as they were taken out of service (yes, I always have a hot spare in the system and a cold spare next to it). It works for me.

Do what works for you. But, if you are not following TrueNAS (either official or “Forum”) Best Practices, then do not expect the community to be able to help when something goes wrong. Do not misunderstand me, in most cases the community will TRY to help, but if you were doing something outside the realm of what ZFS expects, there will be little the community (or anyone for that matter) can do to help.

P.S. I really, really like TAPE for backup storage, but it has limitations:

  1. Cost of entry
  2. Cost of operations
  3. Space (tape libraries can be big)
  4. Time to recover
  5. Need for often complex and expensive management software

But it has 2 very large advantages:

  1. Very well understood long term viability of data on media
  2. Low power consumption

I think this has been misunderstood several times in this thread, so let me attempt to clarify, yet again:

No one is suggesting using USB HDDs as a part of a ZFS pool. Not in any way shape or form.

We are only searching for a way to back up a properly built ZFS pool to another medium for catastrophe recovery.

again:
TrueNAS IS a ZFS-based system; as @PK1048 wrote above, this forum is for ZFS-related topics.
Using any other filesystem together with TrueNAS is only partially supported or not supported at all.
Also, using USB drives for a ZFS pool is not recommended.
Therefore, the thing you are looking for (using USB storage for reliable storage AND a RAID(-like) solution under TrueNAS) is not really possible with TrueNAS.
It is, however, most likely possible with other solutions.
That is why I recommended that you get an old Windows system and try to use it as your backup.
However, it might not be reliable enough for your purpose.
You can also try any Linux distribution with mdadm. (I have no real experience with that on USB drives, though.)

My last recommendation is Unraid.

I understand that, and I think you missed my point. If using a ZFS pool built on USB HDDs is your best solution for backing up your data in a recoverable manner, and you understand the limitations, then you should use that solution.

While it has not been explicitly stated here, the best way to recover data from a lost ZFS pool is to have a replicated snapshot of the pool. Why is this the best recovery mechanism? Because it preserves the data and the metadata (ownership, permissions, ACLs, etc.) of the original data, and because it is easy and relatively fast to recover (a simple reverse replication). If you have a set of snapshots, then you also have a limited history of the data, which may be critical if the data was lost due to a malicious attack.

AAAAAAND… we are completely off-topic here.

If you cannot add to solutions for

HOW TO BACKUP TRUENAS ZFS-POOL DATA TO MULTIPLE USB DRIVES

then please just stop. Thank you for trying to help, but repeating “that’s not how to do it!” isn’t helping (although adding that ZFS to ZFS backups has nice features is a good reminder of things we would lose by doing it the requested way). :wink:

There are plenty of threads for “How to optimally, by the book, properly, and socially acceptably backup ZFS to ZFS.” We (at least OP and I) want to create a cold-storage BACKUP of a TrueNAS ZFS pool/dataset to a large set of USB HDDs. period.
