Do I need to enable snapshots before transferring any data?

Hi Experts

I'm not sure whether this is a valid question, but after setting up the dataset and share, do I need to set up snapshots first, before I start moving any files to the NAS?

Say, for example:

  1. Server setup is ready. Without any snapshot having been created, I copy data files to the shares at 10 am.
  2. Afterwards, I set up a snapshot task on an hourly schedule, without enabling the option to take empty snapshots.
  3. One hour later, the first snapshot is created. If no files have been updated, added, or deleted, will this one be zero size?
  4. Then, 1.5 hours later, ransomware attacks and locks the files.

In that case, if I go to the snapshot view and restore the snapshot taken at step 3, will I get the previous image back? I am a bit puzzled, since the data was already there before the first snapshot was created.

When you create a snapshot, think of it as ZFS taking a notepad and listing all the files inside the dataset at that precise moment.
Then, after a few hours, you take another snapshot of the same dataset: ZFS moves to a new page and writes down any new or modified file, and for everything that wasn’t modified it writes “see previous page”.

If you revert back to the first page of ZFS’s notebook, ZFS will tear out any page that was written after the page you want to return to.
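
To make the notebook picture concrete, here is a minimal toy sketch in Python. It is only an illustration of the behaviour described above, not how ZFS stores anything (real ZFS tracks copy-on-write blocks, not copies of a file list). The class `ToyDataset`, the file name, and the snapshot names and times are all made up for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDataset:
    """Toy stand-in for a dataset; real ZFS tracks blocks, not dict copies."""
    files: dict = field(default_factory=dict)
    snapshots: list = field(default_factory=list)  # ordered (name, frozen files)

    def snapshot(self, name):
        # A page lists the whole dataset as it is right now, including
        # files that were written before any snapshot existed.
        self.snapshots.append((name, dict(self.files)))

    def rollback(self, name):
        # Find the requested page, restore it, and tear out every later page.
        names = [n for n, _ in self.snapshots]
        index = names.index(name)  # raises ValueError for an unknown snapshot
        self.files = dict(self.snapshots[index][1])
        del self.snapshots[index + 1:]


nas = ToyDataset()
nas.files["report.docx"] = "original contents"  # 10:00, copied before any snapshot
nas.snapshot("hourly-11:00")                     # first snapshot
nas.files["report.docx"] = "ENCRYPTED"           # 12:30, ransomware
nas.snapshot("hourly-13:00")                     # later snapshot of the damaged state

nas.rollback("hourly-11:00")
print(nas.files["report.docx"])                  # original contents
print([n for n, _ in nas.snapshots])             # ['hourly-11:00']
```

In this toy model the 11:00 snapshot holds the file even though it was copied before the snapshot task existed, and rolling back to it discards the later 13:00 page.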

Now, if you set your snapshot schedule without enabling “create empty snapshot” or similar (can’t quite remember the name right now), ZFS will not write a new page in its notebook unless something has been modified (or created anew).

Now, let’s assume you set up an hourly snapshot task and you set each snapshot to last only 1 hour; you also do not enable taking empty snapshots (or similar, cannot remember the exact name right now).
If you already have data on the dataset, the first snapshot will be taken; if you have no data, the first snapshot will not be taken.
After an hour, the snapshot task will run a second time: if you have modified any data (edited a file, written a new file), a second snapshot will be taken; if you have not modified any data, the second snapshot will not be taken. Either way, after an hour the first snapshot will expire, leaving you:

  • With only snapshot #2 if you modified any data between the times the two snapshots are scheduled to run
  • Without any snapshot if you did not modify any data between the times the two snapshots are scheduled to run, leaving you vulnerable to ransomware.
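
The two outcomes above can be checked with a small Python sketch. This is a simplified model of the schedule as I described it, not the actual scheduler logic; the function `surviving_snapshots` and its parameters are hypothetical names for the example.

```python
def surviving_snapshots(changed, lifetime_hours, allow_empty):
    """Return the run hours whose snapshots still exist right after the last run.

    changed[h] is True if data was written since the previous run
    (for hour 0: if the dataset holds any data at all).
    """
    # A run produces a snapshot if something changed or empty ones are allowed.
    taken = [h for h, dirty in enumerate(changed) if dirty or allow_empty]
    now = len(changed) - 1
    # A snapshot taken at hour h expires once it is lifetime_hours old.
    return [h for h in taken if now - h < lifetime_hours]


# Hour 0 is snapshot #1, hour 1 is snapshot #2; each snapshot lasts 1 hour.
print(surviving_snapshots([True, True], 1, allow_empty=False))   # [1]
print(surviving_snapshots([True, False], 1, allow_empty=False))  # []
print(surviving_snapshots([True, False], 1, allow_empty=True))   # [1]
```

The second call is the dangerous case: nothing changed, so no second snapshot was taken, and the first one has already expired.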

That’s the reason we have that “take empty snapshot” option: when we want to always have a way to get back, we have to schedule our snapshot with at least one task allowed to take “empty” ones.
Usually when scheduling snapshots you layer them on top of each other, meaning:

  • you create hourly snapshots that last a day
  • you create daily snapshots that last a week
  • you create weekly snapshots that last a month
  • you create monthly snapshots that last a year (or a shorter period; the point is that the lifetime has to be greater than the period the snapshot covers)

In such a setup, you likely don’t need your hourly snapshots to be taken if empty; on the flip side, you likely want everything above the daily snapshots, and sometimes the daily ones too, to be taken even if empty, so that you can revert to them at any point before they expire.
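
As a rough sketch, the layered schedule can be written down as data and sanity-checked in Python. The `Tier` class, the `check` function, and the month and year lengths are assumptions made for illustration; this is not a TrueNAS configuration format, and enabling empty snapshots on the daily tier is just one possible choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    interval_hours: int   # how often the task runs
    lifetime_hours: int   # how long each snapshot is kept
    allow_empty: bool     # take a snapshot even if nothing changed


DAY, WEEK = 24, 24 * 7
MONTH, YEAR = 30 * DAY, 365 * DAY  # rough lengths, assumed for illustration

tiers = [
    Tier("hourly", 1, DAY, allow_empty=False),
    Tier("daily", DAY, WEEK, allow_empty=True),
    Tier("weekly", WEEK, MONTH, allow_empty=True),
    Tier("monthly", MONTH, YEAR, allow_empty=True),
]


def check(tiers):
    """Return a list of problems found in a layered schedule."""
    problems = []
    for tier in tiers:
        # A snapshot must outlive the gap until the next run of its own task.
        if tier.lifetime_hours <= tier.interval_hours:
            problems.append(f"{tier.name}: lifetime must exceed interval")
    # If no tier takes empty snapshots, a quiet period can leave
    # the dataset with no snapshot at all.
    if not any(t.allow_empty for t in tiers):
        problems.append("no tier takes empty snapshots")
    return problems


print(check(tiers))  # []
```

An hourly task whose snapshots last only one hour and never takes empty snapshots, as in the earlier example, would be flagged on both counts.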

In this kind of situation[1], if you are hit by ransomware, you might lose just a single hour of work (modified or new files).
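
A small Python sketch of that arithmetic, assuming files changed every hour so that every hourly run actually produced a snapshot. The date, the times, and the helper `restore_point` are hypothetical.

```python
from datetime import datetime, timedelta


def restore_point(snapshot_times, attack_time):
    """Return the newest snapshot taken strictly before the attack, or None."""
    clean = [t for t in snapshot_times if t < attack_time]
    return max(clean) if clean else None


start = datetime(2024, 1, 1, 10, 0)  # hypothetical date
hourly = [start + timedelta(hours=h) for h in range(1, 6)]  # 11:00 to 15:00
attack = datetime(2024, 1, 1, 12, 30)

point = restore_point(hourly, attack)
lost = attack - point
print(point.time(), lost)  # 12:00:00 0:30:00
```

With hourly snapshots the work lost is whatever was written since the last run, so it stays under one hour.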

  1. meaning my example, not yours. ↩︎


Thank you so much, Davvo. It seems to me that empty snapshots are important: even when nothing changes over time, taking them at a more frequent interval still lets me go back in time.
