Storage usage almost at 100%, what’s going on here?

Hello,

I only have two VMs, and one of them is Nextcloud where I’ve stored 600GB of data. Now I created snapshots for the entire pool, and suddenly my storage usage jumped to almost 100%, even though it was around 60% or so before. I don’t remember the exact number anymore.

What happened here? Is my entire TrueNAS installation now unusable because the storage is completely full? It’s also not possible to shrink a zvol.

To me, it looks like all the data was duplicated 1:1 during the snapshot process. For example, the 982.05 GiB occupied by Nextcloud-Data seems to have been stored somewhere again, resulting in 2x 982.05 GiB being used. But I thought a snapshot would use the existing data of a zvol and only save which data had changed?

Could this be related to encryption? On my old TrueNAS installation, I only encrypted the zvol directly and not the dataset.

Please help me. :frowning: I don’t know how to fix this. The Nextcloud data is there now, and of course, I can’t create a new zvol with less storage. I would prefer to allocate only 1 TiB to Nextcloud-Data… I gave it too much space during the setup. But even then, if snapshots are created again, my entire system would become full and broken again… I just don’t understand.

Thanks and best regards

Old TrueNAS installation: (screenshot)

To be honest with you, I don’t know much about TrueNAS and VMs.
I believe that TrueNAS is a great NAS and not so great a hypervisor on top of that.
So these are just my guesses based on your pictures:

Snapshots should initially not use any storage at all.
The Data snapshot only uses 507 MB of data.
My guess is that this was just a coincidence. Storage numbers in TrueNAS are not updated right away, in my experience.

Nextcloud-Data is a 3 TB thick-provisioned zvol.
So it will use 3 TB of storage even though it is mostly empty inside (2.2 TB free).
I don’t think you can shrink a thick-provisioned disk.
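
If you want to double-check what the zvol reserves versus what is actually written, something like this should show it (dataset path guessed from your screenshots):

zfs get volsize,refreservation,used,logicalused Pool/VMs/Nextcloud/Nextcloud-Data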

Two additional tips I can give you:

  • Don’t bother with encryption. It makes your life way harder and does not really help, since the passphrase is stored on the system itself. Unless you are willing to unlock datasets by hand at every boot, encryption does not make a whole lot of sense.
  • Don’t use HDDs for VMs. Don’t use RAIDZ for VMs, but mirrors instead. Only use zvols (block storage) if you absolutely have to. Datasets have far fewer drawbacks.

As a universal statement, this is wrong.

It is correct, on the other hand, if the scenario in mind is the theft of the entire machine from a residential home. So I assume this is what you had in mind.

But there are at least 2 other scenarios, which I can think of immediately. And both are way more relevant for enterprise storage, where machines are located inside a heavily protected data center. That is what TrueNAS is made for.

Scenario 1: Disk failure and replacement by the vendor. In many jurisdictions and/or industries disks contain sensitive or even secret data that under no circumstances may leave the data center. Combine that with pre-emptive replacement or human error and you have a real issue. Disks would need to be physically destroyed and could not be refurbished. Warranty would be useless. Would you be willing to spend 1k+ USD/EUR per day on new disks as the data center operator?

Scenario 2: The same underlying challenge (data must not leave) when machines are replaced at the end of their planned lifetime. You cannot sell those drives but must have them destroyed by a certified company. I know a company that just had hundreds of TBs of high-end SSDs destroyed, because they were not allowed to sell or donate them, due to lack of encryption.

I don’t want this reply to be understood as bashing, but rather as encouragement to provide context and to think beyond one’s own situation.

Thanks!

2 Likes

@FalseNAS Please post the output of

zfs get -t volume refreservation

Please copy and paste text inside of code tags like in my posting here and do not use an image. Thank you.

1 Like

No offense taken :slight_smile: And I agree with your points.
I just assumed that OP is using it in a home lab context.

Homelabbers tend to like encryption because they are scared for their ISOs.
In reality, the only real advantage they get is that they don’t have to wipe the disks before selling or trashing them.

1 Like
root@truenas[~]# zfs get -t volume refreservation
NAME                                       PROPERTY        VALUE      SOURCE
Backup/Nextcloud-System                    refreservation  none       default
Pool/VMs/DNS/DNS-Data                      refreservation  32.5G      local
Pool/VMs/Mail/Mail-Data                    refreservation  65.0G      local
Pool/VMs/Nextcloud/Nextcloud-Data          refreservation  2.03T      local
Pool/VMs/Nextcloud/Nextcloud-System        refreservation  130G       local
Pool/VMs/Paperless-ngx/Paperless-ngx-Data  refreservation  65.0G      local
root@truenas[~]#

Try

zfs inherit -S refreservation Pool/VMs/Nextcloud/Nextcloud-Data

to turn the zvol from a preallocated into a sparse one.
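
Afterwards you can verify that the reservation is gone with something like

zfs get refreservation,volsize Pool/VMs/Nextcloud/Nextcloud-Data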

Find some way to monitor your actual pool usage. When you fill the pool to 100% you will lose data.
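
For example, from the shell:

zpool list
zfs list -o space -r Pool

The first command shows overall pool capacity, the second breaks usage down per dataset, including space used by snapshots and by refreservations.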

Why are you keeping data of applications like Nextcloud or Paperless-ngx in zvols (opaque, large, monolithic chunks of data) instead of datasets, which are much easier to manage in terms of snapshots, backup, restore, space management, …?

2 Likes

From where/what would it inherit? There is no zvol further up the hierarchy. The root dataset of the pool can never be a zvol, either. (I never played with zvols or reservations, so I’m genuinely curious.)

inherit -S effectively means clear the property.
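
Since the property only has a local value here (no received one), the end result should be the same as clearing it explicitly with

zfs set refreservation=none Pool/VMs/Nextcloud/Nextcloud-Data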

2 Likes

Okay, so I’m supposed to convert the volume into a sparse volume, which will make TrueNAS no longer see the reserved 3 TB as used storage. But I still have one question: Where do the 3 TB come from? I only assigned 2 TB to the ZVOL when creating it. So 2 TB should be reserved and marked as used by TrueNAS. The volume size is also 2 TB. But then there’s something about 2.99 TB and 2.24 TB. I’d like to understand that…

And I created ZVOLs because I run all the services in a VM, and that’s how it’s supposed to stay. ZVOLs are required in such cases.

If you add up the refreservations of all the volumes, you end up with more or less the 2.24 T you see. The difference to the 2.99 T is probably your snapshots.
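
Back-of-the-envelope, from your zfs get output:

2.03 T + 130 G + 65 G + 65 G + 32.5 G ≈ 2.3 T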

Try

zfs list -t snapshot -r Pool/VMs
1 Like

You forget that ZFS is Copy-on-Write: It needs space, and retains space, for every write operation, and even every change, including deletions. And block storage in a zvol is less efficient than file storage in a regular dataset.

Maybe you should consider moving data to datasets mounted into the VM, and keeping the zvol as small as possible.
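
As a rough sketch of that approach, assuming you would export the dataset via NFS (names and mount points are made up; SMB or another share type works just as well):

zfs create Pool/VMs/Nextcloud/nextcloud-files
# share it via NFS in the TrueNAS UI, then inside the VM:
mount -t nfs <truenas-ip>:/mnt/Pool/VMs/Nextcloud/nextcloud-files /mnt/nextcloud-data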

2 Likes

Could you maybe explain that to me briefly in German? I didn’t quite understand the calculation, and I would like to understand the numbers there better…

root@truenas[~]# zfs list -t snapshot -r Pool/VMs
NAME                                                              USED  AVAIL  REFER  MOUNTPOINT
Pool/VMs@auto-2025-01-21_04-00                                     72K      -  2.57G  -
Pool/VMs/DNS@auto-2025-01-21_04-00                                  0B      -    96K  -
Pool/VMs/DNS/DNS-Data@auto-2025-01-21_04-00                       194M      -  2.02G  -
Pool/VMs/Mail@auto-2025-01-21_04-00                                 0B      -   192K  -
Pool/VMs/Mail/Mail-Data@auto-2025-01-21_04-00                       0B      -    88K  -
Pool/VMs/Nextcloud@auto-2025-01-21_04-00                           80K      -   192K  -
Pool/VMs/Nextcloud/Nextcloud-Data@auto-2025-01-21_04-00           508M      -   981G  -
Pool/VMs/Nextcloud/Nextcloud-Data@auto-2025-01-22_01-45           331M      -   982G  -
Pool/VMs/Nextcloud/Nextcloud-System@auto-2025-01-21_04-00        3.25G      -  22.2G  -
Pool/VMs/Paperless-ngx@auto-2025-01-21_04-00                        0B      -   192K  -
Pool/VMs/Paperless-ngx/Paperless-ngx-Data@auto-2025-01-21_04-00     0B      -    88K  -
root@truenas[~]# 

I have now deleted the snapshot for the Nextcloud-Data zvol, and now I have ~900 GB more available again. So why exactly does TrueNAS do this? The snapshot was only 500 MB big.

Did you delete only one snapshot or multiple snapshots? It shows that you had two snapshots for Nextcloud-Data. Did you delete them both?

Yes, both

This can explain the “ZFS math” when it comes to snapshots.

Consider the “boxes” as “storage blocks” that consume space. Consider the “colored tags” as “pointers” that represent the blocks.

At the very end, you will notice this part:

What if you were to rip off the red, blue, and green sticker sets? How much room would free up?

Let’s do the math.

Ripping off the red sticker set would supposedly free up two boxes.

Ripping off the blue sticker set would supposedly free up one box.

Ripping off the green sticker set would supposedly free up one box.

2 + 1 + 1 = 4

Therefore, ripping off the red, blue, and green sticker sets will obviously free up four boxes.

Let’s try it now!

What’s this?! IT FREED UP TWELVE BOXES?


The supposed “used space” of each snapshot is small, because multiple snapshots overlap with the same blocks. That’s why one of your snapshots will say it consumes “300 MiB”, and another will say “500 MiB”. If you remove one or the other, then yes, you will free up only 300 MiB or 500 MiB, because only that snapshot’s unique blocks will be freed from the pool.

If two or more snapshots reference the same “deleted” blocks, overlapping to a large degree, then you will “surprisingly” free up a lot of space on the pool by deleting all of those snapshots together.
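
For next time: you can check in advance how much deleting a whole range of snapshots would free up with a dry run, which reports the reclaimable space without actually destroying anything (snapshot names taken from your listing above):

zfs destroy -nv Pool/VMs/Nextcloud/Nextcloud-Data@auto-2025-01-21_04-00%auto-2025-01-22_01-45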

Edit: I had already written this post, but I guess you already posted the explanation for it, right? I will read it. @winnielinnie Hopefully it really is for Dummies like me.

I checked my old TrueNAS installation. There I used a sparse zvol. But I want to understand why there isn’t so much space consumption after I create a snapshot of a sparse zvol. I guess it’s because a sparse zvol is only as big as is currently needed, whereas when it’s not a sparse zvol, there is one big block of 1000 GiB. Right? But why does TrueNAS then add another 1000 GiB?

I thought that TrueNAS would use the existing volume as a reference and only document the changes in a snapshot. But why does it seem like TrueNAS is storing the 1000 GB of data somewhere else and using that as a reference?

I’m sorry if I’m asking such silly questions. I’m just finding it very difficult to understand right now. I’m sorry.

Now I’ve switched the zvol to Sparse, but the Volume Size is still showing as 2 TiB. I’d like to get rid of that. Resizing isn’t an option. So I tried moving the zvol to a new one. However, that didn’t work due to encryption. I kept getting the error:

“cannot receive new filesystem stream: zfs receive -F cannot be used to destroy an encrypted filesystem or overwrite an unencrypted one with an encrypted one”

This happened every time I tried copying the data or snapshot to a new zvol with a 1 TiB volume.

As a workaround, I’ve temporarily moved the data to another pool. From there, I’m now copying it into the new zvol nextcloud-data2 under the Nextcloud dataset. But I suspect that it will also copy the zvol details along with it, meaning I’ll end up with the same zvol with a 2 TiB volume again…

Does anyone have a solution? I’d really prefer not to completely set up the Nextcloud VM again. It was also a bad decision on my part to set the volume size to 2 TiB—I didn’t think it through. 1 TiB would be more than enough for the next few years, at least until 8 TB SSDs become affordable. Ugh…

Edit: Oh well, maybe my idea was good after all. So far there is no sign of a 2 TiB volume size on the zvol I copied from the Backup pool to the new Nextcloud-Data2 zvol. I really hope I have not destroyed any data…