TrueNAS Scale 22.12.3 Freezes System After Attempting to Delete

zfs get all | grep dedup should give you the results for this.
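If it helps, the same information can be pulled per dataset rather than grepping the full property dump - Pool1 here is just the pool name mentioned later in the thread:

```
# Show the dedup property for every dataset under the pool
zfs get -r dedup Pool1
```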

Deleting the deduped data from the dataset will result in a huge stall as your dedup devices try to purge that ~50GB of small, randomly written data on the HDDs.
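If you want a rough idea of how big that purge will be before kicking it off, the DDT summary is worth a look first - a minimal sketch, again assuming the pool is named Pool1:

```
# Summarize the deduplication table: entry count plus on-disk and in-core size
zpool status -D Pool1
```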

And unfortunately, you aren’t able to remove the dedup vdev - it’s there for good. The only way to get rid of it entirely is to export/destroy the entire pool - hence the statement about shuffling around the cache drives to let you replace the HDDs with SSDs.

Yes - everything I described can be done from the TrueNAS web UI. Since you mentioned you have lots of the 800G EP (Enterprise Performance) Samsung SAS drives, and a few 1.92T/3.84T “enterprise value” drives, here’s my suggestion of steps in order.

Assumptions: your 120G boot SSD and your 480G metadata SSD are both internal to the system, and are on SATA-only ports. Unless explicitly mentioned below, all steps should be done from the web UI; a rough CLI sketch of the same operations follows the step list, for reference.

  1. Remove one of your cache SSDs from the pool, e.g. sda, and then physically remove it from the system.
  2. Replace sda with an 800G SSD, and then select your metadata SSD sdz and choose to Extend it with the new 800G sda drive. This will turn your metadata vdev into a mirror, and solve the most pressing redundancy problem. You’ll still only have single-drive fault tolerance vs. the 3-drive fault tolerance of your main pool, but it’s something.
  3. Repeat the process of removing the second cache drive sdc from the pool and the system, but replace it with a 3.84T SSD (since they’re labeled as “value” I’m preferring the larger capacity for wear-leveling and performance reasons) - but only do this if you’ve got at least two of them.
  4. Select your dedup vdev and Extend it with the new, 3.84T SSD sdc drive - let the resilver process complete.
  5. Select one of the 1.8T HDDs in your dedup vdev and choose to Detach the drive. Once it’s been detached, physically replace it with another 3.84T SSD, and Extend the dedup vdev onto it again.
  6. Detach the last 1.8T HDD from your dedup vdev. Your dedup performance will be massively improved once this last HDD is removed, and you should now be able to delete the data without huge stalls - you will still be limited by the speed of your SSDs, but that should be significantly better than an HDD.
  7. For the last remaining slot, I’d suggest adding in another 800G SSD, and using Extend on the metadata vdev again. Then Detach and remove (at a later time) the 480G SATA SSD, as it’s likely less suited for metadata handling than the 800G SAS drives.
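For reference only, here’s a rough command-line sketch of the attach/detach operations behind those steps. The device names are the same placeholders used above, TrueNAS normally addresses pool members by partition GUID rather than raw sdX names, and the web UI handles the partitioning for you - so treat this as an illustration rather than a recipe:

```
# Step 1: remove an L2ARC (cache) device from the pool
zpool remove Pool1 sda

# Step 2: attach the new 800G SSD to the single metadata drive, making it a mirror
zpool attach Pool1 sdz sda

# Steps 4-6: attach a 3.84T SSD to an existing dedup vdev member, then detach
# the 1.8T HDDs once each resilver has finished
zpool attach Pool1 <existing-dedup-member> sdc
zpool detach Pool1 <1.8T-hdd>

# Check resilver progress between steps
zpool status Pool1
```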

Again - to be frank, I’d prefer you perform a full data offload, and rebuild the pool without deduplication in place at all. You aren’t getting a huge win from it, and it’s requiring a significant amount of additional support in terms of the special vdevs.


Thanks for that. Dedup is off everywhere else.

Well… my hope is that having my customer delete his own data after hours, when few are using the system, will take care of that, along with the removal of the dataset. If that dedup stays there because Pool1 has dedup turned on, then yeah, we’ll just have to wait.
Thanks for the plan of action. I’m going to try to do two things in parallel:

  1. migrate data off, remove Pool1, and recreate it with other drives, and
  2. find and build another TrueNAS to a similar spec.

This may take a while.
Thanks for all the input.
Jack

How about this for an idea.

Repurpose one of the existing SSDs as a third member of the dedup mirror vdev - and then drop both of the HDDs from the vdev, leaving only the SSD.

This should at least improve the dedup performance.

hmmm, yeah…that’s interesting…then I would have 2 slots open when I remove the other dedup drives, and can use them to create a 3-way metadata mirror? Or add them to metadata?

Yes, but at the cost of the single-drive dedup vdev now being a single point of failure, in addition to the metadata vdev (or instead of it, if you end up with a 3-way metadata mirror and a single-drive dedup vdev).

Strongly advise that you target a mirror for both metadata and dedup vdevs. The internal 480G SATA SSD could be repurposed as L2ARC if it’s really needed, but with 24 front bays and 19 data disks, you can pull off mirrored metadata + dedup + a single SAS log device all with higher-performance SAS drives.
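To illustrate what that target layout would look like in vdev terms if the pool were rebuilt from scratch - purely a sketch with placeholder device names, roughly matching the 3-drive fault tolerance mentioned earlier, and whether dedup and a log device belong in it at all is exactly what’s being debated here:

```
# Hypothetical rebuild: main raidz3 data vdev plus mirrored special (metadata),
# mirrored dedup, and a single SAS log device - all device names are placeholders
zpool create Pool1 \
  raidz3 sda sdb sdc sdd sde sdf sdg \
  special mirror sdh sdi \
  dedup mirror sdj sdk \
  log sdl
```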

Copying photo/video files to SMB/NFS shares does not require sync writes, so you’d be better with sync=disabled and no SLOG at all.

Edit: corrected syntax.
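For completeness, the dataset-level property looks like this from the shell - the dataset path is a placeholder, and the same setting is exposed per dataset in the web UI:

```
# Check the current value, then disable sync writes on the share's dataset
zfs get sync Pool1/share
zfs set sync=disabled Pool1/share

# Revert to the default behaviour if needed
zfs set sync=standard Pool1/share
```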

Etorix,
I assume you’re referring to the sync setting on the dataset, where it defaults to standard but you can select always or disabled.
If so, then “never sync write” would be disabled, correct?
Jack

Why do you think that making sync requests a no-op is safe for file sharing protocols (i.e. lying to clients)? The general rule of thumb is to never switch to sync=disabled.


In this case, the client systems that @JackOfAllIT is using are running macOS, which does send some sync writes over SMB, and there was a mention of switching to NFS.

As mentioned by @awalkerix, it’s generally a Bad Idea to tell the storage side to explicitly lie about the safety of a sync write - even a file-sharing workload might ship the bulk data as async, and then say “okay, now push this critical piece of metadata synchronously” - you don’t want the storage to misrepresent the safety of that latter piece.


The storage details look like something listed on an invoice. If it’s from the same person who built this system for you, perhaps it would be good to see what they actually put into the system.

Could you please post the output of lsblk -So SIZE,MODEL? It ought to produce a list of sizes and drive models.
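A couple of extra columns can also help separate SSDs from HDDs and show the transport - the column names below are standard lsblk fields:

```
# SCSI devices only, with size and model
lsblk -So SIZE,MODEL

# Same idea with transport and rotational flags (ROTA 0 usually means SSD)
lsblk -o NAME,SIZE,MODEL,TRAN,ROTA
```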

I think he said this possibly because there is a difference between database transactions and copying over a picture. If my picture copy has a problem, I can just do it again, since most picture/video copies are done manually. Database calls are usually programmatic and also much smaller.
But I don’t understand the nature of your comment. Async and sync have been options, and debated, for decades. Async is incredibly faster than sync in most situations. Outside of storage, async is what TCP/IP is based on, so it’s not bad.

So it’s hard to tell which drive is what model/type from this output, which is why I listed it in capacity form.
Thanks.


Ah, a raid controller, of course.

I’m pretty sure I just heard a BINGO somewhere in the back there.

Odd that you’ve got one drive that pushed the actual model number through (ST1800MM0008) and the remainder are RAID volumes.

But RAID controllers in general are another discouraged hardware option, and being able to migrate away from/change this “in-line” is dependent on whether or not the Cisco “RAID” card is actually wrapping them into virtual partitions/virtual disks, or just masking the model/SN with its own.
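If the Cisco card turns out to be LSI/MegaRAID-based - an assumption, it depends on the exact controller model - smartctl can sometimes query the physical drives hiding behind those RAID volumes, which would answer the model/serial question without pulling anything apart:

```
# N is the controller's index for each physical drive behind the volume;
# step through 0, 1, 2, ... until smartctl reports no more devices
smartctl -i /dev/sda -d megaraid,0
```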