The Care and Feeding of SSDs - TRIM and Charge Refresh | TrueNAS Tech Talk (T3) E049

On today’s TrueNAS Tech Talk, Chris and Kris take on some SSD truisms and myths - does old data on SSDs “go stale” and actually read slower than freshly written data? It does, and they’ll tell you how to help prevent both performance degradation and data loss on your TrueNAS SSD pool - the best part is, you’re probably already doing it! We’ll also tidy up some of the questions around TRIM, why autotrim is off by default, and where you might find some pain points. Plus, we’re looking for your insight, comments, and suggestions on how to further enhance TrueNAS to help mitigate some of the issues of the component price crisis - from the mundane to the madhouse, we want to hear your ideas - even if you want to use a USB boot device again!

4 Likes

Having a fridge insult you is probably cheaper than a wegovy prescription for weight loss.

1 Like

While true tiered storage is always at the top of the list for improvement ideas…

Here’s an idea: what if there was a way to facilitate the detachment of a special metadata VDEV without losing the pool? At least then, you could potentially repurpose your SSD drives for other applications or be able to put them into different systems entirely.

2 Likes

for hybrid systems i’m curious about the possibility of a more steered l2arc.

I primarily use TrueNAS for video editing, mass raw video storage, and other backup tasks, and I’m struggling to get a simpler high-performance setup, because right now I copy from HDD to SSD before working.

I have the following setup:
  • 4x 10TB SATA HDD (raidz1; mass storage)
  • 1x 1TB SATA SSD (stripe; editing)
  • 2x 32GB RAM ECC (4 sticks possible up to 256)
with this Upgrade Path
  • 2x 1TB SATA SSDs (Mirror): “Special VDEV.” Holds metadata + small project files.
  • 1x 2TB NVMe (PCIe Slot 1): The “L2ARC.” Cache for 8K footage of the main pool.
  • 1x 2TB NVMe (PCIe Slot 2): The “Ingest/Scratch/Apps/VMs.”

So right now I have to copy data off of my HDDs onto my SSD for speedy editing. I could reassign my current single SSD as an L2ARC, but all I read is that it’s not all that useful?!

So I would love it if I could simply designate a dataset on my HDDs to have priority in an L2ARC. So, say, 80% of my L2ARC SSD would be filled with recently accessed footage, even though the server is hit more often with small files from other datasets.

ARC in general is a bit too opaque for me, and the way I use the machine is a bit random: I have editing sprints and then weeks without editing.

2 Likes

I wonder why TrueNAS doesn’t have scheduled TRIM in its UI? I have it as a manual command to run weekly, but would be nice to have that automated ootb for new users.

3 Likes

It’s a good idea. This guy and this guy agree with you.

3 Likes

Thanks for this episode; it cleared a lot of things up for me and filled in some information about TRIM.

Some follow up questions on TRIM

  1. Does TRIMing cause drive wear?
  2. Can you confirm as a rough rule of thumb that:
    • TRIM should not be enabled for SATA SSDs unless they are of a high enough SATA spec, because otherwise they might not work as expected.
    • TRIM is safe to enable for NVMe drives
    • TRIM is not applicable for HDD/Rust/Spinning drives.
  3. Are UNMAP and DISCARD now fully supported for VMs that have TRIM internally enabled, and does this information get passed upstream for both the VirtIO and AHCI drivers?
  4. In the podcast you mention that you scheduled your TRIM to run on Sundays when things were quiet, but all I have in TrueNAS is the option of enabling Auto TRIM; there is no schedule like there is for Scrub. Is this because I have not yet enabled Auto TRIM, or is this functionality missing?
  5. Does TrueNAS monitor SSD temperature? Does TrueNAS send an email to the admin if dangerous temps are reached? Keeping an SSD cool is good for extending its lifetime.

I think that the information in this podcast should be added to the documentation, especially the TRIM information as it is lacking.

Thanks

1 Like

Not really, and definitely not if it’s done once per day or week.

Any modern SSD of a reputable brand is more likely to benefit from regular TRIMs than from none at all.

Yes.

For SMR drives only. You shouldn’t be using SMR in your NAS anyways.

Way above my pea brain. Hopefully someone else can answer.

See my post above and @tannisroot’s post. I provided links in my post.

A TRIM finishes within minutes or seconds. Usually it happens so fast that by the time you check on the status, it’s already done.

Here is my Cron Job command that runs on Sundays at 3 AM:

zpool trim ssd-pool; zpool trim boot-pool
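For anyone who wants to replicate this outside the TrueNAS Cron Job UI, an equivalent raw crontab entry would look roughly like this sketch (the pool names are just examples; substitute your own, and note that `zpool status -t` shows a TRIM’s progress):

```shell
# minute hour day-of-month month day-of-week  command
# Runs every Sunday at 03:00; replace the pool names with yours.
0 3 * * 0  zpool trim ssd-pool; zpool trim boot-pool
```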

Depends on the SSD/NVMe and which sensor it is monitoring. In my case with an older NVMe (on Core), the sensor it used was not indicative of the important temperature. (It had 2 temp sensors.) That’s why I needed to manually check my NVMe temps with the command-line. My replacement NVMe drives are fine and use the correct sensor.

2 Likes

Does the boot pool only need to be TRIMmed once after a new pool is installed? I did not think any data was written to it.

It doesn’t hurt. It’s only once a week, some people have their System Dataset on the boot-pool, and with recent versions of SCALE/CE, the syslog must write to the boot-pool, no matter where the System Dataset lives. (On Core and older versions of SCALE, we had the choice of putting the System Dataset on a separate pool and the syslog could live there too.[1] Not anymore.) :pensive_face:


  1. This meant that we used to be able to store the syslog in an encrypted space. That’s not possible anymore. (I still can, since I’m on Core.) :smirking_face: ↩︎

2 Likes

TrueNAS 26 should be interesting for you then. :wink: It’s not “fully” tiered but this kind of feature takes time and refinement. Check out last week’s episode for a bit more on that (w/timestamp):

https://www.youtube.com/watch?v=G_91SaFnpRY&t=137s

Already possible; the constraining factor is the existence of RAIDZ vdevs. If they’re present, you can’t detach any top-level vdev, and special has to play by the same rules. Hopefully we’ll be adding some more warnings about this.

Good news @tangerine - you can accomplish some of this now and will be able to do more of/a variant on it in TN26. The knobs aren’t in the webUI but you can selectively tune the L2ARC fill behavior on a dataset level with sudo zfs set secondarycache=all|metadata|none pool/dataset from the shell. :slight_smile:

This tells ZFS what should be allowed to go into L2ARC from a given dataset - all for everything, metadata for just those indices/checksums/etc, or none if you want nothing at all (not recommended) - so you could tweak this for your datasets that you want to stay out.
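To make that concrete, here is a sketch of what that per-dataset tuning might look like from the shell; the pool and dataset names (`tank/video`, `tank/backups`) are invented for illustration:

```shell
# Let both data and metadata from the video dataset into L2ARC:
sudo zfs set secondarycache=all tank/video
# Keep a busy small-file dataset down to metadata only, so it
# doesn't crowd the footage out of the cache device:
sudo zfs set secondarycache=metadata tank/backups
# Confirm the settings took effect:
zfs get secondarycache tank/video tank/backups
```

The property is inherited by child datasets, so setting it once near the top of a tree covers everything beneath it unless overridden.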

In TN26 we’ll be adding knobs on a per-dataset level to tell it “everything here should live in the special vdev” - but that might require a rewrite. If you’re talking about a periodic scan to promote/demote data autonomously … well … “stay tuned”? :wink:

2 Likes

This guy too. :wink: I’ll pester at the roadmap meeting.

3 Likes

Yes, provided the drive in question returns it through a SMART query.

I’ll work on bulking this up with our Docs team. Thanks for inspiring this podcast; I really enjoy doing the deep dives. :slight_smile:

1 Like

I enjoyed this technical deep-dive podcast; it was relevant and will help a lot of newbies.

And can you add notes about whether TRIM is passed through with the VM drivers? :grinning_face:

My bad, missed that part!

I don’t think we ever undid NAS-125642 in the backend so discard should be present. IIRC it requires sparse zvols though so it might not work if you create the disk as part of the “new VM” flow - you have to make your sparse zvol first.
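For anyone following along, creating a sparse zvol from the shell ahead of time is roughly this (the pool and volume names are made-up examples):

```shell
# -s makes the zvol sparse (thin-provisioned), which discard/UNMAP
# needs to actually reclaim space; -V sets its logical size.
zfs create -s -V 100G tank/vms/my-vm-disk
```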

Not sure about the detect-zeroes to enable this for guest OS without proper discard though. TODO: Monday :slight_smile:

1 Like

One thing that could make L2ARC more useful is an auto-populate command. Right now a file, like a video file, won’t populate into L2ARC until it is both read often enough and falls out of the L1ARC, (aka ARC / RAM).

In theory, you could simulate some of this by simply reading the file multiple times. But that will likely just put the file into ARC.

Using the secondary cache parameter per dataset does help. Yet it does not help completely.


The best tiered storage solution I have seen is SAM-FS / SAM-QFS, (when I worked at Sun Microsystems). Data would automatically move off the lower performance tier storage, (tape in our case), to higher performance disk when the file was needed.

We set up some lower use file systems with SATA drives on the front end and high capacity, (but poor seek time), tapes on the back end. We had TBs of data in these file systems, (for code history). (And yes, TBs back in 2008 was a huge amount of data…)

However, for high performance needs, we used higher RPM Fibre Channel disks on the front end, and lower capacity but fast seek time tape on the back end.

While a bit of a pain to setup SAM-FS, it worked and worked well. Even backups were weird, as you could specify multiple tape back ends so a single tape loss would not cause data loss.

2 Likes

are we talking .04 here? :speak_no_evil_monkey:

there is something interesting in that ...set secondarycache choice. why add a special vdev if i could (also) store all metadata on an l2arc? seems to me like an obvious way to avoid the problem of the metadata vdev failing while keeping the same speed advantage of the special vdev.

if that is actually doing what i am reading, why isn’t that vdev more promoted (and configurable, as you hint) to enable a higher speed?

if i could i would install a huge l2arc (4 maybe even 8TB), put a copy of all metadata onto it (with the primary copy staying on my raidz1 hdd), and also designate a 16TB media dataset as the primary beneficiary of that l2arc. and then have that ssd report an 80 to 90% fill level most days.

It is not quite that simple. L2ARC only populates on eviction from ARC, (aka L1ARC / RAM). And only if it seems useful to keep around. There are some tunables related to this, including ones that improve L2ARC functionality.

Next, L2ARC requires entries in RAM for pointers. So an 8TB L2ARC SSD might not be good if you have more limited memory, like 16GBs. Having too many L2ARC data or metadata entries simply eats up RAM. (One RAM pointer to each ZFS block stored in L2ARC…)
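As a rough back-of-the-envelope sketch of that RAM cost (the ~88-byte per-block header and the 128 KiB recordsize are assumptions; real overhead varies by OpenZFS version and block size):

```shell
#!/bin/sh
# Estimate RAM consumed by L2ARC headers for an 8 TiB cache device.
l2arc_bytes=$((8 * 1024 * 1024 * 1024 * 1024))  # 8 TiB of L2ARC
block_bytes=$((128 * 1024))                     # assume 128 KiB records
header_bytes=88                                 # assumed bytes per ARC header
blocks=$((l2arc_bytes / block_bytes))
ram_mib=$((blocks * header_bytes / 1024 / 1024))
echo "cached blocks: $blocks"
echo "header RAM: ${ram_mib} MiB"
```

Under those assumptions that works out to about 5.5 GiB of RAM just for headers; with smaller blocks (say 16 KiB zvol records) the header count, and so the RAM cost, is 8x higher, which is why a huge L2ARC on a 16GB machine can hurt more than it helps.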

A newish feature of L2ARC does allow ZFS block entries to be left in their compressed state. That saves space on the L2ARC, when the data is compressible.

Another newer feature, (several years old by now), is for L2ARC to persist across boots / pool exports. That helps, yet creates a pool import delay because the RAM pointers need to be re-created from the persistent L2ARC entries.

I’ve written something up about L2ARC. It’s not perfect, but if anyone finds something useful to add, or mistake to correct, I will certainly update the Resource.

3 Likes

boot delays don’t matter in my book. what’s 60 seconds more or less in a boot sequence that i do maybe once every 3 to 6 months (for updates)?

that l2arc only populates on arc eviction is something that sounds solvable on an openzfs level.

one can give recommendations of ram to l2arc sizes (in the interface). also putting a copy of all metadata into the l2arc could just work similar to the special vdev. why keep these pointers around in ram when you can tell the processes to always go look in there. i assume that we don’t need pointers in ram for metadata in the special vdev.

compression should be kept on in that l2arc.

it didn’t persist on persistent storage? wow

i think a lot of new people expect my “design” to be what’s actually happening in l2arc but get disappointed when it doesn’t behave this way.

like arc in general often seems a bit useless when working with video files. why is it not keeping the files inside its cache when it clearly gets scrubbed etc.? i know it tries to be smart but that behavior is why i copy stuff over from the hdd to the ssd for editing manually (side note: my mac copies it through the network–not that that is the bottleneck?!)

an empty l2arc is about as useful as empty ram. it should always be near full and persist through boots.