Trying to figure out the best strategy for Dataset and home folder

stevehammond · April 28, 2024, 4:50pm

Hello all,

I’ve been posting a bit recently in this forum, and I’ve had great help so far. As I mentioned, I’m coming from a Synology DiskStation, and I’m trying to figure out the best way to implement datasets, which have no equivalent on Synology. The storage pool is equivalent to Synology’s volumes, and shares are shares, but nothing compares to datasets… So before jumping in, I want to make sure I adopt the right strategy. My TrueNAS server will be for data sharing, sure, but also for running apps from TrueNAS or TrueCharts.

1- I saw a video tutorial, can’t remember which one, but the person suggested creating a dataset specifically for apps storage, as it can easily be shared with the apps group and may prevent issues. That makes sense to me, so I already plan one for this.

2- I think the same video suggested that, when installing an app, you put the app configs on a different dataset than the one created for apps (ix-applications), arguing that transferring apps to a new system may be easier. I’m really not sure about that one.

3- On Synology, all users had a home folder. Not sure if I really need that, as I may change my storage strategy on TrueNAS, but still, if I decide to go that way, are there any concerns or suggestions regarding datasets?

4- Also, if users don’t have home folders and I SSH into the account, where are the default SSH keys and login scripts (.profile or similar, depending on the shell) stored?

So far, I was planning the following:

A dataset for apps, with multiple folders as below:

A Media folder, where I will put my videos and music
A Picture folder for my images and photos
I may or may not dedicate a folder for apps preferences

A second dataset for data archival

Home folder will be here if I decide to do so
I’ll probably create folders for each type of archival I want to do

In all of that, each folder will be a share by itself, so it should be easy to manage access depending on the needs.

I’m sure I’ll get really good advice here again, so thank you in advance for your help.

Loransea · May 12, 2024, 9:15pm

Hi Steve,
I have the same issue/question concerning datasets.
I think I’ve watched the same video.
I tried, as this blog suggested, to have a separate dataset for each of users A, B and C, one public dataset and one for home users’ media, but ended up with invisible child datasets under the public dataset.

Hence I’ve deleted my pool and started again from scratch, but I’m still wondering what the best solution would be for my NAS usage:
1 torrenting safely
2 data storage for family members (music, photos, videos and documents), accessible from the local network and from school or work, with different ACL permissions depending on the sensitivity of the data

I’d appreciate any suggestions. I’ve attached my ideas.

2x10TB mirror on a Lenovo M83 (Core i3) with SSD boot and 8 GiB RAM, running SCALE 23.10

Loransea · June 4, 2024, 5:22pm

Hi Steve,

Our issues don’t seem to rally many people.

Did you find the answers you were looking for?

I’ve set up both designs.

Comb-like pros: datasets are unique «containers».
Comb-like cons: anyone accessing the network will see all datasets. Am I right?

Tree-like pros: personal datasets are invisible to users who do not have access to the parent dataset.
Tree-like cons: matryoshka-doll-style datasets. Would child datasets still be accessible if the parent fails?

I’m also wondering who would/should be the owner of each dataset?

stevehammond · June 7, 2024, 1:58am

Hi Loransea

You’re right, it is not a subject that fascinates the planet, for sure. I tried making a few datasets like I proposed in my first message, and ran into some issues that I could not resolve. I don’t remember exactly what. I’ve talked to some friends who said there is not much benefit to having multiple datasets. I finally ended up creating only one dataset for all my data, and moved everything there for now. I’m not saying I’ll never try multiple datasets for my data, but for now, it fits the bill.

I only created another set of datasets as I wanted to install NextCloud and that is what was suggested in the tutorial.

Stux · June 7, 2024, 2:21am

Sharing datasets that have children leads to issues.

Sharing datasets that are children does not.

I use generic datasets to collect other datasets, but I share the childless ones.

I.e., I might have a bunch of docker-related datasets in a ‘docker’ dataset. And then I might recursively snapshot the docker dataset…
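
For anyone who hasn’t taken a recursive snapshot from the command line, here is a minimal sketch of what that could look like. The pool/dataset name tank/docker and the snapshot name are made up for illustration, and the script only prints the command unless you set RUN=1 on a machine that actually has ZFS (where it will most likely need root).

```shell
#!/bin/sh
# Sketch only: "tank/docker" is a hypothetical pool/dataset name.
parent="tank/docker"
snap="manual-$(date +%Y%m%d)"

# -r snapshots the named dataset and every dataset beneath it in one step.
cmd="zfs snapshot -r ${parent}@${snap}"

# Dry run by default: print the command instead of running it.
# Set RUN=1 on a real ZFS host to execute it.
if [ "${RUN:-0}" = "1" ]; then
    $cmd
else
    echo "$cmd"
fi
```

The point of the parent dataset here is only grouping: one command covers all the children, while the shares still point at the childless datasets.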

sfatula · June 7, 2024, 5:35am

There are endless strategies for datasets, no right answer. For me, I basically have 3 datasets:

DATA
ARCHIVE
NOBACKUP

Underneath those are directories (not datasets), and you can share those directories. I organized it this way to make backups trivial, as backups are the most important part of the NAS to me, other than the data. NOBACKUP holds a good part of what I have, and I don’t snapshot or back it up at all; it’s things like transcoding files, temp files, HandBrake work space, etc. ARCHIVE means stuff that rarely changes once written. DATA means stuff that changes a lot. Actually, I have a virtual machine pool also, on SSD.

I found backups were much more complicated otherwise, as I was backing up stuff I really didn’t want to, etc. With 21 apps now, it’s just too complicated to organize by application, for me at least. Simple example: Emby. I have recordings from a TV antenna that I do not want to back up and that change daily. I have a library of permanent content. And then there is the Emby database and other more dynamic stuff. So, I use all 3 datasets: I don’t back up TV recordings since I could find a way to get them again, I back up the media library using one method, and I definitely snapshot and otherwise back up the Emby database and related data. That’s 3 different backup (or lack thereof) techniques, which would be very hard to do if simply organized by app.

Note I am generally different than most here on the ways I do things. So, it may or may not make much sense to anyone else.

Bajan · November 28, 2024, 4:20am

@sfatula
How do you create ‘directories’ (not datasets) with TrueNAS Scale, under your datasets ?
DATA
ARCHIVE
NOBACKUP

SmallBarky · November 28, 2024, 4:30am

You go into the dataset and run the ‘mkdir’ command. You can do an internet search for the options or run ‘man mkdir’ on the command line in BSD or Linux
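
A minimal sketch of that, in case it helps. On TrueNAS the dataset would normally be mounted somewhere under /mnt/<pool>/; here a temporary directory stands in for it so the commands can be tried safely anywhere, and the folder names are only examples.

```shell
#!/bin/sh
set -eu
# Assumption: on a real system this would be the dataset's mountpoint,
# e.g. /mnt/<pool>/DATA. A temporary directory stands in for it here.
dataset="${DATASET:-$(mktemp -d)}"

cd "$dataset"
mkdir media                 # one plain directory
mkdir -p documents/taxes    # -p also creates any missing parent directories
ls -ld media documents/taxes
```

Set DATASET to a real mountpoint to run it against an actual dataset; the new directories will be owned by whichever user runs the commands.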

sfatula · November 28, 2024, 6:53am

SmallBarky is correct: I use the command line, which is definitely not for everyone. However, to me, it makes things so much simpler, both for backup reasons as noted and because I don’t have the problems thousands of others do (well, judging by the posts) with ACLs (and related bugs at times) or other such common techniques. But nothing would stop you from putting datasets all over the place instead of my subdirectories, if that’s what you wanted to do. Then you have to deal with lots of shares and the problems that may cause. The question from the OP was what the BEST strategy is, and I don’t think there is a universal best strategy.

Note I am not suggesting this is the way. I was merely posting a way, and as I said in my post, I do things different than the mob. I just find it much simpler than other methods. But, I am also a retired admin amongst many other jobs.

Bajan · December 14, 2024, 6:53pm

So, if those are directories made with mkdir in the Terminal (instead of additional datasets), I would guess that making the directories while connected to the SMB share in Windows would be the same, provided I use the same user (with admin/all permissions) that I log into TrueNAS Scale with and use on my single-share (me only) SMB connections? I understand that which user I am signed in as for SMB could make a difference as far as owner and group are concerned, compared to the same action done by the user logged in on TrueNAS when using the Terminal to create the same directories…?

sfatula · December 14, 2024, 10:00pm

I’m not sure what you are trying to accomplish. Yes, if I make directories from the CLI, they’re owned by whatever user and group I am using. Of course, I can chown and chmod them and make them whatever I want. If I made a share for each hostpath, I’d have something like 70 shares and datasets. I don’t want that, what a pain! I do have several different users; security is merely controlled by chown and chmod (if needed), not ACLs. It’s the old simplistic method.

If you create a dir via SMB, it’s based on your permissions and logged in user, yes. If you create a dir via SMB into one of my subdirs, it’s still controlled by the same.

My use is much simpler as it’s a home based system, so, only a few users needed and no ACLs makes it even simpler. When I add a new app, I create the subdirs where I want them, make the ownerships what the app needs, and it works, every time.
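
Roughly, that per-app routine could look like the sketch below. The path emby/config and the owner are placeholders, not anyone’s actual setup: on a real NAS the owner would be the UID:GID the app runs as (and changing ownership to another user needs root), whereas here it defaults to the current user so the script runs anywhere.

```shell
#!/bin/sh
set -eu
# Stand-in for a directory under a dataset, e.g. /mnt/<pool>/DATA (hypothetical).
base="${BASE:-$(mktemp -d)}"
app_dir="$base/emby/config"           # hypothetical app subdirectory

# Hypothetical owner: defaults to the current user and group.
# On a real NAS, set APP_OWNER to the UID:GID the app runs as.
owner="${APP_OWNER:-$(id -u):$(id -g)}"

mkdir -p "$app_dir"
chown -R "$owner" "$base/emby"        # changing to another user needs root
chmod -R u=rwX,g=rX,o= "$base/emby"   # owner read/write, group read-only, others nothing
ls -ld "$app_dir"
```

The capital X only adds the execute bit to directories (and to files that are already executable), so directories stay traversable without marking every file executable.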

allanonmage · May 13, 2025, 1:22am

A few things I’ve realized when looking into datasets vs folders (aka directories).

The two are very similar in concept, but very different on the backend.
They both work like folders in that they can exist on the same storage.
They differ in that changing things later may be tricky, or not work at all. For example, I came across a redditor asking how to downgrade a dataset into a folder and apparently that can’t be done.
Copying files from 1 dataset to another is not like copying files from 1 folder to another; it’s like copying from 1 drive to another.
TrueNAS can’t set permissions on shares or folders; it has to handle complex permissions on datasets, or by having multiple datasets. Having a user that has read access to everything within a dataset, and write access to only a single folder in that same dataset, is too complex. That single folder with the write access must be a new dataset, not just a folder. This was the problem I had to solve. It’s trivial to set up file sharing like this in Windows, nigh impossible in Linux, and can’t be done via the TrueNAS GUI.
The datasets will have to have their own unique files, taking up their own space, which makes some of the graphs above space hogs, depending on what files actually go where.
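
On the “like copying from 1 drive to another” point: as far as I understand, each dataset is mounted as its own filesystem, so a move between two datasets has to copy the data and delete the source rather than just rename it. A rough way to check whether two paths sit on the same filesystem is to compare their device IDs. The sketch below uses two folders in a temporary directory as stand-ins; same_fs is a helper name made up for this example.

```shell
#!/bin/sh
# same_fs A B: succeeds when both paths live on the same filesystem.
# Uses `stat -c %d` (device ID) as found in GNU coreutils and busybox;
# BSD stat spells this option differently.
same_fs() {
    [ "$(stat -c %d "$1")" = "$(stat -c %d "$2")" ]
}

# Stand-ins: two folders inside one temporary directory. On a NAS you would
# pass real paths such as /mnt/<pool>/DATA and /mnt/<pool>/ARCHIVE instead.
root="$(mktemp -d)"
mkdir "$root/a" "$root/b"

if same_fs "$root/a" "$root/b"; then
    echo "same filesystem: mv is a quick rename"
else
    echo "different filesystems: mv has to copy the data, then delete the source"
fi
```

Two plain folders inside one dataset should report the same filesystem, while two separate datasets should not.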