Let me tell you a story to make sure I understand something.
I created a pool, and this pool of course started with a single dataset. I was happy with this. I set up a single SMB share against this single dataset and configured it as a multi-user time machine. (“Time Machine” is the name of the backup software that runs on the other computers.)
I created a user A. I performed a backup of computer A. I looked at the dataset, and sure enough it had grown by a plausible amount.
I created user B. I performed a backup of computer B. I looked at the dataset, and sure enough it had grown by a plausible amount.
However, there was now also a child dataset called B. I didn’t explicitly ask for this dataset. Its existence was not the end of the world, but it made me think I must have made some kind of mistake that I had better understand before risking making things worse.
After sniffing around, I discovered I had given user A a home directory and I had specified a “home directory” for user B of /nonexistent. Why did I do something different for A and B? Who knows?
I deleted the child dataset and gave user B a home directory alongside that of user A. I then started another backup of computer B. It hasn’t yet finished, but I can see its files accumulating in the home directory and there is no child dataset.
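For anyone who wants to repeat the "is there a child dataset?" check without clicking through the UI, here is a minimal sketch of how it could be scripted. It assumes the standard `zfs list -H -o name -r <dataset>` command is available on the NAS. The dataset name `tank/tm` is a made-up placeholder, not my real layout, and the helper names are my own invention.

```python
import subprocess


def child_datasets(listing: str, parent: str) -> list[str]:
    """Return the dataset names in `listing` that sit beneath `parent`.

    `listing` is expected to be the output of `zfs list -H -o name -r <parent>`,
    i.e. one dataset name per line with no header row.
    """
    prefix = parent.rstrip("/") + "/"
    names = [line.strip() for line in listing.splitlines() if line.strip()]
    # The parent itself appears in a recursive listing, so keep only names below it.
    return [name for name in names if name.startswith(prefix)]


def list_children(parent: str) -> list[str]:
    """Run the zfs CLI and report child datasets (assumes `zfs` is on PATH)."""
    result = subprocess.run(
        ["zfs", "list", "-H", "-o", "name", "-r", parent],
        check=True,
        capture_output=True,
        text=True,
    )
    return child_datasets(result.stdout, parent)


if __name__ == "__main__":
    # "tank/tm" is a hypothetical dataset name used only for illustration.
    print(list_children("tank/tm"))
```

An empty list would mean everything is landing in the one big dataset. Any names that come back are child datasets that got created along the way.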
It seems to me likely that the child dataset for B got created in the absence of a home directory for B. And that makes a certain amount of sense, I guess, given that I asked for a multi-user time machine.
My “mistake”, if you can call it that, was in giving A a home directory. Had I not done that, I imagine both A and B would have gotten child datasets. Not only would I have been able to manipulate them separately at the dataset level, but seeing a dataset per user would have led me to assume everything was working as expected.
Now, in terms of datasets, I have a single big soup, which is what I originally intended because I didn’t know any better, so I’m not annoyed.
But it occurs to me that in future I might want to do things to each user’s data at the dataset level. I mean, in practice, probably not — the users are just me and my wife. But in theory the big soup I have right now may not have been the best way to go.
Make sense so far? If so, what documentation can I consult to better understand the relationship between datasets, home directories, and directories in general?