Allow utf8only and normalization to be set on zpool create

Problem/Justification
Ensure filenames are valid UTF-8 to avoid problems when files are accessed from different platforms. I ran into a situation where two files somehow had filenames that should have been UTF-8 but were not, and I could not select them over a file share, even by copying and pasting the filename. Enforcing UTF-8 also supports internationalisation and emoji in filenames.
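For anyone wanting to check whether they already have such files, here is a minimal Python sketch. The function names `is_valid_utf8` and `find_non_utf8_names` are made up for illustration, and the check is only a strict UTF-8 decode of the raw name bytes, not whatever validation ZFS itself performs.

```python
import os


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


def find_non_utf8_names(directory: str) -> list[bytes]:
    """List the raw entry names in `directory` that are not valid UTF-8.

    Hypothetical helper: it only looks at one directory level.
    """
    # Passing a bytes path makes os.listdir return the names as raw bytes,
    # so nothing is silently re-encoded before the check.
    raw_names = os.listdir(os.fsencode(directory))
    return [name for name in raw_names if not is_valid_utf8(name)]
```

Calling `find_non_utf8_names("/mnt/tank/share")` (the path is just a placeholder) would return the offending names as bytes, which can then be renamed by hand.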

Impact
I’m not sure why one wouldn’t set utf8only=on and normalization=formD by default, but in any case I have not found that doing so causes any issues. The number of people who intentionally create filenames that are not valid UTF-8 must be much smaller than the number who run into problems by accident.

User Story
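Ideally these are set by default; otherwise they should be offered as options when creating a pool or a dataset. Both properties can only be set when a dataset is created and cannot be changed afterwards. The current alternative is for the user to create the pool by hand and then hand dataset creation over to TrueNAS, which is oblivious to the utf8only and normalization settings.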

Setting normalization=formD ensures that clients with two different normalisation schemes (for example, Mac accessing a share via NFS and Windows accessing via SMB) can correctly open each other’s files and cannot create duplicate filenames that look the same but have different normalisations/different bytes.
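To make the "look the same but have different bytes" point concrete, here is a small Python sketch using the standard `unicodedata` module. It approximates a formD comparison with Python's NFD normalization; it is an illustration of the idea, not ZFS's own code, and the helper name `same_name_under_formd` is made up.

```python
import unicodedata

# The same visible name, "café", in the two common normalization forms.
nfc = unicodedata.normalize("NFC", "caf\u00e9")  # é as one precomposed code point
nfd = unicodedata.normalize("NFD", "caf\u00e9")  # e followed by a combining acute accent

nfc_bytes = nfc.encode("utf-8")
nfd_bytes = nfd.encode("utf-8")


def same_name_under_formd(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFD (hypothetical helper)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)


# Without normalization the two names are different byte strings,
# so a filesystem could hold both as separate files.
print(nfc_bytes != nfd_bytes)
# With a normalized comparison they count as the same name.
print(same_name_under_formd(nfc, nfd))
```

With normalization=none, a byte-wise comparison treats these as two distinct filenames; with a normalized comparison they collide, which is the behaviour you want on a share used by mixed clients.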

If one is sure that this can’t happen (for example, an apps dataset, or an SMB-only share, because SMB always uses NFC), then normalization=none can be used. Otherwise, normalization=formD is safer, at the cost of the extra normalization calculations.

This would be a great quality-of-life improvement. Setting utf8only and normalization at zpool create time would streamline deployments, especially in multi-user environments with NFS or SMB shares. Right now, having to script dataset creation afterward feels clunky. I support adding this as a CLI flag or at least exposing it in the API for automation purposes. It makes perfect sense for new installs.


What I do is let TrueNAS create the pool, then run zpool status on the command line to get the disk IDs, and then destroy the pool and recreate it with a script. All to get -O utf8only=on…
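A rough sketch of what the "recreate" step of such a script might look like, written in Python. It only builds and prints the command rather than running it, since zpool create is destructive. The pool name `tank`, the mirror layout, and the disk paths are placeholders, not anything TrueNAS generates, and `build_zpool_create` is a made-up helper; the real disk IDs would come from the earlier zpool status output.

```python
import shlex


def build_zpool_create(pool: str, vdevs: list[str], normalization: str = "formD") -> list[str]:
    """Build an argv list for zpool create with the filename properties set.

    Hypothetical helper: -O sets a filesystem property on the pool's root
    dataset, so child datasets created later inherit it.
    """
    return [
        "zpool", "create",
        "-O", "utf8only=on",
        "-O", f"normalization={normalization}",
        pool,
        *vdevs,
    ]


# Placeholder layout and disk paths; substitute the IDs from zpool status.
cmd = build_zpool_create(
    "tank",
    ["mirror", "/dev/disk/by-id/EXAMPLE-DISK-A", "/dev/disk/by-id/EXAMPLE-DISK-B"],
)

# Print the command for review instead of executing it.
print(shlex.join(cmd))
```

Setting the properties on the root dataset with -O is the point of the exercise: datasets that TrueNAS creates afterwards inherit them, since they cannot be changed later.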