Corrupted disk, ix-apps dataset lost, how to reset Apps?

Hello, I’m repurposing an old HP ProLiant MicroServer with TrueNAS.

SW: ElectricEel-24.10.0.2
HW: AMD Athlon™ II Neo N36L Dual-Core Processor 8GB RAM
I have an SSD for the boot-pool and 3 HDDs for data.

The HDDs are old, and I’ve already ordered new ones. In the meantime I created a test RAIDZ1 pool named main and started an App.

I forgot that this HW requires special kernel boot parameters, without which it generates kernel panics, and so my system froze.

On reboot, 2 of the 3 disks were corrupted and I wiped them.

Now my Apps are stuck, because the system needs the old dataset even to unset it or to forget about the old storage pool.

If I go to the Apps tab I get an “Error in Apps service” message with the following overlay:

Application(s) have failed to start: Failed to umount 'main/ix-apps': [EFAULT] Failed to umount dataset: Dataset main/ix-apps not found

If I click “Configuration → Unset pool” I get this error: [EFAULT] Failed to stop docker service: [EFAULT] Failed to umount dataset: Dataset main/ix-apps not found

with the following stack trace:

concurrent.futures.process._RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs_/dataset_actions.py", line 77, in umount
    with libzfs.ZFS() as zfs:
  File "libzfs.pyx", line 534, in libzfs.ZFS.__exit__
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs_/dataset_actions.py", line 78, in umount
    dataset = zfs.get_dataset(name)
              ^^^^^^^^^^^^^^^^^^^^^
  File "libzfs.pyx", line 1463, in libzfs.ZFS.get_dataset
libzfs.ZFSException: Dataset main/ix-apps not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3.11/concurrent/futures/process.py", line 256, in _process_worker
    r = call_item.fn(*call_item.args, **call_item.kwargs)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 112, in main_worker
    res = MIDDLEWARE._run(*call_args)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 46, in _run
    return self._call(name, serviceobj, methodobj, args, job=job)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 34, in _call
    with Client(f'ws+unix://{MIDDLEWARE_RUN_DIR}/middlewared-internal.sock', py_exceptions=True) as c:
  File "/usr/lib/python3/dist-packages/middlewared/worker.py", line 40, in _call
    return methodobj(*params)
           ^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 183, in nf
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/zfs_/dataset_actions.py", line 82, in umount
    raise CallError(f'Failed to umount dataset: {e}')
middlewared.service_exception.CallError: [EFAULT] Failed to umount dataset: Dataset main/ix-apps not found
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/plugins/docker/update.py", line 90, in do_update
    await self.middleware.call('service.stop', 'docker')
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1626, in call
    return await self._call(
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1457, in _call
    return await methodobj(*prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 179, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 49, in nf
    res = await f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/service.py", line 267, in stop
    await service_object.after_stop()
  File "/usr/lib/python3/dist-packages/middlewared/plugins/service_/services/docker.py", line 72, in after_stop
    await self.mount_umount_ix_apps(False)
  File "/usr/lib/python3/dist-packages/middlewared/plugins/service_/services/docker.py", line 19, in mount_umount_ix_apps
    await self.middleware.call('zfs.dataset.umount', docker_ds, {'force': True})
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1626, in call
    return await self._call(
           ^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1465, in _call
    return await self._call_worker(name, *prepared_call.args)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1471, in _call_worker
    return await self.run_in_proc(main_worker, name, args, job)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1377, in run_in_proc
    return await self.run_in_executor(self.__procpool, method, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/main.py", line 1361, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
middlewared.service_exception.CallError: [EFAULT] Failed to umount dataset: Dataset main/ix-apps not found

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 488, in run
    await self.future
  File "/usr/lib/python3/dist-packages/middlewared/job.py", line 533, in __run_body
    rv = await self.method(*args)
         ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 49, in nf
    res = await f(*args, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/schema/processor.py", line 179, in nf
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3/dist-packages/middlewared/plugins/docker/update.py", line 92, in do_update
    raise CallError(f'Failed to stop docker service: {e}')
middlewared.service_exception.CallError: [EFAULT] Failed to stop docker service: [EFAULT] Failed to umount dataset: Dataset main/ix-apps not found
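Read from the bottom up, the trace shows the chain: unsetting the pool (do_update) first stops docker, stopping docker tries to unmount main/ix-apps, and the unmount raises because the dataset no longer exists. Below is a minimal sketch of that chain and of one way it could tolerate a missing dataset. It is not the middleware code: `CallError` here is a stand-in class, the dict of datasets replaces real ZFS calls, and `stop_docker_strict` / `stop_docker_tolerant` are hypothetical names used only for illustration.

```python
class CallError(Exception):
    """Stand-in for middlewared's CallError (assumption for this sketch)."""


def umount(datasets, name):
    """Unmount `name`; `datasets` maps dataset name -> mounted flag."""
    if name not in datasets:
        # Mirrors the message seen in the trace above.
        raise CallError(f"Failed to umount dataset: Dataset {name} not found")
    datasets[name] = False


def stop_docker_strict(datasets, docker_ds):
    """Behaviour seen in the trace: a missing dataset aborts the stop."""
    try:
        umount(datasets, docker_ds)
    except CallError as e:
        raise CallError(f"Failed to stop docker service: {e}") from e


def stop_docker_tolerant(datasets, docker_ds):
    """Hypothetical variant: skip the unmount if the dataset is gone.

    Returns True if an unmount was performed, False if it was skipped.
    """
    if docker_ds not in datasets:
        return False
    umount(datasets, docker_ds)
    return True


if __name__ == "__main__":
    datasets = {"tank/ix-apps": True}  # main/ix-apps was wiped
    try:
        stop_docker_strict(datasets, "main/ix-apps")
    except CallError as e:
        print("strict:", e)
    print("tolerant:", stop_docker_tolerant(datasets, "main/ix-apps"))
```

With the strict version, the pool can never be unset once the dataset is gone, which matches what the UI reports here.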

Is there a way I can reset the Apps entirely? There is nothing I can recover, and the data was not even worth an attempt.

Thank you

  1. Get yourself a 2nd SSD to use as an Apps pool.

  2. There is no reason that 2 of 3 disks should be corrupt due to an O/S crash. ZFS is specifically designed to survive such events and not get corrupted. What is your storage controller i.e. MB SATA ports or which HBA?

  3. If you get a problem importing a pool on start-up, wiping 2 of the 3 disks is only going to make it impossible to get it working again.

  4. This pool is toast, and I suspect it will be far easier to reinstall TrueNAS than to sort out the mess.

Thank you Protopia.

I don’t care about recovering data. The zpool import showed 2 faulted disks out of 3 and pointed me to the OpenZFS documentation for message ID ZFS-8000-5E.

Because I only had 3 files on the NAS I decided to wipe and recreate. But after reading the drives’ SMART data I decided to remove 2 drives and buy a new set (one has 15 bad sectors and 11 years of power-on time).

  1. Get yourself a 2nd SSD to use as an Apps pool.

How does this solve my problem? If I create an Apps pool from a new drive, and this for some reason gets corrupted (single drive, no redundancy), I’ll end up in the same situation that I have today.

  2. There is no reason that 2 of 3 disks should be corrupt due to an O/S crash. ZFS is specifically designed to survive such events and not get corrupted. What is your storage controller i.e. MB SATA ports or which HBA?

The MB has a 4-bay SATA enclosure. It has a RAID controller of some sort, but I have it disabled in the BIOS and am using it in AHCI mode.

As I said, some of those drives are more than 10 years old, and one counted 11 years of power-on time.

  3. If you get a problem importing a pool on start-up, wiping 2 of the 3 disks is only going to make it impossible to get it working again.

The pool import already told me the pool was beyond hope. I had no data that I cared about on those disks, so I went for a full wipe. The problem is that now the Apps feature (not the Apps’ data) is not recoverable.

  4. This pool is toast, and I suspect it will be far easier to reinstall TrueNAS than to sort out the mess.

While I do agree on that, I think that this is a major usability issue. Does this mean that if the Apps pool is lost I can only reinstall TrueNAS? There is no reason the system could not recover from a blank state.

This seems to be either a bug or a skill issue on my side (not knowing how to reset the Apps).

I believe iX agrees, and they are addressing this or at least some of it in an upcoming release.


Thank you.

I’m not familiar with the release cycle.

Given that pull request #14907 in truenas/middleware (“NAS-132065 / 24.10.1 / Allow unsetting/changing pool if user accidentally nuked ix-apps in EE”, by sonicaj) is merged, is there an estimate of when this should be released?

I guess I could try to locate that file on disk and patch my current system.

Glad I’m not the only one who nuked his ix-apps dataset :laughing:
Btw, I couldn’t wait, so I did a fresh install.


I don’t recommend doing manual edits to system files.

If things go as planned, the fix is scheduled for release in 24.10.1, which is anticipated in less than 2 weeks.
You can see that on the right-hand side in Jira, where it says “Fix versions”.

Thank you.

I guess I’ll update or reinstall when the new HDDs are delivered, depending on which happens first.

Any updates regarding this thread? I’ve also encountered this… what do I do now?

I’ve received my new HDDs, so I went for a reinstall.

I’m sorry

I got a new server, installed SCALE on it, imported my TrueNAS configuration, created my pool, and it still happens… How did you reinstall to solve it?

I didn’t import the configuration. I guess the apps dataset is recorded in the configuration, so importing it brings the stale reference back.
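If that guess is right, an imported configuration would keep pointing Apps at a pool that does not exist on the new system. A minimal sketch of how one might check for that mismatch, under stated assumptions: the Apps configuration is a dict with a `pool` key (for example as returned by `midclt call docker.config`; the key name is an assumption here), and the imported pool names come from the output of `zpool list -H -o name`. The helper names are hypothetical.

```python
def parse_pool_names(zpool_list_output):
    """Parse the output of `zpool list -H -o name` (one pool name per line)."""
    return [line.strip() for line in zpool_list_output.splitlines() if line.strip()]


def missing_apps_pool(docker_config, imported_pools):
    """Return the configured Apps pool if it is not imported, else None.

    `docker_config` is assumed to carry the pool name under the key "pool".
    """
    pool = docker_config.get("pool")
    if pool and pool not in imported_pools:
        return pool
    return None


if __name__ == "__main__":
    # Example values only: a config that still names the wiped pool "main".
    config = {"pool": "main"}
    pools = parse_pool_names("boot-pool\ntank\n")
    stale = missing_apps_pool(config, pools)
    if stale:
        print(f"Apps are configured for pool '{stale}', which is not imported")
```

This only detects the stale reference; it does not change any setting.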
