CORE to SCALE upgrade failed

Hi,
Several upgrades went fine, but two machines have now failed with this error:

middlewared.service_exception.CallError: [EFAULT] Command ['grub-mkconfig', '-o', '/boot/grub/grub.cfg'] failed with exit code 1: /usr/local/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?).

Both servers are HPE ProLiant machines. Could that be the issue?
Upgrading from CORE 13.0-U6.2 to SCALE 24.04.2.3.

I found this old topic, so it looks like I am not the only one with this problem, but that thread goes nowhere.

Thanks for looking into this.

Full trace

[2024/10/11 14:58:44] (ERROR) middlewared.job.run():367 - Job <bound method accepts.<locals>.wrap.<locals>.nf of <middlewared.plugins.update.UpdateService object at 0x819d9e2b0>> failed
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 355, in run
    await self.future
  File "/usr/local/lib/python3.9/site-packages/middlewared/job.py", line 391, in __run_body
    rv = await self.method(*([self] + args))
  File "/usr/local/lib/python3.9/site-packages/middlewared/schema.py", line 981, in nf
    return await f(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/update.py", line 389, in file
    await self.middleware.call('update.install_manual_impl', job, destfile, dest_extracted)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1283, in call
    return await self._call(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1251, in _call
    return await self.run_in_executor(prepared_call.executor, methodobj, *prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1156, in run_in_executor
    return await loop.run_in_executor(pool, functools.partial(method, *args, **kwargs))
  File "/usr/local/lib/python3.9/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/update_/install_freebsd.py", line 66, in install_manual_impl
    return self._install_scale(job, path)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/update_/install_freebsd.py", line 95, in _install_scale
    return self.middleware.call_sync(
  File "/usr/local/lib/python3.9/site-packages/middlewared/main.py", line 1310, in call_sync
    return methodobj(*prepared_call.args)
  File "/usr/local/lib/python3.9/site-packages/middlewared/plugins/update_/install.py", line 69, in install_scale
    raise CallError(error)
middlewared.service_exception.CallError: [EFAULT] Command ['grub-mkconfig', '-o', '/boot/grub/grub.cfg'] failed with exit code 1: /usr/local/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?).

So you’ve upgraded other machines using the same process?
How do the machines that upgraded cleanly differ from the ones with the problem?

Can you document the process used…
I assume no jails or VMs?

This is hardly a process at all…

It goes through several upgrade stages

  • uploading
  • extracting (short)
  • verifying
  • creating dataset
  • extracting again (long)
  • performing post-install tasks (75%)
  • failing with message
    [EFAULT] Command ['grub-mkconfig', '-o', '/boot/grub/grub.cfg'] failed with exit code 1: /usr/local/sbin/grub-probe: error: cannot find a device for / (is /dev mounted?).

No jails, no VMs.
System is booted from USB flash. I know that’s an inferior choice but it had not been an obstacle before and should work.

While I’ve seen oddness from HPE and other vendors, it usually manifests as a “we refuse to boot from a non-Windows EFI blob” on consumer machines. ProLiant servers tend to be more of a “we’re using a RAID controller, and it doesn’t like to revert to HBA mode.”

We’ve been recommending against USB flash (“thumbdrives”) as a boot device for some time due to their lower endurance. It’s entirely possible your drive has worn out at this point.

In your scenario I would back up your configuration before performing any further troubleshooting or attempts to upgrade - that way in case your boot media has failed, you can easily restore onto a new one.
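If you want a copy in addition to the config file you can download from the web UI, something like the following could be run from a shell on the CORE box. This is only a rough sketch: the file names `freenas-v1.db` and `pwenc_secret` under `/data` are my assumption about where CORE keeps its configuration database and secret seed, and `backup_config` is a made-up helper name, so check the paths on your own system first.

```python
import shutil
import time
from pathlib import Path

# Assumed names of the CORE config database and secret seed (normally
# expected under /data); verify these on your own system.
CONFIG_FILES = ("freenas-v1.db", "pwenc_secret")


def backup_config(data_dir, dest_root):
    """Copy whichever config files exist in data_dir into a timestamped folder."""
    sources = [Path(data_dir) / name for name in CONFIG_FILES]
    present = [src for src in sources if src.is_file()]
    if not present:
        raise FileNotFoundError(f"no config files found in {data_dir}")

    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / f"truenas-config-{stamp}"
    dest.mkdir(parents=True, exist_ok=False)

    copied = []
    for src in present:
        # copy2 keeps timestamps and permissions along with the contents
        shutil.copy2(src, dest / src.name)
        copied.append(src.name)
    return dest, copied
```

For example, `backup_config("/data", "/mnt/tank/backups")` would write the copies to a dataset on a data pool (here a hypothetical pool called `tank`) rather than to the suspect USB stick.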


I’d recommend backing up your config and trying a fresh install first. I don’t know when your CORE was originally installed, but it may be something from antiquity that needs a clean load to resolve.