Installation failed on eMMC Odroid H4+

Hi

I’m trying to install TrueNAS EE on Odroid H4+ on eMMC (256GB).
This is a fresh installation - new hardware.
I’m getting this error:

Failed to find partition number 2 on mmcblk0

Anyone know how to proceed with TrueNAS installation?

Hey @mykindofit

I’ll have to go digging for the other time I saw this happen, but if I recall correctly the mmcblkN devices on the ODROID hardware present the same unique identifiers.

Are you getting this error during install, or post-install on first boot?

This is the error I’m getting during the install.

Have you found a solution to this?
I am having the same issue.

I had the same issue with 24.10. I downloaded 24.04 and it installed. I upgraded to the latest 24.04 and then to 24.10. So far so good.

Thank God I found this thread. I had the same issue and followed the same sequence: I installed the Dragonfish version (24.04) first, which went through without the error message, and later updated to the latest 24.10.

Sorry for reviving this topic, but I wanted to clarify which version of 24 you used to install on the eMMC drive. Since 24.04 has several variants, do you think using 24.04.2.5 will work, or should I opt for 24.04.0 instead?

Also, if you don’t mind: how big is the eMMC module, and did you create a swap partition when installing TN? I’m thinking of getting one pretty soon.

Stumbled on the same error while installing TrueNAS 25.04 on a GMKTec NucBox G9 NAS.
It turned out that the error was due to the boot partitions mmcblk0boot0 and mmcblk0boot1 being switched to read-only mode.
I fixed the issue by following the recipe in the kernel.org note mmc-dev-parts.txt

In the boot menu select shell

List your block devices

$ lsblk
NAME         MAJ:MIN RM  SIZE RO TYPE MOUNTPOINTS
...
mmcblk0      179:0    0 58.2G  0 disk 
├─mmcblk0p1  179:1    0    1M  0 part 
├─mmcblk0p2  179:2    0  512M  0 part 
└─mmcblk0p3  179:3    0 57.7G  0 part 
mmcblk0boot0 179:256  0    4M  1 disk 
mmcblk0boot1 179:512  0    4M  1 disk 
...

Disable read-only access

$ echo 0 > /sys/block/mmcblk0boot0/force_ro
$ echo 0 > /sys/block/mmcblk0boot1/force_ro

then restart and run the installation once again.

I tried the above and it did not work for me, using the same hardware and software. I also tried ElectricEel-24.10.2.1, which eventually worked after repeating the install process (and its subsequent failures) for 10 or 11 attempts. I found this link, which helped with the eventual success:

I tried all the “workarounds” listed, but after a few attempts on #1, didn’t think much of it… After many attempts at #2 I gave up and went to #3, and still nothing. Something in my brain said try #1 again MORE than 10 times, and then it worked - go figure…

FYI in workaround #2, you need to remove certain SPACES from the command:
Was:

sgdisk -a 4096 -n 1:0:+1024K   -t 1:EF02 -A 1:set:2 /dev/mmcblk0

Should be:

sgdisk -a4096 -n1:0:+1024K   -t1:EF02 -A1:set:2 /dev/mmcblk0

soz - could not find a way to advise the OP of the link through the Atlassian site…

TL;DR: Look at the “Workaround” section at the bottom.

I think I have figured out the cause of the problem and a reliable workaround thanks to the previous research by

  • Oleksandr Liutyi, author of
  • @seekee (for finding Liutyi’s post), and
  • @dameb10129 (for bringing my focus to the /sys/block system).

Issue

I am also running an Odroid H4+ with an eMMC (64GB) and encountered exactly the same error behaviour as described by OP while trying to install TrueNAS SCALE 25.04.0.

Neither retrying (~30 times in total) nor the other workarounds suggested by dameb and Liutyi helped in my case.

Investigation

During my investigation I primarily looked at the python snippet identified by Liutyi (format_disk function at truenas-installer/truenas_installer/install.py at TS-25.04.0 · truenas/truenas-installer · GitHub) and the function get_partitions called from this code site (truenas-installer/truenas_installer/utils.py at TS-25.04.0 · truenas/truenas-installer · GitHub).

From what I can tell, it’s a combination of two things: the long time it takes the new partitions to show up in /dev (which I have not observed directly but rather inferred from my reading of the code and the system behaviour during my debugging attempts), and the MMC boot partitions (specific to eMMC devices), which evidently were not anticipated when the installer was implemented.

So the format_disk function looks fine - it relies on the get_partitions function to do its job and wait for up to 30s to find the correct partition devices - neither of which it accomplishes in this case.

The issues seem to lie in the main loop of the get_partitions function:

for _try in range(tries):
    if all((disk_partitions[i] is not None for i in disk_partitions)):
        # all partitions were found on disk
        return disk_partitions

    try:
        with os.scandir(f"/sys/block/{device}") as dir_contents:
            for partdir in filter(lambda x: x.is_dir() and x.name.startswith(device), dir_contents):
                with open(os.path.join(partdir.path, 'partition')) as f:
                    try:
                        _part = int(f.read().strip())
                        if _part in partitions:
                            # looks like {1: '/dev/sda1', 2: '/dev/nvme0n1p2'}
                            disk_partitions[_part] = f'/dev/{partdir.name}'
                    except (OSError, ValueError):
                        # OSError: [Errno 19] No such device was seen on
                        # our internal CI/CD infrastructure for reasons
                        # not understood...
                        continue
    except FileNotFoundError:
        continue

    await asyncio.sleep(1)

A short context:
  • tries is the number of attempts, i.e. 30
  • disk_partitions is a dictionary that is initialized as {1: None, 2: None, 3: None} in our case;
    it’s supposed to end up as {1: "/dev/mmcblk0p1", 2: "/dev/mmcblk0p2", 3: "/dev/mmcblk0p3"}
  • device is a string - the base device name, in our case mmcblk0
  • partitions is simply a numeric list of the partitions we are looking for - basically the keys of disk_partitions, i.e. [1,2,3]

Problems in the loop

The problem here is that for eMMCs with boot partitions there are /sys/block/mmcblkX/mmcblkXbootZ paths in addition to, and independent of, the normal, expected /sys/block/mmcblkX/mmcblkXpY paths.
These do not contain a partition file, which the inner loop iterating over all /sys/block/mmcblkX/mmcblkX* entries expects!
Unfortunately the open call that tries to read the /sys/.../partition file is not protected by the inner try-except, which would only skip the current /sys/block/mmcblkX/mmcblkXbootZ entry and let the loop eventually reach the .../mmcblkXpY paths it expects.
Instead, it is only protected by the outer try-except, which is probably intended as a catch-all in case /sys/block/mmcblkX is completely missing for whatever reason. Its continue skips the await asyncio.sleep(1) statement, so the outer loop burns through all its retries almost immediately.

So whether the loop completes its mission successfully is semi-random: it depends on the order in which os.scandir returns the directory entries (favorable case: all mmcblkXpY first, all mmcblkXbootZ later).
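
To illustrate the order dependence, here is a minimal model of a single pass of the scan. It is only a sketch and not the installer's code: scan_once and its parameters are hypothetical names, the directory listing is passed in as a plain list so the order can be controlled, and a missing dictionary key stands in for a missing partition file.

```python
def scan_once(entries, partition_files, wanted):
    """Model one pass over /sys/block/<device> (hypothetical helper).

    entries: directory names in the order a scan might return them.
    partition_files: name -> contents of its 'partition' file; boot areas
        such as mmcblk0boot0 have no such file and are therefore absent.
    wanted: the partition numbers being looked for.
    """
    found = {number: None for number in wanted}
    try:
        for name in entries:
            if name not in partition_files:
                # Stands in for open(".../partition") failing on a boot area.
                raise FileNotFoundError(name)
            number = int(partition_files[name])
            if number in found:
                found[number] = f"/dev/{name}"
    except FileNotFoundError:
        # Like the outer handler: the rest of this pass is abandoned.
        pass
    return found
```

With the partitions listed first, all three are found before the first boot area aborts the pass. With a boot area listed first, nothing is found, no matter how often the pass is repeated with the same order.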

The fallback can’t help

There is actually a backup code path should the loop fail. This one scans the /dev directory instead of looking in /sys/....
Though from analyzing the code it seems that this one would also get confused by the /dev/mmcblkXbootZ entries and even misidentify them as the /dev/mmcblkXpY it’s actually looking for, in case Z and Y are the same, which is the case on my system for boot1 and p1.
Also due to the missing sleep statements in the prior loop it seems that /dev is not populated by the time this code path is reached.
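
For comparison, a stricter name match would not mix up boot1 and p1. The following is a sketch only, not the installer's fallback code: partition_number is a hypothetical helper. It relies on the kernel's naming convention that a "p" separator is inserted when the disk name ends in a digit (mmcblk0p1, nvme0n1p2) and omitted otherwise (sda1).

```python
import re


def partition_number(device, node_name):
    """Return the partition number if node_name is a partition of device, else None.

    Hypothetical helper for illustration; not part of truenas_installer.
    """
    # A 'p' separates disk name and partition number only when the disk
    # name itself ends in a digit (mmcblk0 -> mmcblk0p1, sda -> sda1).
    separator = "p" if device[-1].isdigit() else ""
    match = re.fullmatch(re.escape(device) + separator + r"(\d+)", node_name)
    return int(match.group(1)) if match else None
```

Under this rule mmcblk0boot1 is rejected outright instead of being counted as partition 1, while mmcblk0p1, sda1 and nvme0n1p2 are still recognized.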

Conclusion

To resolve this issue permanently, changes to the installer seem to be unavoidable.

Fixing the installer

Since I am not familiar with the code base and my knowledge of Linux internals is very limited, my suggestions will stay vague.

A fix might include:

  • Making sure that the sleep interval between retries is respected even when exceptions occur
  • Making sure that all the paths /sys/block/{device_name}/{device_name}* are traversed even if some do not behave as expected
  • Potentially: Explicit handling for eMMC by skipping boot partitions (and maybe other device partitions if they exist)
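
A rough sketch of what a loop following these points could look like. This is not a patch against the actual installer and has not been tested in it: find_partitions, sysfs_root and delay are names invented here (sysfs_root only exists so the function can be pointed at a test directory), and the real get_partitions may need to behave differently in details.

```python
import asyncio
import os


async def find_partitions(device, partitions, tries=30, sysfs_root="/sys/block", delay=1.0):
    """Hypothetical variant of the scan loop; returns {number: path or None}."""
    found = {number: None for number in partitions}
    for _try in range(tries):
        try:
            with os.scandir(os.path.join(sysfs_root, device)) as dir_contents:
                for entry in dir_contents:
                    if not (entry.is_dir() and entry.name.startswith(device)):
                        continue
                    try:
                        # open() now sits inside the inner try block.
                        with open(os.path.join(entry.path, "partition")) as f:
                            number = int(f.read().strip())
                    except (OSError, ValueError):
                        # No readable 'partition' file (e.g. mmcblk0boot0):
                        # skip this entry only and keep scanning the rest.
                        continue
                    if number in found:
                        found[number] = f"/dev/{entry.name}"
        except FileNotFoundError:
            # Device directory not there yet; fall through to the sleep.
            pass
        if all(path is not None for path in found.values()):
            return found
        # Reached on every unsuccessful pass, including after exceptions.
        await asyncio.sleep(delay)
    return found
```

Since FileNotFoundError is a subclass of OSError, the inner handler covers the missing partition file of the boot areas, so they no longer end the pass early.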

Workaround

For now I have found a workaround that allowed me to install TrueNAS successfully:

  1. Open the Shell in the installer

  2. Patch /lib/python3/dist-packages/truenas_installer/utils.py by:

    moving the line await asyncio.sleep(1) right beneath for _try in range(tries):

  3. Hide the /dev/mmcblk0boot* devices that confuse the “last resort” path of the installer by:

    running rm -f /dev/mmcblk0boot0 and rm -f /dev/mmcblk0boot1

  4. Exit the shell and run the installer as normal (without rebooting in between)
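
Step 2 can also be done with a few lines of Python instead of a text editor. A minimal sketch, assuming the loop in utils.py looks like the excerpt quoted above; move_sleep_to_loop_top is a hypothetical helper and not part of the installer. Keep a copy of the original file before overwriting it.

```python
def move_sleep_to_loop_top(source):
    """Return source with the sleep line moved directly below the for line.

    Hypothetical helper; assumes both lines appear exactly as in the
    excerpt and that the sleep line is part of the loop body.
    """
    loop_line = "for _try in range(tries):"
    sleep_line = "await asyncio.sleep(1)"
    lines = source.splitlines(keepends=True)
    # First occurrence of the loop header, then the first sleep after it.
    loop_index = next(i for i, line in enumerate(lines) if line.strip() == loop_line)
    sleep_index = next(
        i for i, line in enumerate(lines)
        if i > loop_index and line.strip() == sleep_line
    )
    sleep_text = lines.pop(sleep_index)
    if not sleep_text.endswith("\n"):
        sleep_text += "\n"
    # The sleep is already indented as loop body, so it can be reinserted
    # unchanged as the first statement of the loop.
    lines.insert(loop_index + 1, sleep_text)
    return "".join(lines)
```

To apply it, read utils.py into a string, pass it through the function, and write the result back. If either line is not found, next() raises StopIteration before anything is returned, so nothing gets written.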