TrueNAS CORE 12.0-U8.1: one of the USB flash drives is DEGRADED in the boot pool

Hi All

My company runs TrueNAS CORE 12.0-U8.1 on a Dell R710 server.
It was installed by a former IT employee who has since left.
The boot pool has two USB flash drives (da6p2 and da7p2), one on the internal USB port on the motherboard and one on an external USB port, but I don't know which drive is on which port!

Recently, the system reported CRITICAL alerts about the boot pool.
I ran # zpool scrub boot-pool and got the following result:

  pool: boot-pool
 state: DEGRADED
status: One or more devices has experienced an unrecoverable error.  An
        attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: Message ID: ZFS-8000-9P (OpenZFS documentation)
  scan: scrub in progress since Mon Jul 8 10:00:09 2024
        2.39G scanned at 3.21M/s, 1.13G issued at 1.52M/s, 2.39G total
        3.61M repaired, 47.21% done, no estimated completion time
config:

        NAME         STATE     READ WRITE CKSUM
        boot-pool    DEGRADED     0     0     0
          mirror-0   DEGRADED     0     0     0
            da7p2    ONLINE       0     0     0
            da6p2    DEGRADED     0     0    66  too many errors  (repairing)

I assume da6p2 is corrupted.
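For readers who want to pick the failing member out of output like this without reading the columns by eye, a small parser can help. The following is only a minimal sketch, not a TrueNAS tool: it assumes the NAME/STATE/READ/WRITE/CKSUM table layout shown in this post and plain integer counters (rows with abbreviated counters are simply skipped). SAMPLE_STATUS, parse_config and devices_with_errors are names made up for illustration; the sample text is copied from the output above.

```python
# Sample text copied from the status output in this post. On a live system
# you would capture the text printed by `zpool status boot-pool` instead.
SAMPLE_STATUS = """\
config:

NAME STATE READ WRITE CKSUM
boot-pool DEGRADED 0 0 0
mirror-0 DEGRADED 0 0 0
da7p2 ONLINE 0 0 0
da6p2 DEGRADED 0 0 66 too many errors (repairing)
"""


def parse_config(status_text):
    """Return one dict per row of the config table (hypothetical helper)."""
    rows = []
    in_table = False
    for line in status_text.splitlines():
        fields = line.split()  # split() ignores the indentation of each row
        if fields[:5] == ["NAME", "STATE", "READ", "WRITE", "CKSUM"]:
            in_table = True
            continue
        if not in_table:
            continue
        if len(fields) < 5:
            break  # a blank or short line means the table has ended
        name, state, read, write, cksum = fields[:5]
        if not (read.isdigit() and write.isdigit() and cksum.isdigit()):
            continue  # assumption: only plain integer counters are handled
        rows.append({
            "name": name,
            "state": state,
            "read": int(read),
            "write": int(write),
            "cksum": int(cksum),
            "note": " ".join(fields[5:]),
        })
    return rows


def devices_with_errors(rows):
    """Keep only the rows whose READ, WRITE or CKSUM counter is non-zero."""
    return [r for r in rows if r["read"] or r["write"] or r["cksum"]]


for row in devices_with_errors(parse_config(SAMPLE_STATUS)):
    print(row["name"], row["state"], "cksum =", row["cksum"], row["note"])
```

Filtering on the counters rather than on the STATE column is a deliberate choice here: the pool and the mirror are also reported as DEGRADED, but only the member with non-zero counters is the one to look at.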
My questions:

  1. Can either ONE of the USB flash drives (da6p2 or da7p2) boot on its own from the UEFI BIOS?
  2. How can I identify which installed USB flash drive is da6p2?
    (Note: I am thinking of the following procedure:
    1. Run # zpool replace for the DEGRADED da6p2
    2. Remove the EXTERNAL USB flash drive
    3. Reboot the system
    4. After boot, run # zpool status to check which one (da6p2/da7p2) remains.
    A. Is this procedure OK?
    B. If the DEGRADED da6p2 is the one that remains, is there any risk?)
  3. If I purchase two NEW USB flash drives, is there any document/hint/instruction on replacing both USB drives, or only the DEGRADED da6p2 drive?
    (If replacing both: DEGRADED da6p2 first, rebuild, then replace the other, ONLINE da7p2?)
  4. If I purchase two NEW USB flash drives, is it possible to install the latest TrueNAS SCALE and import the existing data pool?
    Or is the latest TrueNAS CORE recommended?

I apologize for asking so many questions. Please help.

Thanks a lot.

First, make a backup of your configuration using the web interface and put the copies somewhere safe in case you need to start from a fresh install. You will also need to back up any encryption keys for pools. You might also want to download the ISO for your current TrueNAS CORE version (12.0-U8.1), just to be safe and have it on hand.
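Since the config backup is what makes a reinstall painless, it can be worth confirming that the copies you stash away are byte-identical to the file you downloaded. Here is a minimal Python sketch for that, assuming nothing about the backup format itself; the function names sha256_of and copies_match and the file names in the usage comment are hypothetical.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(original, *copies):
    """True if every copy has the same SHA-256 digest as the original."""
    reference = sha256_of(original)
    return all(sha256_of(copy) == reference for copy in copies)


# Usage (hypothetical paths, adjust to wherever you saved the files):
# copies_match("truenas-config.db", "/mnt/usb/truenas-config.db")
```

Writing the digest down next to each copy also lets you check the file again later, right before you actually need to upload it.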

Post your whole configuration and the specs of the server. It helps a lot when giving advice on your situation.

You should probably just buy two new, good-quality USB drives. Refer to the TrueNAS documentation for the latest CORE and SCALE versions, but I think you would be safe with 64 GB:
https://www.truenas.com/docs/

I am assuming the boot drives are currently mirrored; please verify. Once you figure out which drive is the bad one, just follow the documentation for replacing a drive. You can then go through the process of replacing the failing USB drive with a new one using the correct procedure, and the system will mirror them again. Repeat the same procedure to swap out the older, good USB drive for a new one.
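On figuring out which physical stick is da6: on FreeBSD-based CORE, camcontrol devlist prints one line per attached device with its model string and device names, which helps when the two sticks are different models. Below is a minimal Python sketch that turns such a listing into a device-to-model lookup. The sample listing is entirely hypothetical (the model strings and bus numbers are invented, not taken from this server), and map_devices is a made-up helper name.

```python
import re

# HYPOTHETICAL sample in the usual `camcontrol devlist` line format.
# The model strings and scbus numbers are invented for illustration only.
SAMPLE_DEVLIST = """\
<HYPOTHETICAL StickA 1.00>         at scbus7 target 0 lun 0 (da6,pass6)
<HYPOTHETICAL StickB 1.00>         at scbus8 target 0 lun 0 (pass7,da7)
"""

LINE_RE = re.compile(
    r"^<(?P<model>[^>]*)>\s+at\s+scbus\d+\s+target\s+\d+\s+lun\s+\S+\s+"
    r"\((?P<devs>[^)]*)\)\s*$"
)


def map_devices(devlist_text):
    """Map each non-pass device name (e.g. 'da6') to its model string."""
    mapping = {}
    for line in devlist_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        for dev in match.group("devs").split(","):
            dev = dev.strip()
            if dev and not dev.startswith("pass"):
                mapping[dev] = match.group("model").strip()
    return mapping


devices = map_devices(SAMPLE_DEVLIST)
print("da6 is:", devices.get("da6", "not found"))
```

Two caveats, both to be checked on your own system. If both sticks report the same model string, this cannot tell them apart, and you would need a serial number (camcontrol inquiry da6 -S may print one, but not every USB stick reports it) or a physical label. Also, da numbers are assigned as devices are probed, so after pulling one stick and rebooting, the remaining one may not keep its old number.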

If you want to upgrade the CORE version or change to SCALE, read the documentation; the usual path is to update version 12 to the latest release in that series and then update to version 13. Read the docs on what is required for a CORE to SCALE upgrade.

From what I have read, SCALE writes a lot to the boot pool, so USB sticks are not a good idea there.
With CORE, get a good MLC stick (32 GB is enough), or use an SSD with a SATA-to-USB adapter.

Thank you all for the responses.

About the server hardware:

  1. CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (2394.05-MHz K8-class CPU) * 2
  2. Real memory = 51539607552 (49152 MB)
  3. Six SATA/SAS drive bays with an HBA card, all populated with 3.5-inch SATA HDDs in raidz2-0 for company data storage
  4. One internal USB port (1st USB flash drive of the boot-pool attached), two external USB ports (2nd USB flash drive of the boot-pool attached)
  5. One SATA port for the optical drive (not a standard connector) at the front panel, never used (reuse plan below)

Combining the suggestions from @SmallBarky and @oxyde, my current plan is as follows.
1. Purchase two industrial-grade 32 GB USB flash drives, such as the JetFlash 270M
2. Replace the old USB flash drives with the new ones.
3. Fresh install the latest TrueNAS CORE 13.0-U6.2 from ISO with a mirrored boot-pool
4. Configure the network interface for the web interface
5. Upload the configuration from the current TrueNAS CORE 12.0-U8.1
I assume the existing raidz2-0 data pool for company data storage will become active automatically and be ready for use.
Note: If I encounter an error in the above procedure, I will go back to the old problematic USB flash drives and follow the advice below from @SmallBarky:

I am assuming the boot drives are currently mirrored; please verify. Once you figure out which drive is the bad one, just follow the documentation for replacing a drive. You can then go through the process of replacing the failing USB drive with a new one using the correct procedure, and the system will mirror them again. Repeat the same procedure to swap out the older, good USB drive for a new one.

Please correct any mistakes in the above plan or suggest improvements.

Next plan: replace the optical drive mentioned above with a SATA SSD.

  1. Purchase a second-drive SATA HDD/SSD caddy (frame tray) for Dell PowerEdge R910/R710/R510/R520/R410/R310 and a 32 GB MLC SSD
  2. Add it to the boot-pool manually.
  3. (Optional) Remove the external USB flash drive and keep it as a spare.

Thanks a lot.

As long as you have the config backup, and in any case the same version of the current OS (CORE 12.0-U8.1), you can do a fresh reinstall any number of times without losing anything :slight_smile:
The only thing… I don't know whether uploading the CORE 12.0-U8.1 config into the latest version will work properly. I have never tried it (only from one minor release to another, without problems).
Only in my personal opinion: given that with the configuration backup and an ISO it is possible to quickly restore the system, that I don't have a "location" problem with physically changing a disk, and that the world doesn't end if the server is down for a few hours :smiley: … I opted for a simple stripe boot pool. I keep a second SSD (or at least always a USB stick) ready for a fast install in case something bad happens, but nothing more. Also, look at this if you don't know it, really useful!!!

I would keep the same version of TrueNAS you have right now and just work through the process of replacing the failing USB drive, letting it resilver onto the new one, and then doing the process again to replace the older, working USB drive.
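The important part of doing it one drive at a time is not touching the second drive until the first resilver has finished cleanly. As a rough illustration, here is a minimal Python sketch that reads zpool status text and answers "is it safe to start the next replacement?". It assumes the usual "scan:" and "state:" lines; the in-progress sample is taken from the output earlier in this thread, while the finished sample (including its time and date) is hypothetical. The function names are made up.

```python
import re

# From the status output posted earlier in this thread.
IN_PROGRESS = """\
state: DEGRADED
scan: scrub in progress since Mon Jul 8 10:00:09 2024
"""

# HYPOTHETICAL example of what a finished resilver might look like.
FINISHED = """\
state: ONLINE
scan: resilvered 2.39G in 00:26:10 with 0 errors on Mon Jul 8 10:26:19 2024
"""


def field(status_text, key):
    """Return the text after 'key:' on the first line that starts with it."""
    prefix = key + ":"
    for line in status_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(prefix):
            return stripped[len(prefix):].strip()
    return ""


def ready_for_next_replacement(status_text):
    """True only if a resilver finished with 0 errors and the pool is ONLINE."""
    scan = field(status_text, "scan")
    if "in progress" in scan:
        return False  # a scrub or resilver is still running
    finished = re.search(r"^resilvered .* with (\d+) errors", scan)
    if not finished or finished.group(1) != "0":
        return False
    return field(status_text, "state") == "ONLINE"


print(ready_for_next_replacement(IN_PROGRESS))
print(ready_for_next_replacement(FINISHED))
```

The check is intentionally strict: anything it does not recognise counts as "not ready", which is the safer default when the other half of the mirror is the only good copy.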

You can work out later if and when to update TrueNAS CORE, and whether to upgrade to SCALE. You probably want to work on a plan for upgrading your CORE version just to get all the latest security fixes. There was a fix for SSH access just recently.

Hi @SmallBarky and @oxyde

Current status of boot-pool: the 1st (ONLINE) device now has one CKSUM error. :sweat_smile:

        NAME         STATE     READ WRITE CKSUM
        boot-pool    DEGRADED     0     0     0
          mirror-0   DEGRADED     0     0     0
            da7p2    ONLINE       0     0     1
            da6p2    DEGRADED     0     0    95  too many errors

The plan has changed as follows…

  1. Replace the two bad USB flash drives with two new ones by resilvering one at a time.
    Starting from the 1st (da7p2 ONLINE)…
  2. Test whether the two new USB flash drives can boot from the BIOS.
    If neither can boot, fresh install the same version of TrueNAS from ISO.
  3. Update to TrueNAS CORE 13.0-U6.2 in place (I need to study the documentation first).

The NAS will stay on TrueNAS CORE permanently.

In addition, @oxyde, you mentioned above…

I opted for a simple stripe boot pool. I keep a second SSD (or at least always a USB stick) ready for a fast install in case something bad happens, but nothing more

Please explain this in more detail :smiley:

Thanks a lot.

As I said, it is just my opinion about the need for redundancy on the boot pool.
I think it is a really good choice to mirror the boot pool when, for example:

  • physically changing a disk is hard
  • you can't keep your server down for long (the installation time) if the boot pool fails,
    because just restarting and changing the boot priority brings you back online very fast.

At the same time, if those two conditions are not relevant… you just waste a disk/a port.
With a fresh install on a new disk and an upload of the config backup, you will be back online in a reasonable time anyway.

I think I will take the opportunity of the server downtime to test whether the resilvered drives can boot :laughing: