Input / Output Error when Creating New Pool

The Error:
TrueNAS is throwing the error “Error: [EFAULT] Failed to wipe disk sda: [Errno 5] Input/output error” when trying to create a pool through the Storage > Pool Creation Wizard. Disk “sda” is just one of many disks that have popped up with this error.


The Hardware Setup:
I am running TrueNAS Scale v24.04.2 on an HPE ProLiant 325 Gen 10 server with 2x LSI 9206-16e Quad Port HBAs. Attached to those HBAs are 3x Newisys NDS-4600 JBODs, with 60 disks per enclosure - totaling 180 HDDs. Each disk is a 6TB Seagate Exos 7e8 SAS disk, model number: ST6000NM0195.

My desired pool configuration is:
Data: 18 x RAIDZ2 | 10 x 5.46 TB (HDD)
Log: 1 x STRIPE | 1 x 372.61 GB (SSD)
Cache: 1 x 2.91 TB (SSD)
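
(For clarity, the rough zpool create equivalent of that layout would be something like the following - device names are placeholders and only the first two of the eighteen data vDevs are shown; I'm actually building this through the wizard, not the CLI.)

    # Illustrative sketch only: placeholder device names, first 2 of 18 RAIDZ2 data vDevs shown
    zpool create tank \
      raidz2 sda sdb sdc sdd sde sdf sdg sdh sdi sdj \
      raidz2 sdk sdl sdm sdn sdo sdp sdq sdr sds sdt \
      log sddy \
      cache sddz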

Unsuccessful Solutions:
I have confirmed my disks are SEDs (Self-Encrypting Disks), but according to a post I read earlier, TrueNAS has worked with SED disks since version 11 or 12. I have tried using wipefs on the disks to ensure they are completely empty and still run into this issue.
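
For reference, the wipe attempt per disk was along these lines (run against each device; “sda” here is just an example):

    # Remove all filesystem / RAID / partition-table signatures from the disk
    sudo wipefs -a /dev/sda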

Current Rabbit Hole:
Using the iLO to view the host console, I see an error popping up for multiple disks which states: “Sense Key: Data Protect”, “Add Sense: Access Denied - No Access Rights”. I’m not familiar enough with Linux or TrueNAS to know if this is an actual issue or a red herring, but it would make sense for TrueNAS to throw an error while trying to wipe a disk if a portion of the disk is protected. Seagate defines the “Data Protect” sense key in this chart here: SCSI Sense Key Chart | Seagate US
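
I believe the same messages should also be visible in the kernel log from a TrueNAS shell, e.g.:

    # Look for the SCSI sense errors in the kernel log
    sudo dmesg | grep -iE "sense key|access denied"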

Ultimately, I need to solve the TrueNAS error listed at the beginning of this post, but I’m thinking it’s related to the Access Denied error on the disk.

Welcome to the forum.

I will not be able to directly solve this issue for you, but since (I’m making an assumption) the pool is empty, I would try something a little backwards. I wish I knew more about self-encrypting drives; it sounds easy, but I don’t know the details behind it.

  1. Install TrueNAS CORE.
  2. Create your pools.
  3. If everything seems to be working, then upgrade to SCALE. NOTE: Do Not Upgrade the Pool Feature Set. Not yet at least, and never if you don’t need to.
  4. With some luck, your pools will remain intact.
  5. Give the system a test and hopefully you no longer have that same issue.

OR you could boot Ubuntu Live from a USB stick/DVD and wipe the drives.

Did you search the internet for “Error: [EFAULT] Failed to wipe disk sda: [Errno 5] Input/output error”? It does have quite a few hits.

Thanks for the ideas @joeschmuck. I’ve been scouring the internet for the last 2 days with this error and haven’t had much luck with other people’s solutions. The majority of the related posts around this error are actually for “Failed to wipe disk sda” but with a different error other than “Input/output error”. I’ve come across some that are resource-related. The SED angle was the one that most closely resembled my issue, but after researching more about TrueNAS and SED, I saw that TrueNAS supposedly works with SED out of the box now.

I will go ahead and try TrueNAS Core and upgrade to TrueNAS Scale to see if that works. Thanks for the idea.

I am not an expert on this, however there is a sedutil-cli utility (which comes pre-installed with TN SCALE) that allows you to configure Self-Encrypting Drives.

According to this page all you need to do (LoL - for 180 drives this is non-trivial) is:

  1. Access the physical drive and note down the PSID which is written on the drive label. Since every drive will have a different PSID, you will also need to note the drive serial number.

  2. Run suitable commands to list out the device name, serial number and SED status for each drive. Possible ways to achieve this (guesses on my part - see also the sketch at the end of this post) are:

    • sudo sedutil-cli -vn --scan
    • lsblk -T -o NAME,TYPE,SIZE,SERIAL
  3. Run the following command for each drive you need to reset:

    sudo sedutil-cli --yesIreallywanttoERASEALLmydatausingthePSID PSID /dev/DRIVE
    

As I say, I am not an expert and don’t have SED drives myself, so this is simply desk research.
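
If it helps with scripting this across 180 drives, here is an untested sketch (same caveat - desk research only) that lists each whole disk with its serial number and queries its SED status:

    # Untested sketch: iterate over all whole disks, print the serial, then query SED/locking state.
    # sedutil-cli and lsblk both ship with SCALE; non-SED devices will simply report an error.
    for dev in $(lsblk -dno PATH); do
        echo "== $dev =="
        lsblk -dno NAME,SERIAL "$dev"
        sudo sedutil-cli --query "$dev"
    done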

I find that stepping outside the normal box can sometimes resolve the problem at hand. It may not fully identify the problem, however if you wanted to, you could file a bug report before you destroy the SCALE install on your system. If it is too late, no problem. If it returns, definitely file a bug report.

And I like what @Protopia has sed above, Ha Ha, sed - said. I crack myself up.

Good luck, and please post what fixes the issue; I’d like to note that in a document.

I know you didn’t ask for this, but a quick review of your pool definition…

  1. Data vDev definition looks good. Your vDevs are a reasonable width and you are using RAIDZ2.

  2. L2ARC definition looks OK, providing that your server has at least 64GB of memory and ideally at least 128GB of memory in order to support a ~3TB L2ARC.

  3. SLOG - I am not sure what your use case is for this server, however you should confirm that you need synchronous writes because regardless of SLOG, synchronous writes have a larger overhead than asynchronous writes. If you do need synchronous writes, then I assume that you are prepared to lose some writes in the event that the unmirrored SLOG fails.

P.S. I would not destroy your SCALE install and install CORE unless you cannot find any way under SCALE to resolve this issue (or unless someone more expert than I can give hard evidence as to why CORE is better for your large-sized use case than SCALE).

GROAN!!!

@joeschmuck Are you actually a dad or are you simply impersonating one? (I crack myself up too.)

I’m always open to suggestions and thoughts from others. Thanks for throwing this out there. The use-case for this TrueNAS implementation is to house our 600+ TB video archive that’s currently hosted off-site.

  1. I took the vDev and RAIDZ2 configuration from LTT and it made sense for our use case.
  2. The system has 256 GB of DDR4-3200 ECC RAM. We have plans to expand that further.
  3. I’m waiting on more SSDs to come in, in order to put the SLOG on mirrored drives. You’re not wrong here - but it wasn’t our primary concern to address right away.

Luckily, when it comes to tearing things down, I don’t have a quick trigger finger, so I have not destroyed the SCALE installation, and I have not installed the CORE installation.

Well, I welcome any solution, but you’re not wrong about this being quite the undertaking. It’s even more so when I tell you that 180 drives is only 1/3 of the drives we’re planning for implementation. We have 6 more JBODs with the same specs, but they’ll be installed at a different site once we work out the “kinks” with this install / configuration.

Before you expand this, check your ARC stats to see if you need more. (Just a hunch, but I suspect that you will get a pretty decent ARC hit rate with 256GB and might not need an upgrade.)
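
On SCALE you can get a quick read of this with something like:

    # Summarised ARC stats (size, hit ratio, etc.)
    arc_summary
    # Or the raw counters:
    grep -E "^(hits|misses|size|c_max)" /proc/spl/kstat/zfs/arcstats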

Yes - but do you even need an SLOG, or would asynchronous writes be better (forcing them if necessary)?
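
If it turns out you can live with async writes, it is just a dataset property, e.g. (pool/dataset name is a placeholder):

    # Check the current setting, then force asynchronous writes for that dataset
    zfs get sync tank/videos
    zfs set sync=disabled tank/videos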

40 years as an IT Professional / Project Manager / Programme Manager gives me a good feel for the realities of these sorts of things.