Pool showing offline and data not available

Hi All
Please treat me as a complete and utter newb, I am not an engineer and don’t work in this area.
About two years ago I set up a server with TrueNAS to simply use as large long-term data storage for recordings of wildlife research, about 20TB in RAIDZ2 across approx. 20 2TB drives.
Yesterday I booted this up, for the first time in about 6 months, to run a scrub of the data. All good, no problems.
Today when I booted it up it is saying ‘pool offline’ and data not available.
I have not changed any drives, I have not imported or exported any data or created any new pools on the server etc.
I have seen others have had similar issues after doing something, but I can’t see or think of anything that could have been done to cause this.
My only thought is that, after it not being on for a while, there has been an update (or updates) that triggered this.
Any ideas?
Cheers Daniel

That’s wide :grimacing:
Edit: may I suggest reevaluating the pool setup and maybe going for 5x X TB RAIDZ2? Or 4x X TB RAIDZ1?
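For a rough sense of what such layouts give, here is a minimal sketch of the capacity arithmetic. The function name `raidz_usable_tb` is made up for illustration, the 12 TB drive size is only an example value for the unspecified X, and the result ignores ZFS metadata, padding and free-space headroom, so real usable space will be lower.

```shell
#!/bin/sh
# Rough usable capacity of a pool built from identical RAIDZ vdevs.
# Hypothetical helper; ignores ZFS overhead (metadata, padding, reserved space).
# usage: raidz_usable_tb VDEVS WIDTH PARITY DRIVE_TB
raidz_usable_tb() {
    vdevs=$1; width=$2; parity=$3; drive_tb=$4
    # Each vdev holds data on (width - parity) drives; the rest hold parity.
    echo $(( vdevs * (width - parity) * drive_tb ))
}

# One 20-wide RAIDZ2 vdev of 2 TB drives (roughly the layout described above)
raidz_usable_tb 1 20 2 2
# One 5-wide RAIDZ2 vdev, assuming 12 TB drives purely as an example
raidz_usable_tb 1 5 2 12
# One 4-wide RAIDZ1 vdev, same assumed drive size
raidz_usable_tb 1 4 1 12
```

The point of the comparison is that a handful of larger drives can reach a similar raw capacity with far fewer components that can fail.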

Please post your complete hardware (especially what drives, how are they connected and the power supply).

Also post the output of zpool status

Hi Chuck
I have an Isilon 36-drive server with 2TB drives (Hitachi).
Power supply??? Server is plugged into the mains?

It is currently showing offline and data not available, so I can’t even see the zpool?

However, I have just seen this come up in alerts -

WARNING

New ZFS version or feature flags are available for pool(s) AEWC DATA. Upgrading pools is a one-time process that can prevent rolling the system back to an earlier TrueNAS version. It is recommended to read the TrueNAS release notes and confirm you need the new ZFS feature flags before upgrading a pool.

What type of controller card are you using for the 20 drives?

The other part of the cable that is not connected to mains is connected to the power supply.

Type that command in the shell / via a SSH session.

Hi
There isn’t, as far as I understand, anything other than a mains power supply. The server is connected via a C19 power cable.

This is what the command came up with, but seems to be only the boot pool?

root@truenas[~]# zpool status
pool: boot-pool
state: ONLINE
scan: scrub repaired 0B in 00:00:29 with 0 errors on Fri May 3 03:45:29 2024
config:

    NAME        STATE     READ WRITE CKSUM
    boot-pool   ONLINE       0     0     0
      ada0p2    ONLINE       0     0     0

errors: No known data errors
root@truenas[~]#
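Since only boot-pool is listed, the data pool does not appear to be imported at the moment. As a sketch of a possible next step: `zpool import` with no pool name only scans the disks and lists pools that could be imported; it does not import or change anything. The wrapper name `show_pools` is just for illustration.

```shell
#!/bin/sh
# Hypothetical wrapper; run as root in the TrueNAS shell or over SSH.
show_pools() {
    if ! command -v zpool >/dev/null 2>&1; then
        echo "zpool not found: run this on the TrueNAS host" >&2
        return 1
    fi
    echo "== imported pools =="
    zpool list
    echo "== pools visible on disk but not imported =="
    # With no pool name, 'zpool import' only lists candidates.
    zpool import
}

show_pools || true
```

If the data pool shows up in the second list, its member disks and their states are printed too, which would help narrow down whether disks are missing.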

The power supply is the unit inside the server that you plug your power cable into, coming from the wall.

What’s the max wattage it can supply, and what type of cabling goes to the drives? There should be a sticker on it.

Also, again, how are your 20 drives connected to your motherboard?

Hi Farout
Got it! Looks like it’s 1200 watts max. It can actually have 2 plugs and 2 power supplies, but only needs 1.

Sorry, I have no idea what cabling goes to the drives or how they are connected to the motherboard. This is an Isilon IQ X series server; I plug drives in and that’s it. I’m a complete newb. I have attached a photo.

Before we all go on a wild goose chase: open up the server and check all the cabling, if nothing came loose.

Hi. OK, but that’s not that easy. This is in a server cabinet on rails and weighs a ton! It would seem odd for anything to come loose when this is in a server cabinet, hasn’t moved, worked yesterday, and is only not working now that I have a warning about an update?

I recommend that you DO NOT UPGRADE THE POOL, at the very least until your current problem is solved. And, even then, I imagine the recommendation is likely to be the same, unless it is determined that you can benefit from the new feature flags available.
There is about nil chance it is a contributor to your problem.


20 HDs cause vibrations. A cable that is not properly plugged in can come loose.

I am not familiar with the type of controller card used in this server to connect to the backplane with the drives, but I think it’s important to know the exact type and model.

Well, I would say that it is quite possible that a card has moved in a slot and needs to be reseated, or something similar. You described that you had not run the server for some time, then switched it on and ran a scrub. That’s an intensive activity that likely generated some heat and may have caused some thermal stress in one or more components, with resulting movement.
Or something could just have failed. You might try plugging the power cable into the second power supply as a test/check.


I have had a look inside and can’t see anything untoward, but you can’t see everything. Photos attached.
It looks like the drives slot directly into a number of boards.


OK, thanks. I have tried the 2nd power cable and it is still the same.

I think that you have to hope that there’s somebody on this forum familiar with this hardware, or perhaps go on reddit in pursuit of same.

BTW, you haven’t given any info on the model number that would allow search for manuals or other recognition material.

Are there any power or activity lights on any of the drive bays?

The problem I see is that you don’t know if it’s a hardware failure.
It could be the controller card for your drives, which I tried to find some info on but failed. Your serial number suggests that you have an Isilon IQ 36NL, from ca. 2010. So the server is getting a bit long in the tooth, and failing hardware would not be unheard of.

As @Redcoat wrote, your best chance is to find someone who knows this system. Maybe reach out to DELL - who owns this brand now.

If for some reason your pool suddenly comes online, I highly recommend backing up your data ASAP and, if you still need to, rebuilding a NAS with far fewer drives on newer (and simpler) hardware that you can manage yourself at your knowledge level.

I don’t think we got an answer to that, but please check whether the drives show up in the BIOS and under Storage > Disks in TrueNAS; lsblk should also show them.

I fear they don’t show up at all and hence I concur with the others and suspect a hardware failure on the controller / backplane.
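A minimal sketch of that disk check from the shell follows. One caveat: the `ada0p2` device name in the earlier `zpool status` output suggests a FreeBSD-based TrueNAS CORE install, where `lsblk` may not be available, so the sketch falls back to `camcontrol devlist`. The helper name `list_disks` is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: list the disks the operating system can see.
list_disks() {
    if command -v lsblk >/dev/null 2>&1; then
        # Linux (TrueNAS SCALE): whole disks only, with size, model and serial
        lsblk -d -o NAME,SIZE,MODEL,SERIAL
    elif command -v camcontrol >/dev/null 2>&1; then
        # FreeBSD (TrueNAS CORE): devices attached through the CAM layer
        camcontrol devlist
    else
        echo "neither lsblk nor camcontrol found" >&2
        return 1
    fi
}

list_disks || true
```

Counting the lines and comparing against the number of populated bays shows quickly whether the controller or backplane is presenting every drive.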

:point_up:

Thanks all for the advice.

When I start it up all the drives flash and light up and seem OK. If I go to Storage > Disks in TrueNAS, all 24 external disks are shown.

I haven’t tried doing anything else, such as SMART tests of disks etc.
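As a sketch of a quick, non-destructive SMART check: `smartctl --scan` lists the devices smartmontools can find and `smartctl -H` prints the overall health verdict for one device. The helper name `smart_health_all` is made up for illustration, and some controllers may need an explicit device type option that this sketch does not pass.

```shell
#!/bin/sh
# Hypothetical helper: print the SMART health verdict for every disk found.
smart_health_all() {
    if ! command -v smartctl >/dev/null 2>&1; then
        echo "smartctl not found" >&2
        return 1
    fi
    # 'smartctl --scan' prints one device per line, device path first.
    smartctl --scan | while read -r dev _rest; do
        echo "== $dev =="
        # Read-only query; stdin is redirected so the loop input is not consumed.
        smartctl -H "$dev" </dev/null
    done
}

smart_health_all || true
```

A health verdict alone is coarse; a longer self-test can be started per disk with `smartctl -t long` followed by the device path, but that takes hours on drives of this size.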

I understood that these servers had pretty much dual everything, so any one failure means it should still work: power supply, hard drive etc. I also have a second one of these servers, exactly the same. So in theory, if this one has failed, the drives could be moved to the second one to get the data?
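On the question of moving the drives: ZFS identifies pool members by labels written on the disks rather than by slot position, so in principle a pool can be imported on another machine that can see all the disks. Below is a minimal sketch of a cautious read-only import. The helper name `import_readonly` is made up for illustration, the pool name is taken from the alert quoted earlier, and on TrueNAS the web UI import is normally the preferred route, so treat this as a diagnostic sketch only.

```shell
#!/bin/sh
# Hypothetical helper: try a read-only import so nothing is written to the disks.
# usage: import_readonly "POOL NAME"
import_readonly() {
    pool=$1
    if [ -z "$pool" ]; then
        echo "usage: import_readonly POOL" >&2
        return 2
    fi
    if ! command -v zpool >/dev/null 2>&1; then
        echo "zpool not found: run this on the TrueNAS host" >&2
        return 1
    fi
    # readonly=on: import without writing to the pool
    # -R /mnt: mount datasets under /mnt (assumed here to match where TrueNAS keeps pools)
    zpool import -o readonly=on -R /mnt "$pool" && zpool status "$pool"
}

# Pool name as it appears in the alert quoted earlier in the thread.
# Left commented out so the snippet changes nothing unless run on purpose.
# import_readonly "AEWC DATA"
```

If the import succeeds read-only, the data can be copied off before anything else is attempted.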

Hi Farout
Thanks for the advice, I believe this is an Isilon IQ 72000x.
The issue is the high storage space I need (around 50TB) and cost etc. This is an older system, but higher end; it came out of the genome project servers and was sold as a more failsafe system. It might be complex hardware, but I was only using it as a simple hard drive.