How Screwed am I? (pool offline, please help)

My array has been having issues for some time (over a year) now, with drives becoming unavailable and throwing errors, but the drives all test fine when I check them in Windows, so I've been re-adding them to the array.

Current system is:

  • 2 vdevs of 10x WD Red Pro 20TB in RAID-Z2
  • LSI 9305-24i HBA
  • AMD Epyc 7443P
  • ASRock ROMED8-2T motherboard
  • 512GB ECC RAM
  • Silverstone RM43-320-RS case

I've tried another 9305 HBA, new SAS cables, different cases (I switched from a Norco 4224), and re-seating the CPU, but this problem keeps persisting.

Hell broke loose yesterday.

The pool went offline, and restarts haven't restored it.

As you can see from the image, 2 drives are offline and 1 was in the process of being replaced. 8 out of 10 drives should still be enough to keep a RAID-Z2 vdev running, but when I go to import the pool the GUI doesn't give me the option, so I tried using

zpool import

in the shell, but I get the following error:

How screwed am I? What can I do to recover my data? (of course no backups)

Why did you wait that long before replacing a disk, let alone investigating further?

You can override it with the -f flag. But first, do you have a working backup? Never mind. Just saw that. :flushed:

Why do people wait until the worst possible scenario to happen before doing proper backups…

And for 1 year having issues, well this may be a tough lesson learned for you…

/end grumpy old IT guy rant :smiley:

No backups, no problems, what could possibly go wrong :slight_smile:

As soon as you saw a disk drop out and become unavailable you should have investigated exactly what the root cause is. These things don’t happen by magic, and leaving it (for a year???) will greatly increase your chances of having the exact issue you’re having now.

What you might want to do is import the pool as read-only with `zpool import -f -o readonly=on -R /mnt` and pray that you can access some recoverable data.
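Spelled out with a placeholder pool name (`tank` here; substitute whatever name `zpool import` actually lists for your pool):

```shell
# List pools visible for import; this makes no changes,
# it just shows pool names, states, and missing devices
zpool import

# Force-import the pool read-only, with an alternate root (-R)
# so all datasets mount under /mnt instead of their recorded
# mountpoints. Read-only avoids writing to a damaged pool.
zpool import -f -o readonly=on -R /mnt tank
```

The `-f` is needed because the pool was last in use by a system that never cleanly exported it; `readonly=on` keeps ZFS from replaying or writing anything while you copy data off.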


It sounds to me like the OP has valiantly tried to figure out what the issue is over the last year.

It also seems like the pool should import successfully if forced.


I definitely don't want to invalidate his troubleshooting; it's all perfectly reasonable to do if you believe the disks are sound. I'm just a little shocked the disks weren't replaced for over a year after multiple issues. I'd have been freaking out from day one!

Taking another look at the screenshot, yeah, it does look like it should import if forced, as there are only two disk failures, so the data should be intact. Genuinely hoping there's a recovery out of this, and hopefully a lesson learned (re: backups).


Over the course of the year I have replaced every single component in the system: I've tried 2 different HBAs, new SAS cables, a new case (backplanes), a fresh TrueNAS install, and a new CPU platform. I had the same issue with 8TB drives, so I upgraded to 20TB drives. I have run out of things to replace.

Yes, I know I should have had backups, but I hadn't got to the root of the problem, and once I had, I was going to use all the "old" parts to build a backup system.
It is hard to back up when you have so much data.

One is none and two is one.

@essinghigh I’ll give that a go


How are you cooling the HBA?

(Btw I have a Norco 4224 :wink: )

Yes, I have an 80mm Noctua fan pointed directly at it.

I had also been using a Norco 4224, but in my troubleshooting I heard of some people having issues with the backplane, so I bought a Silverstone RM43-320 to try to fix the issue, but it didn't.

Fair enough. The backplane issues were gen1, IIRC. It's ancient enough history that it's hard to look up, but I bought mine circa 2016 and it didn't have the issue.

Anyway, maybe when you've solved this issue, start another thread to try to get to the bottom of it once and for all.

There's no standout reason you should be seeing issues, AFAICT.

I issued the command as you stated.
I can see the datasets, but I can't see the pool on the Storage page.

When I go to create a share to get the data off, I get the following error:

[EINVAL] sharingsmb_create.path_local: The path must reside within a pool mount point

Error: Traceback (most recent call last):
File “/usr/lib/python3/dist-packages/middlewared/”, line 198, in call_method
result = await self.middleware.call_with_audit(message[‘method’], serviceobj, methodobj, params, self)
File “/usr/lib/python3/dist-packages/middlewared/”, line 1466, in call_with_audit
result = await self._call(method, serviceobj, methodobj, params, app=app,
File “/usr/lib/python3/dist-packages/middlewared/”, line 1417, in _call
return await methodobj(*prepared_call.args)
File “/usr/lib/python3/dist-packages/middlewared/service/”, line 179, in create
return await self.middleware._call(
File “/usr/lib/python3/dist-packages/middlewared/”, line 1417, in _call
return await methodobj(*prepared_call.args)
File “/usr/lib/python3/dist-packages/middlewared/service/”, line 210, in nf
rv = await func(*args, **kwargs)
File “/usr/lib/python3/dist-packages/middlewared/schema/”, line 47, in nf
res = await f(*args, **kwargs)
File “/usr/lib/python3/dist-packages/middlewared/schema/”, line 187, in nf
return await func(*args, **kwargs)
File “/usr/lib/python3/dist-packages/middlewared/plugins/”, line 1022, in do_create
File “/usr/lib/python3/dist-packages/middlewared/”, line 70, in check
raise self
middlewared.service_exception.ValidationErrors: [EINVAL] sharingsmb_create.path_local: The path must reside within a pool mount point

Only 8TB of the 120TB used is what I need to get off, as it can't be replaced.

Did you include the -R /mnt portion in your import command?

Once successfully imported, export it again. Then try importing via the UI.
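Assuming the read-only import worked, the hand-off to the GUI would look roughly like this (pool name `tank` is a placeholder; use your actual pool name):

```shell
# Sanity-check that the datasets actually mounted under the
# alternate root (/mnt) before doing anything else
zfs get -r mounted,mountpoint tank

# Cleanly export the pool so the TrueNAS middleware can
# take ownership when you re-import through the UI
zpool export tank
```

The SMB error about "path must reside within a pool mount point" is typically the middleware not knowing about a pool imported manually from the shell, which is why the export/re-import via the UI is worth trying.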

Yes, I copied the command exactly. I did add the pool name at the end.

What command do I use to export the array, as it's not showing up in the GUI?

zpool export pool_name

Then import it again in the GUI

Pools → Add, then choose "import existing".


Thanks. So far it's taking a long time to import via the UI.
The GUI doesn't have an option for read-only; I really hope that doesn't bite me in the butt.

Are you using TrueCharts?

I was, but I have been using host path as Tom Lawrence suggested.

The monitor attached to the server: