hey, back with another issue lol…so ive been playing with EE since beta came out and finally was able to upgrade and extend my main pool…i had issues at first cause the extend just stopped at 25% and said i needed to scrub/resilver or something but i wasnt too sure what i was supposed to do…so i let it go for a few days seeing if anything would happen on its own lol.
after nothing happened i decided to scrub the whole pool to see if that was the command it was waiting for, and it seemed like it was…had to wait like another day for the scrub, and then another day or two for the extend to complete…
so this was all in all a very long and boring process for me haha…but now my problem is i dont see the added storage from the extra drive i added to the pool, its still 7tb out of a 4xwide pool with 4tb drives in raidz1
so i just let it be after that, rebooted a couple times over a few days to see if it was just being buggy, but its still only showing the 7tb max size. and now i obviously dont have the “assign disk” option anymore cause it apparently added the drive already…so im stuck unless theres something you guys know to fix this or if im out of luck until a fix comes out, idk lol…
To quote Sir Humphrey, it seems extremely “courageous” to try the extremely new functionality in EE, especially since you appear not to have implemented standard data protection recommendations like regular scrubs etc.
If a reboot doesn’t make the extra storage appear, I would suggest that your only solution would be to move the data elsewhere, destroy the pool and recreate it, and then move the data back again.
(ZFS has all sorts of self correction mechanisms to ensure that it stays uncorrupted, but that does mean that if it ever does become corrupted then there is no fsck utility to attempt to fix it.)
TBH, you should probably be thanking your lucky stars that the pool is still online and not so corrupted that it won’t mount and your data is gone.
…the scrubs used to be set weekly, but i just made them monthly…i also have snapshots
but damn, yea the stuff i had in this pool isnt really important at all, and i did take precautions and backed up the most important data to a drive on my client pc and to a different pool that i havent tried to upgrade or anything…
so i did a good job being somewhat careful before going ahead with the upgrade lol…
but alright im probably just going to kill the pool then and create a new one(glad cause i want a shorter pool name lol) and backup the data…
Well that will teach me not to be a clever-dick and make assumptions. So apologies are in order - sorry!
It sounds like you took all reasonable precautions knowing it was a bit risky, and if it had worked it would have been a good decision, but unfortunately it didn’t pan out.
I’d like to see if the output of zpool status TrueNas_Main_Storage shows the 5th drive. This could simply be a bug in the GUI that is yet to be squashed. Possibly even an unknown bug!
So @techdan91 if you have not destroyed the pool, please, (can I beg?), give us the output of zpool status in code tags.
alright so my systems probably buggin out on what to do haha, ive done too much to it and hope i can get it stable…but the command you asked me to run showed me some interesting info…
so i did in fact try export/disconnecting the pool the other day, but it doesnt seem to have deleted it or anything (but i do get those popup notifications saying there was an error in the path for etc/TrueNas blah blah blah)…
so i tried your command and it gave me info showing all 4 disks are there(before the expansion there were only 3 obviously, issue is not the disks showing but the capacity its showing is a disk too low in tb), but it also shows that the expansion is still in progress and will take another 7 days!!! lmfao…idk but thats what it reads like to me…so maybe just let it sit for another week and see if it “finishes” and then see if the capacity pops up to ~10tb?..thanks for the advice on the command to check the pool!
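For what it's worth, here is a rough sketch of the capacity math for this pool. It assumes 4 TB (decimal) drives in a single raidz1 vdev and ignores ZFS metadata, padding and reserved space, so treat the numbers as ballpark only. The function name is made up for illustration.

```python
def raidz_usable_tib(disks: int, disk_tb: float, parity: int = 1) -> float:
    """Rough usable capacity of one RAID-Z vdev, in TiB.

    Assumption: every disk is the same size, given in decimal TB,
    and ZFS overhead (metadata, padding, reserved space) is ignored.
    """
    if disks <= parity:
        raise ValueError("need more disks than parity devices")
    data_bytes = (disks - parity) * disk_tb * 1e12  # decimal TB -> bytes
    return data_bytes / 2**40                       # bytes -> binary TiB


before = raidz_usable_tib(3, 4.0)  # 3-wide raidz1, before the expansion
after = raidz_usable_tib(4, 4.0)   # 4-wide raidz1, after the expansion
print(f"3-wide: {before:.2f} TiB, 4-wide: {after:.2f} TiB")
# prints "3-wide: 7.28 TiB, 4-wide: 10.91 TiB"
```

So the “7tb” figure matches a 3-wide raidz1 reported in TiB, and a 4-wide one should land a bit under 11 TiB. As far as I understand, data written before the expansion keeps its old data-to-parity ratio, so the capacity the GUI reports afterwards may come out somewhat lower than this estimate.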
Sorry, I got confused. Thought it was a 4 disk pool expanding to 5… Oh well, glad of the correction and complete information.
The original indicated this type of disk; (3) 4tb WD blue hdd (1) WD Red (Raidz1) (Misc Storage/SMB)
The 4TB WD Blue HDDs are likely SMR, so SLOW is the EXPECTED behavior. Taking 7 more days sounds about right.
One comment about OpenZFS RAID-Zx vDev expansion. If I understand it correctly, it does not verify the parity or checksums of the data it reads off disk while moving it. This is to improve the speed. Thus, a ZFS Scrub will be performed afterwards to catch any pre-existing data faults. So count on another week or 2 of work to be done.
Actually, depending on the exact model WD Reds can also be SMR (some models are CMR, some SMR). WD explicitly states in the WD Red specification on their web site that WD Red are unsuitable for ZFS and that you need WD Red Plus or Pro for ZFS.
Hmm, I wanted to see it for myself so I went to Western Digital’s web site and tried to pull up WD Reds. Got 404 Page Not Found. The WD Red Plus and Pro pages were fine; only the plain Reds were missing.
Seems they have a web page problem, which I won’t bother reporting. Or perhaps it’s a prelude to removing the Reds because they still get (costly) returns because of SMR.
yaaaay!!! a month after starting to expand the pool, it finally finished today!..my 4th 4tb hdd is finally attached to my pool and i have almost 3tb more of free space
…idk why it took so long, but over the past week or two i had to keep clearing the errors in the shell for the pool cause it would frequently pause and require a resilver or clear to continue, was very annoying…
Even the errors can be tracked back to SMR drives, which under extreme write loads can have a long delay in any response. Thus, ZFS may think the drive is failing.
This effect is also caused by desktop drives, like the WD Blue line, where the drive has either of these 2 features enabled and not tuned for NAS work:
Aggressive head parking, like 5 second intervals. Then taking too much time to un-park causing ZFS to think it needs to retry the prior command. Thus, error.
TLER, Time Limited Error Recovery, (Seagate has another name for it, but same concept). Without a short TLER limit, when a block is found bad the drive keeps retrying the read and applying ECC to it for a long time, (like 1 minute by default on most desktop drives). Since NASes usually have redundancy at a higher level, a NAS, (like with ZFS), may declare an error on the drive while it waits. Setting TLER to 7 seconds lets the drive give up sooner, so ZFS can apply the pool level redundancy and move on quicker. Still an error but more “normal” for ZFS.
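If you want to check or change the TLER setting, smartmontools exposes it as SCT Error Recovery Control via `smartctl -l scterc`. Below is a minimal sketch that only builds the command line; it does not run anything. It assumes smartmontools is installed and that the drive supports SCT ERC at all (many desktop drives do not, or forget the setting after a power cycle). The helper name and the `/dev/sdX` device path are placeholders, not anything from this thread.

```python
import shlex
from typing import List, Optional


def scterc_command(device: str, seconds: Optional[float] = None) -> List[str]:
    """Build a smartctl command to show or set SCT ERC (TLER) timeouts.

    With seconds=None the command only queries the current setting.
    Otherwise the same limit is used for both reads and writes.
    smartctl takes the values in tenths of a second, so 7 s -> 70.
    """
    if seconds is None:
        option = "scterc"
    else:
        deciseconds = round(seconds * 10)
        option = f"scterc,{deciseconds},{deciseconds}"
    return ["smartctl", "-l", option, device]


# /dev/sdX is a placeholder; substitute the real device before running.
print(shlex.join(scterc_command("/dev/sdX")))     # query current setting
print(shlex.join(scterc_command("/dev/sdX", 7)))  # set 7 s read/write limit
```

Run the printed commands as root in the shell. If the drive reports that SCT Error Recovery Control is not supported, there is nothing to tune and the long retry delays stay as they are.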