Hi everyone. I have a similar issue to the topic starter's in this thread: 25 disks unassigned (i.e. all disks from the EMC disk enclosure), and lsblk reports those disks twice. My HBA is an LSI SAS2008 with the most recent firmware (P20). This is probably wrong wiring in my case, but the issue in its current state appeared just last week, after the upgrade to the recent Scale beta. Before that I had a milder version: only 1 disk reported as unassigned, usually a spare.
root@NAS[~]# sas2flash -list
LSI Corporation SAS2 Flash Utility
Version 20.00.00.00 (2014.09.18)
Copyright (c) 2008-2014 LSI Corporation. All rights reserved
Adapter Selected is a LSI SAS: SAS2008(B2)
Controller Number : 0
Controller : SAS2008(B2)
PCI Address : 00:1f:00:00
SAS Address : 500605b-0-055d-6ec0
NVDATA Version (Default) : 14.01.00.07
NVDATA Version (Persistent) : 14.01.00.07
Firmware Product ID : 0x2213 (IT)
Firmware Version : 20.00.07.00
NVDATA Vendor : LSI
NVDATA Product ID : SAS9200-8e
BIOS Version : 07.39.02.00
UEFI BSD Version : N/A
FCODE Version : N/A
Board Name : SAS9200-8e
Board Assembly : N/A
Board Tracer Number : N/A
Finished Processing Commands Successfully.
Exiting SAS2Flash.
As for the wiring, my EMC VNX 5300 25-slot disk enclosure has 2 controllers, and the primary port of each is connected to an HBA port (so there are 2 cables connecting the enclosure and the HBA).
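For what it's worth, a quick way to confirm that the doubled lsblk entries are really one physical disk seen over two paths is to compare serial numbers. This is just a sketch and your device names will differ:

root@NAS[~]# lsblk -dn -o NAME,SERIAL,SIZE,MODEL
root@NAS[~]# lsblk -dn -o NAME,SERIAL | sort -k2 | uniq -D -f1   # prints only disks whose serial appears more than once

If two /dev/sdX nodes share a serial, they are the same disk reported once per path.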
Sounds a bit like multipathing going on. I'm not familiar with this in SCALE, but I am in CORE. Is there even a multipath tab these days in SCALE, under Storage or Disks?
iX moved away from multipathing a few years ago, so I would personally be tempted to wire the JBOD to just one of the controllers. I'm not sure if any on-disk metadata is stamped on the drives when multipathed, but this can happen in CORE, so there is a chance you will still get funnies even after that change.
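If you want to check from the shell whether multipath has actually claimed the disks, something like this works on systems that ship multipath-tools (a sketch; I don't know offhand whether current SCALE still includes the utility):

root@NAS[~]# multipath -ll   # lists each multipath device along with its component paths

No output at all usually means no multipath devices are configured.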
The pool was created in Scale. Cobia, if I am not mistaken.
Btw, doesn't Scale use multipathing at all?! I always considered it a way to improve both reliability and performance…
In a nutshell, they took the stance that the potential consequences of misconfiguration outweighed the small benefits. Wide porting allows you to achieve very similar results with essentially zero potential negatives.
I personally used multipath a lot for about 5-6 years and it was great, but I completely understand where they were coming from with this, and as a result I no longer use it.
Well, it sounds reasonable. The only issue is that the pool was created in Scale, which as far as I understand never had multipathing. Am I right that wide porting also involves running multiple cables to an array? If so, then it has to be active at the moment. If not, I have no idea of the actual topology anymore… How to check?
Again, I can't speak for SCALE as my personal knowledge is limited, but as to wide porting: this is essentially two cables from the HBA to two ports on the same controller. Often ports A & B, but this is dependent on HW.
In CORE, just the act of cabling to two controllers on the JBOD would auto-apply multipath whether you liked it or not. Not sure if the same still happens in SCALE.
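As for how to check the actual topology from the OS side, a couple of hedged options (lsscsi may need installing, and sas2ircu is the LSI companion utility to the sas2flash tool shown above — I'm assuming you have it since the card is flashed to IT mode):

root@NAS[~]# lsscsi -t           # shows the SAS address each disk/enclosure is reached through
root@NAS[~]# sas2ircu 0 display  # dumps the HBA's view of attached devices and enclosures

A disk that shows up twice in lsscsi with two different target addresses is being reached over two paths.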
I have just rebooted again to see if something would change. Now the pool is suspended due to tons of read errors on multiple disks, which I did not have before:
Ok, so nothing to lose at this point. I'd be tempted to unplug the SAS cable to the secondary controller and see what happens. If you can gracefully power down first, even better.
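Once it's back up, the standard ZFS checks apply (a sketch; 'tank' below is a placeholder for your pool name):

root@NAS[~]# zpool status -v   # per-vdev read/write/checksum error counters and suspended state
root@NAS[~]# zpool clear tank  # resets the error counters

I'd only clear the errors after you're reasonably sure the cabling was the cause, then run a scrub to confirm.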
While it is shutting down, can you please clarify a bit? In the other thread I have seen that for wide porting both HBA ports should be connected to the same JBOD. Does that mean to the same DAE controller (its primary and extension ports) or to different controllers (their primary ports)?
So to wide port, you'd use a single HBA and run two SAS cables from the HBA to one controller on the JBOD. You could try re-wiring that way now, but for the moment I'd be tempted to keep things simple and just run one cable from HBA to JBOD. As @joeschmuck said, this could also be a cable issue, so let's try things slowly.
I did upgrade the ZFS version yesterday, but the pool was working fine after that. As for the rest, I did not touch anything.
I have replaced one failing disk a few months ago and had a recurring issue with 1 unassigned disk.
So I shut the thing down, detached one of the cables, checked the other (unsurprisingly, found no issue with it), and booted up. No issues! Thank you, guys!
In theory, yes, but call me old-fashioned: I always prefer shutting down.
I would first ask yourself: do you NEED the second connection? It looks like you're using SAS2, so each cable can provide 24 Gb/s (4 lanes at 6 Gb/s each). Once you've set up your zpool, it may not even be capable of pushing 24 Gb/s, and if not, there is little benefit in attaching the second cable. But I'll leave that for you to decide.
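If you'd rather verify than take the 24 Gb/s figure on faith, the Linux kernel exposes the negotiated per-lane rate in sysfs (a sketch; this path exists on SCALE/Linux, not on CORE):

root@NAS[~]# grep . /sys/class/sas_phy/*/negotiated_linkrate   # expect '6.0 Gbit' per phy on a SAS2 link

Then compare 4 lanes x 6 Gb/s against what your vdevs can actually stream.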
Sigh… You are probably right, especially given that I have 10K rpm disks. But without checking I would not know if it could make things a bit better )). I'll come back to it when I add another DAE.