Dragonfish-24.04.2.5 > ElectricEel-24.10.0.2 upgrade zpool issues

I tried to find some mention of this issue before posting, but I have not been able to find anything. I may not be searching with the correct terminology.

My issue is that several of my zpools are either not showing up or are not accessible after upgrading to ElectricEel-24.10.0.2. The zpools are back once I revert to my previous boot environment.

I was able to upgrade my second server without any issue. Granted, the ZFS topology for that second server is different.

Here is my zpool status output for Dragonfish-24.04.2.5, prior to upgrading:

# zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 03:58:36 with 0 errors on Sun Nov  3 02:58:42 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	data                                      ONLINE       0     0     0
	  raidz2-0                                ONLINE       0     0     0
	    db871200-84a8-44de-bd88-1569a611649c  ONLINE       0     0     0
	    5dfbc200-046e-44f2-92eb-53304e9a569f  ONLINE       0     0     0
	    83d1582e-8dc2-432b-8f9c-6f517a204b8a  ONLINE       0     0     0
	    6e9300cf-d457-45b1-a316-ec4acb795ad6  ONLINE       0     0     0

errors: No known data errors

  pool: docker_data
 state: ONLINE
  scan: scrub repaired 0B in 00:04:33 with 0 errors on Sun Dec  1 00:04:35 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	docker_data                               ONLINE       0     0     0
	  raidz1-0                                ONLINE       0     0     0
	    e3a27bda-a9c4-4e1e-9b09-4539b400c24e  ONLINE       0     0     0
	    a910de4a-6436-4ff8-9e8d-09cddfcc938d  ONLINE       0     0     0
	    01aa132f-1cee-48b7-bf35-c165fc6812b2  ONLINE       0     0     0

errors: No known data errors

  pool: download
 state: ONLINE
  scan: scrub repaired 0B in 00:00:33 with 0 errors on Sun Dec  1 00:00:36 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	download                                  ONLINE       0     0     0
	  raidz1-0                                ONLINE       0     0     0
	    c7fdda47-2258-4011-883b-d605b738d5d6  ONLINE       0     0     0
	    ffdccea3-7a8a-4673-807c-12819d9f0fc4  ONLINE       0     0     0
	    1816451d-b2a4-49c9-8be1-723744010e8c  ONLINE       0     0     0

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
status: One or more features are enabled on the pool despite not being
	requested by the 'compatibility' property.
action: Consider setting 'compatibility' to an appropriate value, or
	adding needed features to the relevant file in
	/etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d.
  scan: scrub repaired 0B in 00:00:53 with 0 errors on Sun Dec  1 03:45:55 2024
config:

	NAME                                         STATE     READ WRITE CKSUM
	freenas-boot                                 ONLINE       0     0     0
	  mirror-0                                   ONLINE       0     0     0
	    ata-SATA_SSD_67F40763162400186094-part2  ONLINE       0     0     0
	    ata-SATA_SSD_AF34075A182400165065-part2  ONLINE       0     0     0

errors: No known data errors

  pool: media-01
 state: ONLINE
  scan: scrub repaired 0B in 13:47:59 with 0 errors on Sun Nov  3 12:48:03 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	media-01                                  ONLINE       0     0     0
	  raidz1-0                                ONLINE       0     0     0
	    b2003187-41b2-4350-b3a2-baac7c09da67  ONLINE       0     0     0
	    77093694-7838-4de6-942c-7b6619413440  ONLINE       0     0     0
	    7ce9120d-a90b-4cf9-9981-67d899da5ea2  ONLINE       0     0     0
	  raidz1-1                                ONLINE       0     0     0
	    0507ee9f-2b57-4429-9f74-7cd98ac3d070  ONLINE       0     0     0
	    04574f3f-8c2a-4eea-a098-b16bf7fec0e5  ONLINE       0     0     0
	    0e2c81f0-9af2-42ff-ab72-a124d06e39f5  ONLINE       0     0     0

errors: No known data errors

  pool: sec_vids
 state: ONLINE
  scan: scrub repaired 0B in 00:19:10 with 0 errors on Sun Dec  1 00:19:13 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	sec_vids                                  ONLINE       0     0     0
	  mirror-0                                ONLINE       0     0     0
	    65b22c9b-7552-47a8-b29c-c28942a472f0  ONLINE       0     0     0
	    85d9ea89-680d-4629-9da7-dd4cdeab824c  ONLINE       0     0     0

errors: No known data errors

Here is my zpool status output for ElectricEel-24.10.0.2, after upgrading:

# zpool status
  pool: data
 state: ONLINE
  scan: scrub repaired 0B in 03:58:36 with 0 errors on Sun Nov  3 02:58:42 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	data                                      ONLINE       0     0     0
	  raidz2-0                                ONLINE       0     0     0
	    db871200-84a8-44de-bd88-1569a611649c  ONLINE       0     0     0
	    5dfbc200-046e-44f2-92eb-53304e9a569f  ONLINE       0     0     0
	    83d1582e-8dc2-432b-8f9c-6f517a204b8a  ONLINE       0     0     0
	    6e9300cf-d457-45b1-a316-ec4acb795ad6  ONLINE       0     0     0

errors: No known data errors

  pool: docker_data
 state: DEGRADED
status: One or more devices could not be used because the label is missing or
	invalid.  Sufficient replicas exist for the pool to continue
	functioning in a degraded state.
action: Replace the device using 'zpool replace'.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-4J
  scan: scrub repaired 0B in 00:04:33 with 0 errors on Sun Dec  1 00:04:35 2024
config:

	NAME                                      STATE     READ WRITE CKSUM
	docker_data                               DEGRADED     0     0     0
	  raidz1-0                                DEGRADED     0     0     0
	    e3a27bda-a9c4-4e1e-9b09-4539b400c24e  ONLINE       0     0     0
	    a910de4a-6436-4ff8-9e8d-09cddfcc938d  ONLINE       0     0     0
	    14829080919323958346                  UNAVAIL      0     0     0  was /dev/disk/by-partuuid/01aa132f-1cee-48b7-bf35-c165fc6812b2

errors: No known data errors

  pool: freenas-boot
 state: ONLINE
status: One or more features are enabled on the pool despite not being
	requested by the 'compatibility' property.
action: Consider setting 'compatibility' to an appropriate value, or
	adding needed features to the relevant file in
	/etc/zfs/compatibility.d or /usr/share/zfs/compatibility.d.
  scan: scrub repaired 0B in 00:00:53 with 0 errors on Sun Dec  1 03:45:55 2024
config:

	NAME                                         STATE     READ WRITE CKSUM
	freenas-boot                                 ONLINE       0     0     0
	  mirror-0                                   ONLINE       0     0     0
	    ata-SATA_SSD_67F40763162400186094-part2  ONLINE       0     0     0
	    ata-SATA_SSD_AF34075A182400165065-part2  ONLINE       0     0     0

errors: No known data errors

Any help will be greatly appreciated.

You probably want to start by checking if the disks actually show up.

  • Full hardware details please
  • Output from lsblk on both Dragonfish and ElectricEel
  • How are the missing pool disks wired to the server?
  • How are the working pool disks wired to the server?

Certain connection methods, like USB or Thunderbolt, are not recommended. Some people have no trouble whatsoever, while others start off fine with USB-attached data pool disks and only run into problems later. Perhaps something like that is happening here.
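
If it helps, something along these lines (run on both Dragonfish and ElectricEel) should capture most of that. The extra lsblk columns show each disk's model, serial, and transport, which makes it easier to match drives between boot environments:

# lsblk -o NAME,SIZE,MODEL,SERIAL,TRAN
# lspci | grep -i sas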

Arwen,

Thank you for reaching out. It does look like my drives are not coming online for some reason. I am using a 24-bay Supermicro chassis with a SAS backplane. I have two LSI SAS controllers for 16 of the bays and the motherboard's SATA ports for the remaining drives. The missing drives are connected to both the LSI controllers and the motherboard SATA ports. I have moved drives around with no change in outcome. I do not have any USB-connected drives. The OS is loaded on two SATA DOMs connected directly to the motherboard.
I also have four 4TB NVMe SSDs installed on two Sabrent PCIe adapters. However, those drives are still present on ElectricEel.

The LSI controllers are visible via lspci on both Dragonfish and ElectricEel.

06:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 02)
07:00.0 Serial Attached SCSI controller: Broadcom / LSI SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon] (rev 03)
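
In case it is useful to anyone else hitting this, adding -k should show which kernel driver, if any, has claimed each controller (the PCI addresses are the ones from the output above):

# lspci -k -s 06:00.0
# lspci -k -s 07:00.0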

Here is my lsblk output from each version:

Dragonfish-24.04.2.5

# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINTS
sda           8:0    0  18.2T  0 disk  
├─sda1        8:1    0     2G  0 part  
│ └─md125     9:125  0     2G  0 raid1 
│   └─md125 253:2    0     2G  0 crypt 
└─sda2        8:2    0  18.2T  0 part  
sdb           8:16   0   1.8T  0 disk  
├─sdb1        8:17   0     2G  0 part  
│ └─md126     9:126  0     2G  0 raid1 
│   └─md126 253:3    0     2G  0 crypt 
└─sdb2        8:18   0   1.8T  0 part  
sdc           8:32   0   1.8T  0 disk  
├─sdc1        8:33   0     2G  0 part  
└─sdc2        8:34   0   1.8T  0 part  
sdd           8:48   0   1.8T  0 disk  
├─sdd1        8:49   0     2G  0 part  
└─sdd2        8:50   0   1.8T  0 part  
sde           8:64   0   1.8T  0 disk  
├─sde1        8:65   0     2G  0 part  
│ └─md124     9:124  0     2G  0 raid1 
│   └─md124 253:1    0     2G  0 crypt 
└─sde2        8:66   0   1.8T  0 part  
sdf           8:80   0   1.8T  0 disk  
├─sdf1        8:81   0     2G  0 part  
│ └─md124     9:124  0     2G  0 raid1 
│   └─md124 253:1    0     2G  0 crypt 
└─sdf2        8:82   0   1.8T  0 part  
sdg           8:96   0  18.2T  0 disk  
├─sdg1        8:97   0     2G  0 part  
│ └─md122     9:122  0     2G  0 raid1 
│   └─md122 253:0    0     2G  0 crypt 
└─sdg2        8:98   0  18.2T  0 part  
sdh           8:112  0  18.2T  0 disk  
├─sdh1        8:113  0     2G  0 part  
│ └─md125     9:125  0     2G  0 raid1 
│   └─md125 253:2    0     2G  0 crypt 
└─sdh2        8:114  0  18.2T  0 part  
sdi           8:128  1   7.6G  0 disk  
├─sdi1        8:129  1   800K  0 part  
└─sdi2        8:130  1    15K  0 part  
sdj           8:144  0  18.2T  0 disk  
├─sdj1        8:145  0     2G  0 part  
│ └─md122     9:122  0     2G  0 raid1 
│   └─md122 253:0    0     2G  0 crypt 
└─sdj2        8:146  0  18.2T  0 part  
sdk           8:160  0 931.5G  0 disk  
├─sdk1        8:161  0     2G  0 part  
│ └─md126     9:126  0     2G  0 raid1 
│   └─md126 253:3    0     2G  0 crypt 
└─sdk2        8:162  0 929.5G  0 part  
sdl           8:176  0  18.2T  0 disk  
├─sdl1        8:177  0     2G  0 part  
│ └─md124     9:124  0     2G  0 raid1 
│   └─md124 253:1    0     2G  0 crypt 
└─sdl2        8:178  0  18.2T  0 part  
sdm           8:192  0  14.9G  0 disk  
├─sdm1        8:193  0   260M  0 part  
└─sdm2        8:194  0  14.7G  0 part  
sdn           8:208  0  14.9G  0 disk  
├─sdn1        8:209  0   260M  0 part  
└─sdn2        8:210  0  14.7G  0 part  
sdo           8:224  0   1.8T  0 disk  
├─sdo1        8:225  0     2G  0 part  
└─sdo2        8:226  0   1.8T  0 part  
sdp           8:240  0 931.5G  0 disk  
├─sdp1        8:241  0     2G  0 part  
│ └─md126     9:126  0     2G  0 raid1 
│   └─md126 253:3    0     2G  0 crypt 
└─sdp2        8:242  0 929.5G  0 part  
sdq          65:0    0  18.2T  0 disk  
├─sdq1       65:1    0     2G  0 part  
│ └─md125     9:125  0     2G  0 raid1 
│   └─md125 253:2    0     2G  0 crypt 
└─sdq2       65:2    0  18.2T  0 part  
zd0         230:0    0    16G  0 disk  
nvme1n1     259:0    0   3.6T  0 disk  
├─nvme1n1p1 259:1    0     2G  0 part  
│ └─md127     9:127  0     2G  0 raid1 
│   └─md127 253:4    0     2G  0 crypt 
└─nvme1n1p2 259:2    0   3.6T  0 part  
nvme3n1     259:3    0   3.6T  0 disk  
├─nvme3n1p1 259:4    0     2G  0 part  
│ └─md127     9:127  0     2G  0 raid1 
│   └─md127 253:4    0     2G  0 crypt 
└─nvme3n1p2 259:6    0   3.6T  0 part  
nvme0n1     259:5    0   3.6T  0 disk  
├─nvme0n1p1 259:7    0     2G  0 part  
│ └─md127     9:127  0     2G  0 raid1 
│   └─md127 253:4    0     2G  0 crypt 
└─nvme0n1p2 259:9    0   3.6T  0 part  
nvme2n1     259:8    0   3.6T  0 disk  
├─nvme2n1p1 259:10   0     2G  0 part  
│ └─md127     9:127  0     2G  0 raid1 
│   └─md127 253:4    0     2G  0 crypt 
└─nvme2n1p2 259:11   0   3.6T  0 part  

ElectricEel-24.10.0.2

# lsblk
NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINTS
sda           8:0    0  18.2T  0 disk 
├─sda1        8:1    0     2G  0 part 
└─sda2        8:2    0  18.2T  0 part 
sdb           8:16   0 931.5G  0 disk 
├─sdb1        8:17   0     2G  0 part 
└─sdb2        8:18   0 929.5G  0 part 
sdc           8:32   0   1.8T  0 disk 
├─sdc1        8:33   0     2G  0 part 
└─sdc2        8:34   0   1.8T  0 part 
sdd           8:48   0  14.9G  0 disk 
├─sdd1        8:49   0   260M  0 part 
└─sdd2        8:50   0  14.7G  0 part 
sde           8:64   0  14.9G  0 disk 
├─sde1        8:65   0   260M  0 part 
└─sde2        8:66   0  14.7G  0 part 
sdf           8:80   0 931.5G  0 disk 
├─sdf1        8:81   0     2G  0 part 
└─sdf2        8:82   0 929.5G  0 part 
sdg           8:96   0  18.2T  0 disk 
├─sdg1        8:97   0     2G  0 part 
└─sdg2        8:98   0  18.2T  0 part 
sdh           8:112  1   7.6G  0 disk 
├─sdh1        8:113  1   800K  0 part 
└─sdh2        8:114  1    15K  0 part 
zd0         230:0    0    16G  0 disk 
nvme2n1     259:0    0   3.6T  0 disk 
├─nvme2n1p1 259:1    0     2G  0 part 
└─nvme2n1p2 259:2    0   3.6T  0 part 
nvme0n1     259:3    0   3.6T  0 disk 
├─nvme0n1p1 259:4    0     2G  0 part 
└─nvme0n1p2 259:5    0   3.6T  0 part 
nvme3n1     259:6    0   3.6T  0 disk 
├─nvme3n1p1 259:7    0     2G  0 part 
└─nvme3n1p2 259:8    0   3.6T  0 part 
nvme1n1     259:9    0   3.6T  0 disk 
├─nvme1n1p1 259:10   0     2G  0 part 
└─nvme1n1p2 259:11   0   3.6T  0 part
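
One quick way to check whether anything behind the two LSI controllers is being enumerated at all is to compare the by-path symlinks on both boot environments, filtering on the PCI addresses from the lspci output above:

# ls -l /dev/disk/by-path/ | grep -E '06:00.0|07:00.0'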

I am not an expert on the LSI SAS controllers, but I would check the firmware requirements for both Dragonfish and ElectricEel. Who knows, maybe you need to perform a firmware update.

Also (again, I am no expert), some people have said there are LSI SAS card knockoffs. These are cards built around barely functional LSI SAS controller chips that failed the manufacturer's testing; someone got hold of the chips, made PCIe cards with them, and then sold them as if they were fully functional. There are hints about them here in the forums (for both LSI SAS cards and Intel NICs).

Perhaps someone else will have better ideas.

I’ll have to dig a bit into that once I get some time. I’ll report my findings when I do.

Thanks again for the good tips and information.

I finally had some time to play with this. Long story short, support for the SAS9211 seems to have been dropped from the mpt3sas module packaged with ElectricEel 24.10, which uses the Linux 6.6.44 kernel; Dragonfish used the Linux 6.6.32 kernel. I found others having issues with these 9211s in various forums. Some have been able to get them working by recompiling the older version of the module, but that is not something I want to do. It would mean that, unless the TrueNAS team decides to ship the older module or modify the new one, I would be redoing it manually every time a new update is available. I am not about that life…!
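
For anyone comparing on their own system, the kernel in use for each boot environment can be confirmed with a simple:

# uname -r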

So, I decided to get a new SAS controller. I opted for an LSI SAS 9305-16i. I bought this one from eBay: SAS9305-16I LSI SAS 9305-16I 16 Port PCIe 3.0 x8 12 Gb/s Host Bus Adapter FH | eBay. It was $64.00 USD at the time of purchase.

Once I got it, I flashed the latest IT firmware I was able to locate from Broadcom: 9305_16i_Pkg_P16.12_IT_FW_BIOS_for_MSDOS_Windows.

Looking at the module info under both Dragonfish and ElectricEel, you can see the source version has changed.

Dragonfish

srcversion:     595E9DBF21F4E99E6DDADC4

ElectricEel

srcversion:     E46E6B71C385F485C977A1B
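
For anyone wanting to compare on their own system, the module metadata can be pulled on each boot environment with something like this, which should print both the driver version and srcversion lines:

# modinfo mpt3sas | grep version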

The mpt3sas module kept logging the following errors when trying to load under ElectricEel:

[    1.933596] mpt3sas version 43.100.00.00 loaded
[    1.933693] mpt3sas 0000:05:00.0: can't disable ASPM; OS doesn't have ASPM control
[    1.948338] mpt3sas 0000:05:00.0: BAR 1: can't reserve [mem 0x804c0000-0x804c3fff 64bit]
[    1.977139] mpt2sas_cm0: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:12413/_scsih_probe()!
[    2.040607] mpt3sas 0000:06:00.0: can't disable ASPM; OS doesn't have ASPM control
[    2.055251] mpt3sas 0000:06:00.0: BAR 1: can't reserve [mem 0x80ac0000-0x80ac3fff 64bit]
[    2.084212] mpt2sas_cm1: failure at drivers/scsi/mpt3sas/mpt3sas_scsih.c:12413/_scsih_probe()!
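
Those messages can be pulled from a running system with something like:

# dmesg | grep -i mpt
# journalctl -k -b | grep -i mpt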

I did validate that these 9211 cards had the latest firmware loaded. As of this writing, the latest available package is: 9211_8i_Package_P20_IR_IT_FW_BIOS_for_MSDOS_Windows
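
For what it is worth, the firmware the cards are actually running also shows up in the kernel log under Dragonfish, where the driver still loads; something like this should print the FWVersion line for each controller:

# dmesg | grep -i fwversion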
