PMEM not showing serial

Heya everyone. I’ve got a Dell R740 that I’m currently experimenting with, running TrueNAS Scale installed on bare metal. It has a Cascade Lake CPU, and I got myself a few 128GB Optane PMEMs to see how they would perform as storage.

I’ve never used PMEM before so I’m not certain what may be causing this, but two of my modules do not show a serial number. Because of this, when trying to create a pool using them I get a warning that two disks have non-unique serial numbers.
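One way to see which devices are behind a "non-unique serial" warning is to list block devices with their serials and group them. The sketch below is a rough illustration only: it assumes that `lsblk` reports the same serial that TrueNAS uses (not confirmed in this thread), and the device names and serials in `SAMPLE` are made up for the demo.

```python
import json
from collections import defaultdict

# Hypothetical sample in the shape produced by `lsblk -d -J -o NAME,SERIAL,SIZE`.
# The names, serials, and sizes are invented for illustration.
SAMPLE = """
{"blockdevices": [
  {"name": "pmem0", "serial": "EXAMPLE-0001", "size": "126G"},
  {"name": "pmem1", "serial": null, "size": "126G"},
  {"name": "pmem2", "serial": null, "size": "125G"}
]}
"""


def find_serial_problems(lsblk_json):
    """Return (devices with no serial, {serial: [devices]} for shared serials)."""
    devices = json.loads(lsblk_json)["blockdevices"]
    missing = []
    by_serial = defaultdict(list)
    for dev in devices:
        # lsblk prints null for an empty field in JSON mode.
        serial = (dev.get("serial") or "").strip()
        if serial:
            by_serial[serial].append(dev["name"])
        else:
            missing.append(dev["name"])
    duplicates = {s: names for s, names in by_serial.items() if len(names) > 1}
    return missing, duplicates


missing, duplicates = find_serial_problems(SAMPLE)
print("no serial:", missing)
print("shared serial:", duplicates)
```

To use it on a real system, replace `SAMPLE` with the text printed by `lsblk -d -J -o NAME,SERIAL,SIZE`. Two devices with an empty serial would plausibly both count as "non-unique", which matches the warning described here.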

Any ideas why they wouldn’t show serials? As I said, I’m not very familiar with PMEM at all, so maybe I’m missing something about how they were used previously, since it was all used hardware.

I believe the two that don’t show serials also appear to report as different sized disks. When looking in the bios I can see that there are two modules with 98% write endurance while the rest are at 100%. I am assuming this is part of the complication.


What happens if you temporarily remove all PMEMs except for the two without serial numbers?
Next, what happens if you then move those two to two of the now empty slots previously used by the working modules?

Edit:
Guessing here, but perhaps running through this sanitisation procedure could be helpful:
https://www.intel.com/content/www/us/en/developer/articles/training/how-to-securely-erase-data-on-intel-optane-persistent-memory.html

Ok yeah I’m picking up what you’re putting down here. Isolate if it’s the modules themselves or something outside of the modules. I can try a sanitize too, may as well see what happens!


I have seen the exact same setup with the exact same behavior. My server will only see 4 serial numbers. I have tested this with both 128GB and 256GB modules.

It also doesn’t matter if you’re running single or dual CPUs; it still only shows 4 serials.

Nope, tried that.

I’ve tried updating firmware, which made no difference. It seems odd to me that in both our cases it apparently cuts off after 4 serials. Could this be a TrueNAS issue? Known oddities with PMEM?

That brings up another vector to check.
Check with a live boot OS on a USB-stick and see if it has the same limits.

Do these show anything?
ipmctl show -topology
ipmctl show -memoryresources
ipmctl show -dimm

I’ll add that this is from a Dell SuperMicro motherboard manual, which coincidentally only shows up to 4 modules being used:

I went ahead and loaded up the same DIMMs in another system, updated the firmware on them, and gave this a go.

ipmctl show -topology
 DimmID | MemoryType                  | Capacity    | PhysicalID| DeviceLocator
================================================================================
 0x0001 | Logical Non-Volatile Device | 128.000 GiB | 0x1106    | A7
 0x0011 | Logical Non-Volatile Device | 128.000 GiB | 0x1107    | A8
 0x0021 | Logical Non-Volatile Device | 128.000 GiB | 0x1108    | A9
 0x0101 | Logical Non-Volatile Device | 128.000 GiB | 0x1109    | A10
 0x0111 | Logical Non-Volatile Device | 128.000 GiB | 0x110a    | A11
 0x0121 | Logical Non-Volatile Device | 128.000 GiB | 0x110b    | A12
 0x1001 | Logical Non-Volatile Device | 128.000 GiB | 0x1112    | B7
 0x1011 | Logical Non-Volatile Device | 128.000 GiB | 0x1113    | B8
 0x1021 | Logical Non-Volatile Device | 128.000 GiB | 0x1114    | B9
 0x1101 | Logical Non-Volatile Device | 128.000 GiB | 0x1115    | B10
 0x1111 | Logical Non-Volatile Device | 128.000 GiB | 0x1116    | B11
 0x1121 | Logical Non-Volatile Device | 128.000 GiB | 0x1117    | B12
 N/A    | DDR4                        | 32.000 GiB  | 0x1100    | A1
 N/A    | DDR4                        | 32.000 GiB  | 0x1101    | A2
 N/A    | DDR4                        | 32.000 GiB  | 0x1102    | A3
 N/A    | DDR4                        | 32.000 GiB  | 0x1103    | A4
 N/A    | DDR4                        | 32.000 GiB  | 0x1104    | A5
 N/A    | DDR4                        | 32.000 GiB  | 0x1105    | A6
 N/A    | DDR4                        | 32.000 GiB  | 0x110c    | B1
 N/A    | DDR4                        | 32.000 GiB  | 0x110d    | B2
 N/A    | DDR4                        | 32.000 GiB  | 0x110e    | B3
 N/A    | DDR4                        | 32.000 GiB  | 0x110f    | B4
 N/A    | DDR4                        | 32.000 GiB  | 0x1110    | B5
 N/A    | DDR4                        | 32.000 GiB  | 0x1111    | B6
ipmctl show -memoryresources
 MemoryType   | DDR                 | PMemModule   | Total
=========================================================================
 Volatile     | 383.000 GiB         | 0.000 GiB    | 383.000 GiB
 AppDirect    | -                   | 1512.000 GiB | 1512.000 GiB
 Cache        | 0.000 GiB           | -            | 0.000 GiB
 Inaccessible | 17179868801.000 GiB | 5.066 GiB    | 17179868806.066 GiB
 Physical     | 0.000 GiB           | 1517.066 GiB | 1517.066 GiB
ipmctl show -dimm
 DimmID | Capacity    | LockState        | HealthState | FWVersion
======================================================================
 0x0001 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x0011 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x0021 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x0101 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x0111 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x0121 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x1001 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x1011 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x1021 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x1101 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x1111 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
 0x1121 | 126.422 GiB | Disabled, Frozen | Healthy     | 01.02.00.5446
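As a sanity check on the numbers above, the capacity rows can be cross-checked with a little arithmetic. The sketch below uses only figures from the output. The last step is an observation, not a confirmed explanation: the huge "Inaccessible" DDR value is exactly what a 64-bit unsigned counter would show if 383 GiB were subtracted from zero, which suggests a reporting quirk rather than real memory.

```python
GIB = 2**30

dimm_count = 12
per_dimm_gib = 126.422                       # Capacity column, `ipmctl show -dimm`
app_direct_gib = 1512.000                    # AppDirect row, PMemModule column
physical_gib = 1517.066                      # Physical row, PMemModule column
ddr_volatile_gib = 383                       # Volatile row, DDR column
reported_inaccessible_ddr_gib = 17179868801  # Inaccessible row, DDR column

# Twelve modules at the (rounded) per-module capacity should land
# within rounding error of the Physical total.
total_from_dimms = dimm_count * per_dimm_gib
print(f"sum of modules:      {total_from_dimms:.3f} GiB")

# Physical minus AppDirect should equal the PMem "Inaccessible" figure (5.066 GiB).
print(f"physical - appdirect: {physical_gib - app_direct_gib:.3f} GiB")

# Subtracting 383 GiB from zero in 64-bit unsigned arithmetic wraps around.
wrapped_bytes = (0 - ddr_volatile_gib * GIB) % 2**64
print(f"wrapped value:       {wrapped_bytes // GIB} GiB")
print("matches reported:", wrapped_bytes // GIB == reported_inaccessible_ddr_gib)
```

The PMem figures are self-consistent (12 modules, 126 GiB of AppDirect each, with the remainder inaccessible), so this output on its own gives no sign that any module is missing or undersized in this second system.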

Running ipmctl show -a -dimm I am able to see a serial for each PMEM DIMM. Doesn’t really match up with the serials provided in TrueNAS though? Unless there’s some conversion of this format to something else as displayed in TrueNAS. Again I am not well versed in how any of this PMEM stuff works.

SerialNumber=0x00001711
SerialNumber=0x00009a56
SerialNumber=0x00004745
SerialNumber=0x0000ad30
SerialNumber=0x000066e2
SerialNumber=0x0000335e
SerialNumber=0x0000a176
SerialNumber=0x0000107a
SerialNumber=0x000067ba
SerialNumber=0x000060ef
SerialNumber=0x000096b4
SerialNumber=0x00000c3d
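To make the comparison with the TrueNAS UI easier, the serials above can be printed in a few common forms. This is only a sketch: how TrueNAS derives its displayed serial is not established in this thread, so decimal and byte-swapped hex are guesses at plausible conversions, not a known mapping.

```python
# Serials as reported by `ipmctl show -a -dimm` in the post above.
serials = [
    "0x00001711", "0x00009a56", "0x00004745", "0x0000ad30",
    "0x000066e2", "0x0000335e", "0x0000a176", "0x0000107a",
    "0x000067ba", "0x000060ef", "0x000096b4", "0x00000c3d",
]


def representations(value):
    """Return a 32-bit serial in a few forms a UI might plausibly display."""
    # Reverse the byte order (big-endian bytes reinterpreted as little-endian).
    swapped = int.from_bytes(value.to_bytes(4, "big"), "little")
    return {
        "hex": f"{value:08x}",
        "decimal": str(value),
        "byte_swapped_hex": f"{swapped:08x}",
    }


values = [int(s, 16) for s in serials]

# All twelve serials are distinct at the ipmctl level.
print("unique serials:", len(set(values)), "of", len(values))

for s, v in zip(serials, values):
    r = representations(v)
    print(s, r["decimal"], r["byte_swapped_hex"])
```

If none of these forms shows up in TrueNAS, the serial it displays probably comes from a different source than the one `ipmctl` reads.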

This is a manual for a Supermicro board, not Dell, as far as I can tell. X12PSx would refer to motherboards like the “X12SPL-F” from SM, which only has eight DIMM slots. Thus it would make sense that it only discusses up to four, as that would be the maximum you could use in that board. That would also be for a different PMEM series, as it’s an Ice Lake board, if that even matters.

Here is the Dell documentation on 100 series PMEM:

Okay, I wasn’t clear. The idea was for you to run those commands on your TrueNAS box as well.

Fair enough.