ZFS checksum errors

Checksum errors can be caused by the disks themselves, but more often they come from disk controller faults or overheating, power or SATA cable connections, PSU problems or memory problems. Reseating memory sticks, PCIe cards and power/SATA cables can often stop them from recurring.

After reseating the memory, run a memory test for a few hours.

Then run sudo zpool clear poolname on the pool experiencing errors to reset the error counters, and see whether the errors come back.
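To see whether the counters climb again after the clear, something like the following might help. It is only a rough sketch: cksum_errors is a made-up helper name, and it assumes the usual NAME STATE READ WRITE CKSUM column layout in the config section of zpool status output (with -p so the counters are printed as exact numbers).

```shell
# Hypothetical helper: reads `zpool status -p` output on stdin and prints
# each device whose CKSUM counter is non-zero, as "name count".
# Assumes the usual column layout: NAME STATE READ WRITE CKSUM.
cksum_errors() {
  awk '
    # The header row marks the start of the device table.
    $1 == "NAME" && $5 == "CKSUM" { in_config = 1; next }
    # A blank line ends the device table.
    in_config && NF == 0 { in_config = 0 }
    # Inside the table, report rows whose fifth column is not 0.
    in_config && NF >= 5 && $5 != "0" { print $1, $5 }
  '
}
```

A possible way to use it: after the clear, start a scrub with sudo zpool scrub poolname, wait for it to finish, then run sudo zpool status -p poolname | cksum_errors. No output would mean no device currently shows checksum errors.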

Actually my standard list has evolved and is now:

  • lsblk -bo NAME,LABEL,MAJ:MIN,TRAN,ROTA,ZONED,VENDOR,MODEL,SERIAL,PARTUUID,START,SIZE,PARTTYPENAME
  • sudo ZPOOL_SCRIPTS_AS_ROOT=1 zpool status -vLtsc lsblk,serial,smartx,smart
  • sudo zpool import
  • lspci
  • sudo sas2flash -list
  • sudo sas3flash -list
  • sudo storcli show all
  • for disk in /dev/sd*; do sudo zdb -l "$disk"; done
  • for disk in /dev/sd?; do sudo hdparm -W "$disk"; done
  • for disk in /dev/sd?; do sudo smartctl -x "$disk"; done

though I normally remove any of these I don’t think will be helpful.
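Some of the tools in that list (sas2flash, sas3flash, storcli) will only exist on systems with the matching controller. One way to avoid editing the list by hand each time is a small wrapper that skips whatever is not installed. This is just a sketch: run_if_present and the SUDO variable are names made up for illustration, and it assumes the tool is on the current user's PATH.

```shell
# Hypothetical wrapper: run a diagnostic tool only if it is installed,
# otherwise print a note that it was skipped.
# SUDO defaults to "sudo"; set SUDO="" when already running as root.
run_if_present() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "### $*"
    ${SUDO-sudo} "$@"
  else
    echo "### $1: not installed, skipped"
  fi
}

# Example calls (left commented out so that nothing runs on load):
# run_if_present sas2flash -list
# run_if_present sas3flash -list
# run_if_present storcli show all
```

The "###" header lines are only there to make it easier to tell the outputs apart when pasting the results into a post.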
