Hello,
I’m new to TrueNAS (and to any NAS, or even Linux…).
I built a home server and installed TrueNAS (now updated to 25.10.0),
configured the SMB shares and copied ~1TB of family media. All was good…
Then I installed Jellyfin and PiHole, and everything looked good. I copied a few TV shows for the Jellyfin library, also with no issues. After a few days I copied some more series, and later noticed an error under Storage: ZFS Health - “Pool is not healthy”. I ran a scrub, but the error was still there afterwards.
I checked zpool status and found multiple checksum errors in the newly copied TV shows (multiple random files).
So I deleted them and copied them again from my PC. I reran the scrub, and the pool went back to healthy.
I thought it was just a one-time thing… but no. It happened again, with some other files (some game mods, I think…).
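A minimal sketch of how the per-device checksum counters could be pulled out of the status output, to see whether the errors land on one disk or on both. It assumes the pool is named tank (as the /mnt/tank paths suggest) and the usual NAME / STATE / READ / WRITE / CKSUM column layout; cksum_errors is a hypothetical helper name, not part of ZFS.

```shell
# Hypothetical helper: read `zpool status` text on stdin and print
# "<device> <count>" for every row whose CKSUM column is not 0.
cksum_errors() {
    awk 'NF >= 5 && $2 ~ /^[A-Z]+$/ && $5 ~ /^[0-9]/ && $5 != "0" { print $1, $5 }'
}

# Usage (the pool name "tank" is an assumption):
#   zpool status -v tank | cksum_errors
```

No output means every listed device shows a CKSUM count of 0 at that moment.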
So I tried to investigate the issue.
I ran SMART tests on the HDDs: no errors.
I tried comparing file checksums before and after copying some test files, and sometimes a random file was corrupted after being copied to the NAS.
Then I tried a local read test, to take the network out of the equation,
and ran this script:
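For reference, a rough sketch of what such a before/after checksum comparison can look like. It assumes both the original folder and the copy are reachable from one shell that has sha256sum (for example with the share mounted); make_manifest and compare_trees are hypothetical helper names.

```shell
# Hypothetical helper: print a SHA-256 manifest of every file under $1,
# with paths relative to $1 so two locations can be compared.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}

# Hypothetical helper: diff the manifests of source $1 and copy $2.
# Returns 0 when they match; otherwise prints the differing lines
# and returns non-zero.
compare_trees() {
    a=$(mktemp)
    b=$(mktemp)
    make_manifest "$1" > "$a"
    make_manifest "$2" > "$b"
    diff "$a" "$b"
    rc=$?
    rm -f "$a" "$b"
    return $rc
}

# Usage (both paths are placeholders):
#   compare_trees /path/to/original /mnt/tank/shared/NAS_Test
```

Lines starting with < come from the source and lines starting with > from the copy, so a changed file shows up as a pair with the same path and different hashes.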
LOGDIR=/mnt/tank/shared/NAS_Test
OUT="$LOGDIR/read-stress.log"
ERR="$LOGDIR/read-stress-errors.log"
PIDFILE="$LOGDIR/read-stress.pid"
echo $$ > "$PIDFILE"
echo "=== READ STRESS START $(date) ===" >> "$OUT"
echo "=== READ STRESS START $(date) ===" >> "$ERR"
# Loop forever reading every file one by one
while true; do
  find /mnt/tank/shared/NAS_Test -type f -print0 | while IFS= read -r -d '' f; do
    echo "READ $(date +%Y-%m-%dT%H:%M:%S) $f" >> "$OUT"
    # try reading with dd; dd returns non-zero on I/O error
    if ! dd if="$f" of=/dev/null bs=1M status=none conv=sync 2>>"$OUT"; then
      echo "ERR $(date +%Y-%m-%dT%H:%M:%S) $f" >> "$ERR"
      echo "---- dmesg tail at $(date) ----" >> "$ERR"
      dmesg | tail -n 80 >> "$ERR"
      echo "---- end dmesg ----" >> "$ERR"
    fi
  done
done
The result was that sometimes a random file returned an error when read (Input/output error), but later the same file was read without any error. That would suggest that the error happens in the read process, not that the file itself is corrupted.
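A small sketch of how that "fails once, passes later" behaviour could be checked on a single file: hash it several times and count the distinct results. rehash_file is a hypothetical helper name, and repeated reads may be served from the ZFS cache (ARC) rather than from the disks, so this is only a rough indicator.

```shell
# Hypothetical helper: hash file $1 a total of $2 times.
# Prints one line per pass: the SHA-256 hash, or READ-ERROR if the read failed.
rehash_file() {
    file=$1
    passes=$2
    i=0
    while [ "$i" -lt "$passes" ]; do
        if hash=$(sha256sum "$file" 2>/dev/null); then
            echo "${hash%% *}"
        else
            echo "READ-ERROR"
        fi
        i=$((i + 1))
    done
}

# Usage (the file path is a placeholder):
#   rehash_file /mnt/tank/shared/NAS_Test/somefile 10 | sort | uniq -c
```

A single hash with the full count means every pass read identical data; a mix of hashes or READ-ERROR lines means the reads themselves are not stable.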
Then I tried a local raw read test of each HDD (dd reading the whole device to /dev/null, so nothing is written) with this:
sudo nohup dd if=/dev/sda of=/dev/null bs=1M status=progress > /mnt/tank/shared/NAS_Test/dd-sda.log 2>&1 & echo $! | sudo tee /mnt/tank/shared/NAS_Test/dd-sda.pid
(and sdb for the second one)
I ran them separately with no errors, then ran both together until each had read 2TB… no errors either.
So would this mean that raw reads from each disk have no issue, and the problem only shows up when going through ZFS?
And now I do not know what to do next. How do I find the root cause of this?
My hardware:
CPU: INTEL Core i5-14500
MB: ASUS PRIME B760M-A WIFI D4
RAM: G.Skill | Ripjaws V | 32 GB | DDR4 | 3600 MHz | CL16
OS SSD: Samsung 970 evo 500GB
Storage HDDs: 2x SEAGATE NAS HDD 8TB IronWolf 7200rpm (VDEV created with Mirror layout)
PSU: be quiet! Pure Power 13 M | 850W
Any suggestions (as detailed as possible) would be appreciated.
