One or more devices has experienced an error resulting in data corruption. Applications may be affected

Hello. First off I would like to apologize for my lack of knowledge. While there are some things I know when it comes to PCs, I don’t know everything, so some of my terminology may not be correct. I’m simply someone who wants to have a simple NAS on a budget. I know very little about Linux, and I’m willing to learn more so I can help maintain this system.

I have set up a NAS with a ThinkCentre M910q. There is a 2.5" SSD where the OS is installed, as well as a 1TB M.2 drive. That is where my apps, files, and datasets are. The installed apps are Nextcloud, Cloudflare Tunnel, Tailscale, and Jellyfin. It’s set up for simple file sharing and media streaming, not necessarily file backups, although I hope to expand to something better later so that I can use this for data backup.

I’m frequently experiencing an issue. The first thing I want to mention is that the M.2 is not being held down properly. And yes, I am already taking measures to try and fix this. The mini PC that I have is not meant for a standoff and screw. I have ordered a plastic push-pin which will be arriving soon and will hopefully stop this issue from occurring. And yes, I do realize that this could very well be causing all these errors and what I’m experiencing. I understand that all of this may be redundant given this. I am doing what I can for now, and until I have what I need to properly secure my M.2, here is the issue.

I have alerts set up to go to my email. Pretty much every day, I’ll get the error “Pool “my pool name” state is ONLINE: One or more devices has experienced an error resulting in data corruption. Applications may be affected.” Every time since I first got the message, I have logged into the web UI to find CPU usage averaging around 95%. I would reboot it to see if all of my files were corrupted. Rebooting or shutting down via the web UI wouldn’t do anything. I would forcefully shut it off, reboot, and find that all my files are safe. A notification pops up saying that all of the previous errors have been cleared.

Today that error has occurred multiple times, seemingly with no cause, not even any heavy workloads. On top of that there is a new error: “Pool “my pool name” state is SUSPENDED: One or more devices are faulted in response to IO failures. The following devices are not healthy: “My M.2 Drive”.”

I ran zpool status -v one of the times the error occurred, with this as the output.

Permanent errors have been detected in the following files:
/var/db/system/update/update.sqsh
/mnt/.ix-apps/app_mounts/jellyfin/config/data/jellyfin.db-shm

Another instance of having an error and running the same command resulted in this:
(Some of the characters are not exact and I apologize for that)

Permanent errors in
/mnt/.ix-apps/docker/containers/85e8175a59bb209e7c361214b6f5ded968f387a3deb5c0c6bb46b5b42c7a729e/85e8175a59bb209e7c361214b6f5ded968f387a3deb5c0c6bb46b5b42c7a729e-json.log
/var/db/system/netdata/dbengine/datafile-1-000000094.ndf
/var/db/system/netdata/journalfile-1-000000094.njf
/mnt/.ix-apps/app_mounts/jellyfin/config/data/jellyfin.db-shm

But it’s worth noting that I’ve had the first error happen to me many times without any apps even installed, simply using the SMB service. I had never run zpool status before today, and it’s my first time noticing the files affected. So I’m confused to see files referenced from Jellyfin. It makes me concerned about what the actual problem may be.

It has been a cycle ever since. I have seen a few people online mentioning the possibility of faulty RAM, so currently I’m running MemTest86. I have previously put my M.2 in a portable enclosure on my main PC and run CrystalDiskInfo. The drive was reportedly healthy. I’m not entirely sure whether using only that software was the right move or conclusive enough to determine that.

You should post full details on your current hardware setup. Just looking at the data shown for that model, I would say it is not fit for use with TrueNAS. https://www.lenovo.com/us/en/p/desktops/thinkcentre/m-series-tiny/thinkcentre-m910q/11tc1mt910q

It is wise to do tests on the CPU and RAM and test the base system stability. What are the specific models for your data storage and OS drives? Some are just not fit for use with ZFS / TrueNAS.

The better the info and details you post on your system, the better the advice, usually. Have you run Long SMART tests on all your drives?

Sorry

Intel Core i5-6500T 2.5GHz

8GB DDR4 RAM. Memtest shows 2132MT/s

I don’t know the 2.5" SSD specs right now

But the M.2 is a 1TB Teamgroup TM8FP6001T

I’ll get back to you on the TrueNAS version. I’m currently running memtest. But how should I go about testing the CPU? And I don’t know if I’ve run a long SMART test. I do have a feeling that maybe I have, and it failed a long time ago. I will do that as soon as I can.

8GB is the minimum RAM per the documents. It may be okay for just a NAS and data share but you should have more memory to run applications and VMs, unless they are really light on resources.

We usually recommend booting to a Live Linux USB or ISO and running CPU and memory stress tests, like Prime95. We try to rule out most of the hardware first.

Running a current Long SMART test may help show whether the drives are going bad. You can also try looking at the temperature data for the drives, as you could have an overheating issue. Not sure on that, just a guess.
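For anyone following along, here is a rough sketch of how the SMART health counters and drive temperature mentioned above could be checked from a script. It assumes the JSON was produced by something like `sudo smartctl -j -a /dev/nvme0` (the device name is an assumption) and that the NVMe health fields sit under `nvme_smart_health_information_log`, as recent smartmontools versions report them. The 70 C threshold and the sample values are made up for illustration and are not from the drive in this thread.

```python
import json


def summarize_nvme_health(smartctl_json, temp_limit_c=70):
    """Return warning strings for the NVMe health fields in smartctl JSON."""
    data = json.loads(smartctl_json)
    log = data.get("nvme_smart_health_information_log", {})
    warnings = []
    if log.get("critical_warning", 0) != 0:
        warnings.append(f"critical_warning is {log['critical_warning']}")
    if log.get("media_errors", 0) > 0:
        warnings.append(f"{log['media_errors']} media errors logged")
    if log.get("temperature", 0) >= temp_limit_c:
        warnings.append(
            f"temperature {log['temperature']} C is at or above {temp_limit_c} C"
        )
    return warnings


# Made-up sample values, not taken from the drive in this thread.
sample = json.dumps({
    "nvme_smart_health_information_log": {
        "critical_warning": 0,
        "temperature": 74,
        "media_errors": 3,
    }
})

for warning in summarize_nvme_health(sample):
    print(warning)
```

An empty list only means these particular counters look clean. It does not by itself rule out a connection or link problem between the drive and the board.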

Sounds good thank you.

I will look at getting another ram module to double my current size. I’ll flash a usb with prime95 and test cpu. And I’ll run a long smart test to see how that goes for the drives. As far as thermals, I think it should be okay.

I ran a long SMART test on both the boot drive and storage drive. It took only a couple of seconds for the boot drive and just a couple of seconds longer on the M.2 drive. Is that expected? Otherwise, they have both passed the test. But I got another error anyway. Here’s the full output:

truenas_admin@truenas[~]$ sudo zpool status -v
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-8A
  scan: scrub repaired 0B in 00:08:49 with 0 errors on Wed Apr 15 19:43:25 2026
config:

        NAME                                    STATE     READ WRITE CKSUM
        Storage                                 ONLINE       0     0     0
          244cd01b-9b41-47aa-8990-32c876f81a7b  ONLINE     258 45.6M     0

errors: Permanent errors have been detected in the following files:

        :<0x0>
        :<0x19a>
        /mnt/.ix-apps/metadata.yaml
        /mnt/.ix-apps/docker/buildkit/history.db
        /mnt/.ix-apps/docker/volumes/metadata.db
        /mnt/.ix-apps/docker/overlay2/42328042cb2ebb71ef484273d96167e5c16a5a99a7a9b91569696cd889c91e27/diff/usr/local/bin/cloudflared
        /mnt/.ix-apps/docker/overlay2/160977d32a26f6825530a06400b6781a64597ea9133817a5f87506a957c77998/diff/usr/bin/curl
        /mnt/.ix-apps/docker/overlay2/d18a14b7d87d5958c9a521e7f21568b1d4435038585d2077c50665c3ea7ac420/diff/usr/bin/curl
        /mnt/.ix-apps/docker/overlay2/65f7aebdd2c23fd3a4a9f8f6103689c9c6824625e1049ede51440a3d27924aa4/diff/usr/lib/locale/locale-archive
        /mnt/.ix-apps/docker/overlay2/5e31d4da83ade88225176aa7e0bcb2c8111eba0f071c8cfe28f056a5ada20392/diff/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
        /mnt/.ix-apps/docker/overlay2/6312617ab40210a46e0c9d763c8f572f4ff554b632617d3dc954046899d55ae5/diff/usr/local/bin/tailscaled
        /mnt/.ix-apps/docker/overlay2/b71a447a8431ed2f860cdbf1db76d79735ab6e7ceeb7d9a73cb5eaf16209c354/diff/usr/lib/x86_64-linux-gnu/libc.so.6
        /mnt/.ix-apps/docker/overlay2/b71a447a8431ed2f860cdbf1db76d79735ab6e7ceeb7d9a73cb5eaf16209c354/diff/usr/lib/x86_64-linux-gnu/ld-linux-x86-64.so.2
        /mnt/.ix-apps/docker/overlay2/6312617ab40210a46e0c9d763c8f572f4ff554b632617d3dc954046899d55ae5/diff/usr/local/bin/containerboot
        /mnt/.ix-apps/docker/overlay2/5a659a35cadde7b0719652a54b6fb48e6751bbaf2924e38487b035f7032a866a/diff/usr/lib/postgresql/18/bin/postgres
        /var/db/system/update/update.sqsh

  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:01:06 with 0 errors on Thu Apr 9 03:46:08 2026
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          sda3      ONLINE       0     0     0

errors: No known data errors
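The key detail in that output is the READ/WRITE/CKSUM columns on the data drive, not the pool state. As a hedged illustration, here is a minimal Python sketch that pulls the non-zero counters out of saved `zpool status` text. The function names are made up for this example, and the suffix handling assumes the abbreviated counts (such as `45.6M`) are powers of 1024, so expanded numbers are approximate.

```python
# Assumption: abbreviated counters use power-of-1024 suffixes.
SUFFIXES = {"K": 1024, "M": 1024**2, "G": 1024**3, "T": 1024**4}


def parse_count(text):
    """Turn a counter such as '258' or '45.6M' into an approximate integer."""
    text = text.strip()
    if text and text[-1].upper() in SUFFIXES:
        return int(float(text[:-1]) * SUFFIXES[text[-1].upper()])
    return int(text)


def devices_with_errors(status_text):
    """Map device name -> (read, write, cksum) for rows with non-zero counters."""
    result = {}
    for line in status_text.splitlines():
        fields = line.split()
        # Config rows have exactly five columns: NAME STATE READ WRITE CKSUM.
        if len(fields) != 5 or fields[0] == "NAME":
            continue
        name, _state, *counts = fields
        try:
            read, write, cksum = (parse_count(c) for c in counts)
        except ValueError:
            continue  # a five-word line that is not a config row
        if read or write or cksum:
            result[name] = (read, write, cksum)
    return result


# Config rows copied from the output in this thread.
status_text = """
Storage ONLINE 0 0 0
244cd01b-9b41-47aa-8990-32c876f81a7b ONLINE 258 45.6M 0
boot-pool ONLINE 0 0 0
sda3 ONLINE 0 0 0
errors: No known data errors
"""

for name, (read, write, cksum) in devices_with_errors(status_text).items():
    print(name, "read:", read, "write:", write, "cksum:", cksum)
```

On the rows above, only the M.2 data drive is reported. The boot SSD in the same machine shows all zeros.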

SSD & NVMe SMART tests are indeed very fast.

Anything of interest in dmesg? When the error occurs, is the drive still detected in TrueNAS? If it is, how much space does it show available? (The reason I ask is that I’ve sometimes had an NVMe go into a very low power state & get stuck there, basically reporting 0B available & making TrueNAS think the drive totally dropped.)

dmesg output.txt (129.9 KB)

It’s pretty much repeating the same thing. I don’t know what it means, but I was able to get dmesg output last night when one happened. TrueNAS does still detect the drive and usually says the pool is still online.

This is a newer one. But the Total ZFS errors also skyrocket whenever the error happens. I log in shortly after to see 34 million errors. After only about a minute of waiting, it’ll say 174 million. But then again, after rebooting, everything’s fine. I don’t know what’s going on.

I would just power it all off and fix your M.2 mounting issue first. Do you have everything backed up? You are risking it all by keeping it online.

How many passes of MemTest86? The usual recommendation is 5 or more. What you describe is usually a hardware issue.

Good call, I will keep it off. I actually had an idea to put in another m.2 and run dmesg in a live linux USB to see if I get the same errors. Otherwise, I will keep it off until I get my pushpin. I also have more ram coming in

The memtest did 4 passes with no errors.

[ 9856.255434] pcieport 0000:00:1b.0: AER: Correctable error message received from 0000:00:1b.0
[ 9856.255446] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9856.255455] pcieport 0000:00:1b.0:   device [8086:a2eb] error status/mask=00000001/00002000
[ 9856.255461] pcieport 0000:00:1b.0:    [ 0] RxErr                  (First)
[ 9858.140441] pcieport 0000:00:1b.0: AER: Correctable error message received from 0000:00:1b.0
[ 9858.140454] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9858.140463] pcieport 0000:00:1b.0:   device [8086:a2eb] error status/mask=00000001/00002000
[ 9858.140469] pcieport 0000:00:1b.0:    [ 0] RxErr                  (First)
[ 9859.404057] pcieport 0000:00:1b.0: AER: Correctable error message received from 0000:00:1b.0
[ 9859.404070] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9859.404078] pcieport 0000:00:1b.0:   device [8086:a2eb] error status/mask=00000001/00002000
[ 9859.404084] pcieport 0000:00:1b.0:    [ 0] RxErr                  (First)
[ 9870.835014] pcieport 0000:00:1b.0: AER: Correctable error message received from 0000:00:1b.0
[ 9870.835026] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9870.835035] pcieport 0000:00:1b.0:   device [8086:a2eb] error status/mask=00000001/00002000
[ 9870.835041] pcieport 0000:00:1b.0:    [ 0] RxErr                  (First)
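The same four-line pattern repeats throughout the log, so it can help to reduce it to a count per port and error type. A minimal Python sketch, assuming the log has been saved as text (for example with `dmesg > dmesg.txt`). The regular expression is written against the line format shown above and is not guaranteed to match other kernel versions.

```python
import re
from collections import Counter

# Matches the "PCIe Bus Error" summary line of each AER report shown above.
AER_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+pcieport (?P<dev>[0-9a-f:.]+): "
    r"PCIe Bus Error: severity=(?P<sev>\w+), type=(?P<type>[\w ]+),"
)


def summarize_aer(dmesg_text):
    """Count AER reports per (port, severity, type) and return the time span."""
    counts = Counter()
    stamps = []
    for m in AER_LINE.finditer(dmesg_text):
        counts[(m["dev"], m["sev"], m["type"])] += 1
        stamps.append(float(m["ts"]))
    span = max(stamps) - min(stamps) if len(stamps) > 1 else 0.0
    return counts, span


# Summary lines copied from the log in this thread.
dmesg_text = """\
[ 9856.255446] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9858.140454] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9859.404070] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
[ 9870.835026] pcieport 0000:00:1b.0: PCIe Bus Error: severity=Correctable, type=Physical Layer, (Receiver ID)
"""

counts, span = summarize_aer(dmesg_text)
for (dev, sev, kind), n in counts.items():
    print(f"{dev}: {n} x {sev} / {kind} over {span:.1f} s")
```

Running the same summary on logs captured with two different drives in the slot gives a like-for-like comparison of how often the port reports receiver errors.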

I’m gonna guess that pcieport 0000:00:1b.0 is your NVMe. Given that you mentioned it isn’t mounted properly & that you’re getting pcieport errors, I’m thinking that @SmallBarky is right & that this is just a mounting fault causing poor contact & occasional dropouts.
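That guess can be checked rather than assumed. In dmesg, `pcieport` is the driver for the PCIe port itself, and on Linux the devices sitting behind a port show up as subdirectories of that port in sysfs. A hedged Python sketch follows: the function name is made up, the downstream address `0000:01:00.0` in the demo is hypothetical, and the demo runs against a fake directory tree so it works on any Linux machine. On the real system the default `sysfs_root` would be used instead.

```python
import re
import tempfile
from pathlib import Path

NVME_CLASS = "0x010802"  # PCI class code reported by NVM Express controllers
PCI_ADDRESS = re.compile(r"^[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]$")


def devices_behind_port(port, sysfs_root="/sys/bus/pci/devices"):
    """List (address, is_nvme) for PCI functions directly below a port."""
    port_dir = Path(sysfs_root) / port
    found = []
    if not port_dir.is_dir():
        return found
    for child in sorted(port_dir.iterdir()):
        class_file = child / "class"
        # Downstream functions are subdirectories named like 0000:01:00.0.
        if PCI_ADDRESS.match(child.name) and class_file.is_file():
            pci_class = class_file.read_text().strip()
            found.append((child.name, pci_class == NVME_CLASS))
    return found


# Demo against a fake sysfs tree; the child address is hypothetical.
with tempfile.TemporaryDirectory() as fake_root:
    child_dir = Path(fake_root) / "0000:00:1b.0" / "0000:01:00.0"
    child_dir.mkdir(parents=True)
    (child_dir / "class").write_text("0x010802\n")
    demo_result = devices_behind_port("0000:00:1b.0", sysfs_root=fake_root)

print(demo_result)
```

If the only function listed behind the port is flagged as NVMe, the receiver errors are on the link between the board and the M.2 drive.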

So I ran a live linux usb.

Once with the 1TB NVMe.

And once with another one I had that’s 500GB.

I ran dmesg | grep pcieport on both separately.

The 1TB NVMe had all the same errors.

The 500GB one output no errors. So it’s the drive that’s faulty. I will probably be getting a drive bay anyway.

Sorry man - sounds like the worst case. At least you had another drive to compare to & save time investigating further.

Yeah. It was also a good experience to learn from. And it’s a good thing I already have backups of that data I was using on that drive. So there’s no real harm done.

No chance that there is a firmware update or something available as a last ditch effort? 1TB NVMEs are pretty expensive now…

From what I looked at online, there’s no firmware update for it. This actually gives me an excuse to get a drive bay enclosure and HDDs, which is something I’ve been wanting to do.

How are you planning on attaching the drive enclosure? If it is USB, that is not recommended.
It may be a good time to look into other server options. See the Hardware Guide section of the online Documents for TrueNAS.

AVOID USB

That was the plan. I’ll avoid USB enclosures. I would really like to stick to this machine for budget reasons. Do you have any options given my limitations? This is mainly a media server, with file sharing and slight backup usage.

You might get away with one disk attached by USB. We recommend NVMe or SSD in a USB adapter for boot-pool usage. I would try to stay away from enclosures that hold multiple drives. You really don’t have many choices for expansion.
Stay away from SMR type HDs and only choose CMR type models. Usually NAS HDs are okay but check the model number specs before you buy.

SMR vs CMR ServeTheHome

TrueNAS Docs - WD Red SMR Drive Compatibility with ZFS