Best practices on a non-ECC system

Whoa, I was not expecting that mix of technical and philosophical answers! :smiley:

Finally, I found this on eBay (and bought it).

That guy will fit perfectly in a Jonsbo N2, running 4x HDD and one NVMe for boot.

Thanks all for your comments and thoughts :wink:

One SATA port short for the N2, but a nice find at this price. Compared with the MJ11-EC1, this one has an x16 slot to add a faster NIC if needed (unfortunately it is the M11SDV-8CT-LN4F, not the M11SDV-8C-TLN4F), or 4x M.2 on a riser.

I’m curious about your experience with cooling drives in the N2, by the way. A 120x15 mm thin fan might have a hard time pulling enough air behind a backplane.

There was only one board of that type available, from a French seller.
I don’t mind the missing fifth SATA port; I don’t plan to use SSDs.

For cooling, well, let’s see. As the case will stay in the basement, I can run the fans at full speed if needed. I’ll tell you in a couple of weeks.

You mean a PCIe to 4x M.2 adapter/riser?

Yes, something like this.

Check with the slot, CPU, and system board. Those cheaper multi-M.2 NVMe PCIe adapters require breaking up the x16 lanes into pieces (aka bifurcation).

Some slots, CPUs, and system boards may not support bifurcation, so a more expensive card with a PCIe switch chip on it is required to go past one M.2 NVMe drive.
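
If you go this route, it is easy to check after installation whether all the drives actually enumerated. A minimal sketch, assuming Linux with sysfs mounted (the paths are standard kernel ones, but treat this as illustrative rather than exhaustive):

```python
# List every NVMe controller the kernel enumerated, so you can tell at a
# glance whether all the drives on the adapter showed up. If bifurcation
# is off or unsupported, typically only one of them will appear.
from pathlib import Path

for ctrl in sorted(Path("/sys/class/nvme").glob("nvme*")):
    model = (ctrl / "model").read_text().strip()
    pci = (ctrl / "device").resolve().name   # PCI address, e.g. 0000:03:00.0
    print(f"{ctrl.name}: {model} @ {pci}")
```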


Don’t Ryzen (non-Threadripper) systems usually require ECC UDIMMs?

ECC UDIMMs are usually fantastically overpriced, given that they just include a ninth chip per rank.


Have noticed that Synology, Asustor, and other pre-built NAS offerings use non-ECC RAM in their products for consumers and small businesses. Of course, they do not use ZFS but rather proprietary RAID, which I guess is based on Btrfs. I would like to know the reason for the ECC hardware difference. Is it due to ZFS’s scalability, its design for the enterprise, or data security? Or is it TrueNAS’s implementation of ZFS? I’m not an IT pro, so please keep it simple.

At least with Synology, even some lower-end “+” models do use ECC. Most people don’t know what they are buying. They see “NAS” and think their data is safe.


More expensive than the non-ECC counterpart, and less available than RDIMM on the second-hand market, but “fantastically” seems excessive. Going for AM4 with the MC12-LE0 also implies going for DDR4, which is less expensive than DDR5.

ECC benefits any operating system and any storage system.
ZFS was designed for enterprises, where ECC is de rigueur; the ZFS code assumes, among other things, ECC RAM; and anyone who’s paranoid enough[1] to use ZFS should respect its hardware requirements.


  1. cares enough about his data… ↩︎


…and Windows maps memory areas that are shown to be bad as “do not use”, and PCs with Windows are cheaper. And so we end up with shoddy memory.
As far as I remember, anyway. Can’t confirm :person_shrugging:

In Italy, 16 GB sticks of unbuffered DDR4 with ECC sell for an average of 200€ each. Prices can drop, but it is hard to find anything for less than 100~120€, and of course they can go higher.
Used? You have to be really lucky to find anything, and prices are still bad.
A German eBay seller saved my finances there… otherwise, at those prices, I would probably have changed systems entirely.


Ouch! That makes for a strong incentive to go for RDIMMs…

Be warned: that guy is designed for server airflow.

I.e., high airflow across the board, front to back.

Doing some basic math, ignoring PCB and passive costs, which are negligible: including a ninth chip increases the parts cost by 12.5%. Any markup beyond that is a bit much, IMO.
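
To make the arithmetic explicit (the prices below are made up; only the 9/8 ratio comes from the chip count):

```python
# An ECC UDIMM adds one chip per rank on top of the eight data chips,
# so the DRAM parts cost scales by 9/8 = 1.125, i.e. +12.5%.
non_ecc_price = 40.0        # hypothetical street price of a non-ECC stick (€)
parts_cost_factor = 9 / 8   # 8 data chips + 1 ECC chip per rank

print(f"parts-cost increase: {parts_cost_factor - 1:.1%}")               # 12.5%
print(f"'fair' ECC price:    {non_ecc_price * parts_cost_factor:.2f} €")  # 45.00 €
```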

See, it’s basically 9-way RAIDZ1 instead of an 8-way stripe: the ninth chip per rank holds the check bits, much like RAIDZ1’s parity.

And that should be the end of it :wink:


As for X10SDV boards, a 40 mm fan strapped to the heatsink will take care of that.


One thing that I have “guessed” is that some of the corrupted ZFS pools reported here on systems with non-ECC memory “may” have been caused by memory errors. One reason I say this is that these people used reasonable hardware (no hardware RAID, no virtualizing TrueNAS, etc.) and yet got pool corruption.

With the TONS of data now being stored on ZFS, it is interesting that big shops don’t seem to get corrupted pools, or at least don’t mention it publicly. And they haven’t stopped using ZFS, which is what would happen if they lost data big time.

In other cases, where a user had hardware RAID involved and got a corrupted pool, it is clearer that the hardware RAID was “likely” the cause. Doing a postmortem on such a pool is beyond what a discussion forum can do. The general consensus is to rebuild the pool without hardware RAID and restore it from backups, perhaps attempting some recovery from the corrupted pool before the rebuild.

ZFS was specifically designed and maintained to remain consistent on disk at all times. A write either succeeds all the way, or it is dropped during a power loss or OS crash. That is basically all that can happen.
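
As a toy illustration of that all-or-nothing behaviour (this is not ZFS code, just the same copy-on-write idea, with an atomic rename standing in for the uberblock update):

```python
# Toy model of a copy-on-write commit: the new state is written to a
# fresh file, and a single atomic rename makes it visible. A crash at
# any earlier point leaves the previous state untouched, so the write
# is simply "dropped", never half-applied.
import json, os, tempfile

def commit(path: str, state: dict) -> None:
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(os.path.abspath(path)))
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())   # make the new copy durable before the swap
    os.replace(tmp, path)      # atomic: readers see old state or new, never half

commit("pool_state.json", {"txg": 42, "data": "consistent"})
```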

To be clear, some hardware RAID, underlying software drivers, virtualizing TrueNAS, and bad firmware on disks can cause problems. ZFS makes a valiant attempt to perform its writes in the proper sequence to maintain on-disk consistency. But hardware faults, like memory errors or the problems listed above, can impact a ZFS pool’s integrity.

So, it is my humble opinion that we HAVE seen ZFS pool corruption due to memory errors. But it is an opinion not backed up by facts, just the preponderance of corrupted ZFS pools without other indicators like hardware RAID.

One last note: memory errors DO corrupt other file systems too. So if you care about your data, always use reliable hardware WITH ECC memory, regardless of OS or file system.


And the corruption percolates silently into the backups too. Which is the worst part.


Yes.

However, once a ZFS pool is corrupt, the user has little choice.

On the other hand, if the silent corruption is in the file data, meaning a bit was flipped in the file blocks before checksumming and writing, but not in metadata, then you have an undetectably corrupt file, NOT pool corruption.

Basically, pool corruption can only occur if bad data is written to metadata or critical metadata. If bad data is written to a file, it does not affect importing or scrubbing the pool.
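
A toy demonstration of that distinction, with SHA-256 standing in for ZFS’s block checksum (this models only the detection logic, not the on-disk format):

```python
import hashlib

def checksum(block: bytes) -> bytes:
    return hashlib.sha256(block).digest()

def flip_first_bit(block: bytes) -> bytes:
    return bytes([block[0] ^ 0x01]) + block[1:]

data = b"file contents"

# Case 1: a bit flips in RAM *before* checksumming. The checksum matches
# the already-corrupted data, so a scrub reports the block as fine.
bad = flip_first_bit(data)
block, csum = bad, checksum(bad)
print("scrub ok?", checksum(block) == csum)    # True -> silent corruption

# Case 2: a bit flips on disk *after* the checksum was written. The
# mismatch is caught on read/scrub (and repaired if redundancy exists).
block, csum = data, checksum(data)
block = flip_first_bit(block)
print("scrub ok?", checksum(block) == csum)    # False -> detected
```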
