Badblocks recommended for running 6 x 24TB HDDs? (First TrueNAS)

I bought 6 x 24TB WD Ultrastar DC HC580 WUH722424ALE604 0F62798 24TB 7.2K RPM SATA 6Gb/s 512e refurb drives from ServerPartDeals

I ran short and long SMART tests, everything looks OK.

Is it also highly recommended to run badblocks? Or should I just setup pool and do Scrubs?

----------------------------------------------------

Specs:

  • Case: Cooler Master HAF 922

  • CPU: AMD Ryzen PRO 4750G

  • Motherboard: ASRock B550 Pro4

  • RAM: 64GB UDIMM ECC (2 x 32GB Kingston KSM26ED8/32HC 2666 CL19 ECC 288 PC4)

  • NAS HDD Hard Drives: 6 x WD Ultrastar DC HC580 WUH722424ALE604 0F62798 24TB 7.2K RPM SATA 6Gb/s 512e 3.5in Recertified Hard Drive

  • Mirrored Boot OS SATA SSDs: 2 x Intel SSD DC S3700 200GB (used enterprise gear)

  • Fans

    • Front: Noctua NF-A20 PWM (200mm x 30mm Fan)

    • Rear: NF-A12x25 G2 PWM (120mm x 25mm)

    • Bottom: NF-A14x25 G2 PWM

    • PCI-e Cooling: Noctua NF-A9 PWM 92mm Fan

  • HBA Card: LSI 9305-16i

  • PSU: Corsair RM850x

Definitely run a very thorough disk test before putting disks into production, especially used disks. Scrubs only look at used space, and “new” disks have no used space.

Badblocks is a useful (if tedious) way of doing this. You can do them all at the same time, but this will take the better part of a week.

Remember, do this under tmux and change the block size to 4096.
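A minimal sketch of that advice (the `/dev/sdX` placeholder is hypothetical — substitute a real device; `-w` is DESTRUCTIVE and wipes the drive):

```shell
# -b 4096 matches the drives' 4 KiB physical sectors; -w is DESTRUCTIVE.
BB_CMD='badblocks -b 4096 -wsv /dev/sdX 2>&1 | tee /root/sdX.log'
# Run it inside tmux so an SSH disconnect doesn't kill the multi-day test:
#   tmux new-session -d -s burnin "$BB_CMD"
#   tmux attach -t burnin      # Ctrl-B then D to detach again
echo "$BB_CMD"                 # review the command before running it
```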

3 Likes

Possibly even 8k blocks: -b 8192
A side effect is that you’ll be testing your cooling through one week of thermal stress…

Speaking of which:

The Mirrored Boot Overkill Monster has struck again!
May I suggest that you switch to a (single) small NVMe drive for boot, move the six HDDs to the six motherboard ports (perfect match), and do without the HBA? One less part to cool.

4 Likes

Yes. There’s a script to handle it, including choosing an appropriate block size and putting it all into a nice tmux session for you:

6 Likes

Thank you @NugentS @etorix @dan

I forgot to mention - my drives are outside of RMA (I bought them back in Nov when prices were starting to rise, but I had issues building my first NAS and didn’t get it running until this week).

If that’s the case, should I still run badblocks?

I bought the drives from ServerPartDeals.

Yes, you want to know the condition of the drives even if you will be unable to get them replaced under warranty. A scrub will only look at parts in use; unused sectors will not see any testing.

But shouldn’t you still have some form of warranty? I thought ServerPartDeals offered warranties, and the drives were, by your description, bought less than 6 months ago.

2 Likes

The primary purpose of running it is to detect drive failure before committing data to the drive; whether you can return it really is irrelevant (though obviously it’s nicer if you can).

1 Like

Thanks - you’re right about the warranty I believe. I will test with badblocks.

First, I’m going to increase the baseline fan speed to constantly run higher, as the drives went from 35C idle to 43C - 46C (depending on the drive) during the long SMART test.

I don’t believe I can tie the HDD temps to any particular fan(s), but I’m all ears if you guys have any ideas regarding the fans, besides just increasing the baseline speed.

PS - Would a simple full write pass, followed by another long SMART test, be sufficient in my case? Or is badblocks much more recommended?

Your drives may get hot. Try not to let them exceed 50C. I don’t know off the top of my head what the max temp for your drives is, but 60C is a typical value. At 57C I’d likely shut it down until I could get better cooling. Now, this is just my opinion, so take it as a data point.
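You can also watch drive temps from the shell while the test runs. A small sketch, assuming smartmontools’ `smartctl` is available and the drives report the usual `Temperature_Celsius` SMART attribute; `parse_temp` is a hypothetical helper, not a standard tool:

```shell
# Extract the raw temperature (field 10) from `smartctl -A` output.
parse_temp() {
  awk '$2 == "Temperature_Celsius" { print $10 }'
}

# Usage on a live system (one line per drive):
#   for d in /dev/sd?; do echo "$d: $(smartctl -A "$d" | parse_temp)C"; done
```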

1 Like

Now debating on which exact test to run.

Supposedly this one is a good middle ground; it would take 2-3 days for each batch of 3 drives tested.

Would running this badblocks test, which is ~2 passes, be a good solution? What do you guys think?

  • badblocks -nsv (badblocks non-destructive)

  • 3 drives at a time (out of 6 total, to prevent overheating system)

Exact command I’d run (for 3 drives):

badblocks -nsv -b 4096 -c 65536 /dev/sda &
badblocks -nsv -b 4096 -c 65536 /dev/sdb &
badblocks -nsv -b 4096 -c 65536 /dev/sdc &

Naw, run the full 4 destructive passes & give it the full week, man. If you already have data on the disks, move it off beforehand.

You’ve had the drives for 6 months already. Give it another ~1.5 weeks for full testing before deploying them.

2 Likes

Your questions make you sound very reluctant to run badblocks. Why is this?

It’s not that badblocks is a perfect tool–it isn’t really even designed to do what we use it for. But your apparent reluctance is curious, to say the least. Run the test, all drives at once, turn up the fan speed if you need to.

4 Likes

and remember to do it under tmux

4 Likes

One of many problems with Very Large Capacity Drives is also their best feature: they store a lot of data. To test it completely, badblocks (all 4 passes) should run, but those high-capacity drives will take a very long time. If you do not have a UPS, you might consider getting a good-quality UPS with reasonable runtime. This will help with temporary outages (less than 10 minutes), because if a power issue causes the computer to reset while testing these drives, the test needs to start all over again.

If your data is not very important to you, you could run a single pattern and then cross your fingers that all is actually good. This kind of thing has been suggested before, but I personally prefer the full 4 test patterns, and there is a reason for using them. If it could be done with 2 test patterns, we would all do it.
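For reference, the full destructive run is exactly those four patterns: with no `-t` options, `badblocks -w` writes and verifies 0xaa, 0x55, 0xff, then 0x00 across the whole drive. Spelled out explicitly (dry run via `echo`; `/dev/sdX` is a placeholder):

```shell
# Print the four passes that a plain `badblocks -wsv` performs by default.
for pat in 0xaa 0x55 0xff 0x00; do
  echo "pass: badblocks -wsv -t $pat /dev/sdX"
done
```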

We are not trying to force your hand, we are just trying to give you the best advice we can. Ultimately it is your decision. That is like someone choosing between ECC RAM or Non-ECC RAM. We will all tell you ECC is better and there are reasons, but it is up to you.

And I will reiterate what @NugentS said: remember to do it under tmux. If you don’t know how to use tmux, Google is your friend here. I Googled it myself last week, because I’ve only used tmux twice in several years and needed a refresher on the commands and what they did.

3 Likes

Thank you all for your tips, I will hook up the UPS I got for it, and then run the full badblocks test, using tmux.

A few more questions:

  1. I’m not sure if the fan curves I set inside BIOS are actually kicking in. Is there a way to monitor the speed of all 4 of my fans from within TrueNAS?

    sensors command wasn’t picking up fan speeds, in shell.

    (And I haven’t installed my NVMe I will use for apps yet… not sure if I could install docker apps on the boot pool? Confused on how to go about this).

  2. Is it ok to run badblocks on all 6 of my drives simultaneously? (24TB WD Ultrastar HC580s)

  3. Commands I will run for badblocks (please let me know if this is not ideal / incorrect)

    1. SSH into TrueNAS
    2. start tmux: tmux new -s burnin
    3. create 6 panes for my 6 HDDs
      1. Ctrl + B, then %
      2. Ctrl + B, then "
      3. repeat until 6 panes
      4. move between panes: Ctrl + B + arrow keys
    4. for each pane, run this command: badblocks -b 8192 -c 8192 -wsv /dev/sda | tee /root/sda.log
      1. (replace sda with sdb, sdc, etc, for each HDD)
    5. detach: Ctrl + B, then D
    6. come back later: tmux attach -t burnin
    7. I will be periodically monitoring temps via TrueNAS GUI

Short answer is “you can’t install anything else on the boot pool”, long answer is “you shouldn’t install anything else on the boot pool, but if you understand the risks & know your cli, then it is possible”.

Set up apps once you’ve got your NVMe in place.

1 Like

Yes, if you use tmux as suggested earlier:

“Possibly even 8k blocks: -b 8192
A side effect is that you’ll be testing your cooling through one week of thermal stress…”

2 Likes

…or just use the script I linked up-thread, which puts it under tmux automatically.

2 Likes

If your board does not have IPMI, and I think you do not have a server board, then I don’t think you will be able to monitor the system fans, but I don’t know everything. As for the BIOS fan curve, you typically pick a sensor to monitor and the fans respond based on that. On my ASRock Rack B650D I can pick each fan individually in the BIOS and tie it to different sensors. I can also set it to full speed if I desire. For the hard drive burn-in, I would set your case fans to full speed. The drives will heat up, especially if they are close together; monitor the temps of the ones in the center. After 2-3 hours, they probably will not get any hotter.

1 Like

Update: I had to increase the block size to 8192 to get it to work. Will that make the test less accurate?

Ran this command to get it to work:

badblocks -b 8192 -c 8192 -wsv /dev/sdX | tee /root/sdX.log

When using badblocks -b 4096 -c 8192 -wsv /dev/sda | tee /root/sda.log I was getting the error: Value too large for defined data type … must be 32-bit value
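That error matches badblocks keeping its block count in a 32-bit value, and the larger block size shouldn’t make the test less accurate: every byte is still written and read back, the reported bad-block addresses are just in 8 KiB units instead of 4 KiB. Rough arithmetic, assuming the nominal 24 TB = 24,000,000,000,000 bytes (vendor decimal):

```shell
# Block counts badblocks must track for a nominal 24 TB drive:
bytes=24000000000000
echo "4 KiB blocks: $(( bytes / 4096 ))"   # ~5.86 billion, over the 32-bit limit
echo "8 KiB blocks: $(( bytes / 8192 ))"   # ~2.93 billion, fits
echo "32-bit limit: 4294967295"
```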