Scrub Task not starting the Scrub

Hello,

On my TrueNAS SCALE Dragonfish-24.04.2.1 system, a Scrub Task is configured as shown in the attached screenshot.

In /var/log/cron.log I can see that “midclt call pool.scrub.run tank_titan_001 35 > /dev/null 2> /dev/null” was called at 00:00:01 on Sunday. My system with the configured Scrub Task has now been running for roughly two months, but the TrueNAS dashboard still shows “Last Scrub: Never”. So why is the Scrub not started? What could I check to find out why the Scrub is not starting?
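
For reference, the configured task itself can be queried through the middleware client; a minimal sketch in Python (pool.scrub.query is assumed to exist as on SCALE, and the printed entry fields like pool_name and threshold are assumptions that may differ by version):

    # List the configured scrub tasks to verify pool name, threshold,
    # schedule and enabled state. Field names are assumptions.
    import json
    import subprocess

    raw = subprocess.check_output(["midclt", "call", "pool.scrub.query"])
    for task in json.loads(raw):
        print(task.get("pool_name"), task.get("threshold"),
              task.get("enabled"), task.get("schedule"))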

The documentation at https://www.truenas.com/docs/scale/scaleuireference/dataprotection/scrubtasksscreensscale/ states for “Threshold days”: “Enter the number of days before a completed scrub is allowed to run again.” Does this mean that the very first Scrub will start even if there was no Scrub before, or does it wait for the threshold days since the previous Scrub, which might not exist?

Thanks a lot in advance,

Thomas

Hi

Take this with a grain of salt.
On Core I have had a problem with the threshold calculation, with the opposite result (scrubs triggered every time despite the threshold). Maybe, somehow related, the absence of previous scrubs in your case is preventing the job from triggering correctly.
I would try launching a manual scrub to see if the dashboard still shows the same result, and then check whether the scrub triggers again after 35 days, according to your job setup.

What is the output of zpool status? Please use the “</>” tags to maintain the format of the data.

  pool: boot-pool
 state: ONLINE
  scan: scrub repaired 0B in 00:00:17 with 0 errors on Fri Oct 18 03:45:18 2024
config:

        NAME        STATE     READ WRITE CKSUM
        boot-pool   ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            sdh3    ONLINE       0     0     0
            sdi3    ONLINE       0     0     0

errors: No known data errors

  pool: tank_titan_001
 state: ONLINE
config:

        NAME                                      STATE     READ WRITE CKSUM
        tank_titan_001                            ONLINE       0     0     0
          raidz3-0                                ONLINE       0     0     0
            9d81d9b3-512d-45b6-bc45-1065ec4e5272  ONLINE       0     0     0
            16fff3e8-cf8d-4006-a154-5a7383df3373  ONLINE       0     0     0
            63e9bd52-bc52-425b-9e48-d2412d29b66a  ONLINE       0     0     0
            cb510ff5-bc32-43d6-848a-81b4f1faf149  ONLINE       0     0     0
            7813ee4a-ed79-40cc-97e8-90f0620a8c22  ONLINE       0     0     0
            6fae3949-5492-45d8-bdb8-15830a26abe3  ONLINE       0     0     0
            91e4b791-593b-48b1-a2de-a7655510c190  ONLINE       0     0     0

errors: No known data errors

I am wondering about the “scrub repaired 0B in 00:00:17 with 0 errors on Fri Oct 18 03:45:18 2024” while the TrueNAS dashboard says “Last Scrub: Never”. It is also strange that this scrub ran on Friday at 03:00 and not on Sunday at 00:00.

Any ideas?

Well, not the result I was expecting. Let’s try something else:

  1. Run the command zpool scrub tank_titan_001
  2. Wait 10 minutes (or a little longer)
  3. Run the command zpool status tank_titan_001 and post those results.

We are looking for a report similar in format to:

 state: ONLINE
status: Some supported and requested features are not enabled on the pool.
        The pool can still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub in progress since Sun Oct 20 11:08:45 2024
        2.18T / 6.04T scanned at 36.6G/s, 199G / 6.04T issued at 3.27G/s
        0B repaired, 3.22% done, 00:30:32 to go
config:

Hopefully the scrub will start.

Here it is.
The Scrub started.

What’s the next step to find out why the “Scrub Task” did not work?

 zpool status tank_titan_001
  pool: tank_titan_001
 state: ONLINE
  scan: scrub in progress since Sun Oct 20 18:01:34 2024
        47.7T / 47.7T scanned, 915G / 47.7T issued at 1.28G/s
        0B repaired, 1.87% done, 10:24:41 to go
config:

        NAME                                      STATE     READ WRITE CKSUM
        tank_titan_001                            ONLINE       0     0     0
          raidz3-0                                ONLINE       0     0     0
            9d81d9b3-512d-45b6-bc45-1065ec4e5272  ONLINE       0     0     0
            16fff3e8-cf8d-4006-a154-5a7383df3373  ONLINE       0     0     0
            63e9bd52-bc52-425b-9e48-d2412d29b66a  ONLINE       0     0     0
            cb510ff5-bc32-43d6-848a-81b4f1faf149  ONLINE       0     0     0
            7813ee4a-ed79-40cc-97e8-90f0620a8c22  ONLINE       0     0     0
            6fae3949-5492-45d8-bdb8-15830a26abe3  ONLINE       0     0     0
            91e4b791-593b-48b1-a2de-a7655510c190  ONLINE       0     0     0

errors: No known data errors

Addition:
The Scrub finished, and the TrueNAS dashboard now shows “Last Scrub: 2024-10-21 07:56:53”.

zpool status gives:

zpool status tank_titan_001
  pool: tank_titan_001
 state: ONLINE
  scan: scrub repaired 0B in 13:55:19 with 0 errors on Mon Oct 21 07:56:53 2024
config:

        NAME                                      STATE     READ WRITE CKSUM
        tank_titan_001                            ONLINE       0     0     0
          raidz3-0                                ONLINE       0     0     0
            9d81d9b3-512d-45b6-bc45-1065ec4e5272  ONLINE       0     0     0
            16fff3e8-cf8d-4006-a154-5a7383df3373  ONLINE       0     0     0
            63e9bd52-bc52-425b-9e48-d2412d29b66a  ONLINE       0     0     0
            cb510ff5-bc32-43d6-848a-81b4f1faf149  ONLINE       0     0     0
            7813ee4a-ed79-40cc-97e8-90f0620a8c22  ONLINE       0     0     0
            6fae3949-5492-45d8-bdb8-15830a26abe3  ONLINE       0     0     0
            91e4b791-593b-48b1-a2de-a7655510c190  ONLINE       0     0     0

errors: No known data errors

Isn’t there something of a conflict between a threshold of 35 days and a weekly scrub schedule?


Why do you see a conflict? The cron job runs once every week and checks whether at least 35 days have passed since the last run of the Scrub. Only if that time has passed is the Scrub started; otherwise the cron job waits another week before checking again.
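
Put differently, the decision is just a date comparison. A minimal sketch of that logic in Python (hypothetical, not the actual middleware code; how a missing last-scrub date is handled is exactly the open question in this thread):

    from datetime import datetime, timedelta

    def should_run_scrub(last_scrub, threshold_days, now):
        # Decide whether the weekly cron invocation should start a scrub.
        if last_scrub is None:
            # Pool has never been scrubbed. Returning True here is the
            # expected behaviour; if the real check instead bails out,
            # that would reproduce what this thread describes.
            return True
        return now - last_scrub >= timedelta(days=threshold_days)

    now = datetime(2024, 10, 20)
    print(should_run_scrub(now - timedelta(days=20), 35, now))  # False: only 20 of 35 days passed
    print(should_run_scrub(None, 35, now))                      # True: first scrub should start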


So, if the scrub triggers correctly from now on (after the 35-day threshold), it really seems that the absence of a previous scrub somehow affects the scheduler.
In that case I would open a bug report.

I have re-configured my Scrub Task to see whether it will trigger correctly over the next days. If it does, I will file the bug report.

Thanks a lot for all your help.

When my newly created auto scrub didn’t run, I ran a manual scrub and set the threshold to 5; the scheduled weekly auto scrub then executed, and after that I experimented with the threshold value.

After manually triggering the initial scrub as proposed by @joeschmuck, the Scrub Task seems to work now.

I have created a bug ticket regarding this issue:
https://ixsystems.atlassian.net/browse/NAS-132007

Hint:
There is another thread with the same issue:
https://forums.truenas.com/t/scrub-tasks-not-working-for-one-pool/21615

There’s a bug in TrueNAS Core 13.3 that issues a scrub on the boot-pool every day, no matter what it’s set to.

This is the thread I was referring to. I’m launching the scrub manually. -.-
It doesn’t seem to happen to everyone.


When you build a new system, there is no need to run a scrub - right?

So for a fresh system it is my understanding that the first scrub will occur after the set threshold of days has been reached.

That’s absolutely correct.

This is also the understanding of the other users, but TrueNAS SCALE behaves differently and does nothing until a first scrub has been triggered manually. Even after the threshold has elapsed, the scrub is not started by the Scrub Task.


I also have RC2 on a fresh system here which is just 14 days old, so it has not met the threshold yet.
I have now adjusted the threshold so that it should scrub in 6 hours. I will get back to you - maybe I can confirm the issue.

I cannot confirm the behaviour you have seen - but I am on 24.10 RC2.

I have a fresh 24.10 RC2 system here which has not yet met the threshold of 35 days to start the scrub.

I have 2 pools:
1x SSDs
1x HDDs

I changed the threshold of the scrub job for the SSD pool to “0” days
and the threshold of the scrub job for the HDD pool to “10” days.

Both were then started as planned/scheduled.
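
For anyone who wants to run the same check, the last-scrub date that the threshold is compared against can be read from the scan line of zpool status. A rough sketch in Python (the parsing is illustrative only and assumes the English output format shown above):

    # Days since the last completed scrub, parsed from `zpool status`.
    import re
    import subprocess
    from datetime import datetime

    out = subprocess.check_output(["zpool", "status", "tank_titan_001"], text=True)
    m = re.search(r"with \d+ errors on (\w{3} \w{3}\s+\d+ [\d:]+ \d{4})", out)
    if m:
        # Normalise the padded day-of-month before parsing the date.
        last = datetime.strptime(" ".join(m.group(1).split()), "%a %b %d %H:%M:%S %Y")
        print("days since last scrub:", (datetime.now() - last).days)
    else:
        print("no completed scrub found")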