FANGTOOTH observations / issues

Demonstrates what, and how? Genuinely curious, not trying to be provocative. From my POV I can see parallels with XFS, which of course started as proprietary to SGI and Irix, was then ported cross-platform, then became part of the Linux core and remains to this day an excellent fs under active development that complements ZFS very well.

Having a filesystem/storage manager that can cross OS boundaries is incredibly powerful, not just for development but also for the user base, as this community demonstrates. We can argue all day long about the pros and cons of TrueNAS moving from FreeBSD to Linux, but the fact is iX had that option because OpenZFS was essentially OS agnostic. Locking it down to a single OS just because that has currently been decided to be the ‘best OS’ would, I think, be a massive step backwards. I love the fact that I can export my zpool at any time and import it into a variety of different OS flavours if I so choose.
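For anyone who hasn’t tried it, that round trip really is just a couple of commands - a minimal sketch, with ‘tank’ standing in for whatever your pool is actually called:

# On the old box: cleanly detach the pool
zpool export tank

# Move the disks, then on the new box (whatever OS flavour runs OpenZFS):
zpool import          # scans for and lists pools available for import
zpool import tank     # imports the pool by name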

It’s a beautiful idea, I agree, and I won’t argue against it at all, but such portability/flexibility comes with tradeoffs (complexity, the risk of levelling down to the lowest common denominator and hurting features and performance, etc.), while at the same time it may not even be practically feasible. AFAIK the Windows and macOS versions are quirky and lag behind (ZFS on OSX and Windows | ServeTheHome Forums), so they are not clearly realistic alternatives even if nice in principle. The FreeBSD version, meanwhile, is now based off the Linux ”master”: ZFS on Linux, subsequently OpenZFS. As far as I have understood it anyway, the barriers to tighter integration are political and legal rather than technical.

Yes, I agree, and you make some valid points. It does appear that ZFS on Windows and macOS isn’t going anywhere, which leaves us with fewer options. It’s probably inevitable where it ends up in the long run.

When they introduce a setting called ”com.apple.mimic” and you need to set it to ”ntfs” for the macOS port to work, then yeah, you know there’s trouble ahead… But time will tell, as you say.


Hello everyone :slight_smile:

I’m new to the forum, although I’ve been using FreeNAS for about 15 years.

I recently decided to upgrade my home NAS system, which was running on Intel Atom D510 + 4GB RAM and a very old version of FreeNAS (0.7.2 Sabanda - FreeBSD 7.3).

Due to the relatively slow processor and small amount of RAM, I did not upgrade to newer versions, and since the system did not have access to the Internet and only worked on my home network, issues related to security patches were not too important to me.

I have now built a new NAS and installed TrueNAS version 25.04.01 to familiarize myself with it, so I would like to share my thoughts on this topic.

I myself have been dealing with larger IT systems and large computer networks for a long time. That is why at home I try to avoid making my life difficult and look for user-friendly, simple systems - which is why I chose FreeNAS…

Therefore, I agree with some of the suggestions to keep the NAS system minimalistic, but you cannot go to extremes here, because certain data is necessary for proper monitoring of the system. Here I have to agree with duecedriver that well-executed ZFS data analytics would be useful and could help detect problems early (before a disaster happens).
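To illustrate the kind of raw data such analytics could be built on - nothing TrueNAS-specific, just stock OpenZFS commands (the pool names and the amount of output will of course depend on your system):

# Shows only pools with problems (prints "all pools are healthy" otherwise)
zpool status -x

# One line per pool with the fields a dashboard or analytics view would want
zpool list -H -o name,health,capacity,fragmentation

# Recent kernel-level ZFS events (I/O and checksum errors etc.) that could be aggregated
zpool events | tail -n 40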

The same applies to the S.M.A.R.T. information mentioned earlier - unfortunately, compared to the older version of FreeNAS, I feel there has been a regression here.

While testing the new version of TrueNAS, I put some old drives into the NAS, one of which was damaged - I wanted to check whether the system would detect the errors.

Unfortunately, I was in for an unpleasant surprise. The pool reports as healthy (until it hits the bad sectors on the damaged disk), but the S.M.A.R.T. tests of the damaged disk - both short and long - are not handled correctly. After starting a test from the web GUI, it gets stuck at 90% even though the test has actually finished and reported errors (which can be verified from the console).

Unfortunately, for people who do not know how to use the Linux console, this behavior may mean that serious problems with a damaged disk go unnoticed. IMHO, on a NAS system such diagnostics should work reliably and effectively - precisely so that problems are detected early and prevented. The aforementioned ZFS data analysis, well presented in a GUI, could be just as helpful in such cases…
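For anyone who does want to double-check from the shell, these are the standard smartmontools commands involved - /dev/sdb is simply where my damaged disk ended up, so adjust the device name as needed:

# Start a short or extended (long) self-test in the background
sudo smartctl -t long /dev/sdb

# Later, read the self-test log to see whether it completed or failed
sudo smartctl -l selftest /dev/sdb

# Overall health verdict plus the attribute table (reallocated/pending sectors etc.)
sudo smartctl -H -A /dev/sdb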

Below I am attaching what it looks like in the web GUI and what smartctl shows (the test has completed, although it is still shown as running in the web GUI), as well as the disk problems that were not reported in the web GUI.

Of course, I reported the problem on the ixsystems.atlassian website…

However, it is worth considering implementing more thoughtful analytics. It does not have to contain a million pieces of information, but it should help avoid cases like this, especially for less advanced users…

Welcome to TrueNAS
Last login: Wed Jun 11 08:03:54 CEST 2025 on pts/0
truenas_admin@truenas[~]$ sudo smartctl -a /dev/sdb 
[sudo] password for truenas_admin: 
smartctl 7.4 2023-08-01 r5530 [x86_64-linux-6.12.15-production+truenas] (local build)
Copyright (C) 2002-23, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar Green (AF)
Device Model:     WDC WD20EARS-00MVWB0
Serial Number:    WD-WCAZA3229886
LU WWN Device Id: 5 0014ee 25ab9d8ec
Firmware Version: 51.0AB51
User Capacity:    2,000,398,934,016 bytes [2.00 TB]
Sector Size:      512 bytes logical/physical
Device is:        In smartctl database 7.3/5528
ATA Version is:   ATA8-ACS (minor revision not indicated)
SATA Version is:  SATA 2.6, 3.0 Gb/s
Local Time is:    Wed Jun 11 08:55:37 2025 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x84) Offline data collection activity
                                        was suspended by an interrupting command from host.
                                        Auto Offline Data Collection: Enabled.
Self-test execution status:      ( 113) The previous self-test completed having
                                        the read element of the test failed.
Total time to complete Offline 
data collection:                (36180) seconds.
Offline data collection
capabilities:                    (0x7b) SMART execute Offline immediate.
                                        Auto Offline data collection on/off support.
                                        Suspend Offline collection upon new
                                        command.
                                        Offline surface scan supported.
                                        Self-test supported.
                                        Conveyance Self-test supported.
                                        Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                                        power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine 
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        ( 349) minutes.
Conveyance self-test routine
recommended polling time:        (   5) minutes.
SCT capabilities:              (0x3035) SCT Status supported.
                                        SCT Feature Control supported.
                                        SCT Data Table supported.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   197   197   051    Pre-fail  Always       -       2031
  3 Spin_Up_Time            0x0027   237   189   021    Pre-fail  Always       -       3116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       231
  5 Reallocated_Sector_Ct   0x0033   172   172   140    Pre-fail  Always       -       544
  7 Seek_Error_Rate         0x002e   100   253   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   073   073   000    Old_age   Always       -       19872
 10 Spin_Retry_Count        0x0032   100   100   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   100   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       153
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       146
193 Load_Cycle_Count        0x0032   001   001   000    Old_age   Always       -       2263442
194 Temperature_Celsius     0x0022   121   099   000    Old_age   Always       -       29
196 Reallocated_Event_Count 0x0032   001   001   000    Old_age   Always       -       211
197 Current_Pending_Sector  0x0032   200   198   000    Old_age   Always       -       314
198 Offline_Uncorrectable   0x0030   200   198   000    Old_age   Offline      -       7
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   195   182   000    Old_age   Offline      -       1533

SMART Error Log Version: 1
ATA Error Count: 1
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1 occurred at disk power-on lifetime: 19836 hours (826 days + 12 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 51 00 81 f4 2c 00  Error: ABRT

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  b0 d4 00 81 4f c2 00 08      09:23:20.590  SMART EXECUTE OFF-LINE IMMEDIATE
  b0 d0 01 00 4f c2 00 08      09:23:20.584  SMART READ DATA
  ec 00 01 00 00 00 00 08      09:23:20.557  IDENTIFY DEVICE
  ec 00 01 00 00 00 00 08      09:23:20.546  IDENTIFY DEVICE
  b0 d5 01 09 4f c2 00 08      09:22:58.306  SMART READ LOG

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed: read failure       10%     19871         2921052808
# 2  Short offline       Completed: read failure       10%     19865         2919432424
# 3  Short offline       Completed: read failure       10%     19864         2919432448
# 4  Short offline       Aborted by host               90%     19864         -
# 5  Short offline       Completed: read failure       70%     19864         2884602426
# 6  Extended offline    Completed: read failure       90%     19837         2884602424
# 7  Extended offline    Completed: read failure       90%     19837         2882554270
# 8  Extended offline    Completed: read failure       90%     19837         2882554264
# 9  Extended offline    Completed: read failure       90%     19836         2882554266
#10  Short captive       Completed: read failure       80%     19836         2882554274
#11  Extended offline    Completed: read failure       90%     19836         2884602425
#12  Extended offline    Completed: read failure       90%     19836         2882554273
#13  Extended offline    Completed: read failure       90%     19833         2884602427
#14  Extended offline    Aborted by host               90%     19833         -
#15  Short offline       Completed: read failure       10%     19832         2919432408

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

The above only provides legacy SMART information - try 'smartctl -x' for more

truenas_admin@truenas[~]$ 


Some additional information to add to the post above…

I’ve been testing TrueNAS with the one damaged disk for a few hours now. I’ve installed a few Docker containers and a few VMs, and FreePBX is being installed. So far there is no information about the damaged disk, although of course the Scrutiny add-on beautifully shows that there is a problem - the question is whether such an add-on should be necessary on a NAS just to find out that ZFS will soon collapse…

…it’s starting to scare me a little :wink:
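
For what it’s worth, even without an add-on, a tiny cron-able script can surface this kind of trouble. A rough sketch only (run as root, /dev/sdb hard-coded, and the exit-status bits taken from the smartctl man page - bit 6 means entries in the device error log, bit 7 means failed self-tests):

#!/bin/bash
# Warn if any pool is unhealthy ("zpool status -x" prints "all pools are healthy" otherwise)
zpool status -x | grep -qv "all pools are healthy" && echo "ZFS: pool problem detected"

# With -q silent, smartctl prints nothing and reports problems only via its exit-status bit mask
smartctl -q silent -H -l error -l selftest /dev/sdb
rc=$?
(( rc & 0xC0 )) && echo "SMART: /dev/sdb has logged errors or failed self-tests"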