Hello, just reaching out to see if anybody else is seeing the same results as I am with the Radian RMS-200. To summarize the testing: if the RMS-200 dies or is removed, the pool goes hard down instead of showing a degraded state. I have tested this with 2 different RMS-200 units. It is recoverable, though.
TrueNAS - SLOG Failure Testing
Pre-Setup:
This is a test environment; no data has been written to these drives. The purpose of this test is to check whether the pool survives if the SLOG device fails. When I refer to the SLOG device, I am talking about the Radian RMS-200.
Platform: Generic
Version: TrueNAS-13.0-U6.1
CPU: Intel(R) Xeon(R) E-2146G CPU @ 3.50GHz
Slog Device: Radian RMS-200 rev04
Drive Layout: | 12 x Mirrored VDEV | | |
---|---|---|---|
4TB - Z1Z8AAPK | 4TB - Z1Z8AH2N | 4TB - Z1Z5Z66N | 4TB - Z1Z907ZS |
4TB - Z1ZAP2F5 | 4TB - Z1ZARRDN | 4TB - Z1ZARQY2 | 4TB - Z1ZARRWS |
4TB - Z1ZAT5N8 | 4TB - Z1ZARFSA | 4TB - Z1ZASLFE | 4TB - Z1ZAVJD9 |
4TB - Z1ZASMKQ | 4TB - Z1ZART09 | 4TB - Z1ZAVJAH | 4TB - Z1ZARQ6A |
4TB - Z1ZARSJM | 4TB - Z1ZART5R | 4TB - Z1ZARRKL | 4TB - Z1ZASM9Y |
4TB - Z1ZARQKG | 4TB - Z1ZARR9G | 4TB - Z1ZARS4R | 4TB - Z1ZAL92Y |
Test 1: Removal of SLOG device while system is powered off.
Description:
In this test I will be removing the SLOG device completely from the system. The system is powered down but still has AC power to the board. No other system components will be changed during this test. The only variable is the SLOG device. After removing the SLOG device, I will power on the system.
Results:
Disks:
All disks are listed except the SLOG device.
Pool Status:
The pool shows as offline, not degraded.
ZPool Status in shell:
Only the boot pool shows up.
Conclusion:
Removing the SLOG device while the system is powered off takes the ZFS pool offline. This is a "hard down" state: the pool is inaccessible through the normal GUI methods until corrective action is taken. Recovery requires shell access to force-import the pool, followed by adjustments in the TrueNAS interface, after which the pool can be restored to a healthy state. The pool itself survives the loss of the SLOG, but recovery needs administrative intervention.
Fix:
- Importing the Pool:
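- Use the command
zpool import -m -f <pool_name>
to forcefully import the pool.
-m allows the import to proceed even though a log device is missing. Any recent writes that existed only on the discarded log device are lost.
-f forces the import, useful if the pool was not properly exported or the system thinks it's still in use.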
- Example:
zpool import -m -f mypool
- Checking Pool Status:
- Run
zpool status
to check the health and status of the pool. It should now show the pool in a degraded state. Follow the next steps to replace or remove the SLOG device.
The GUI should reflect this as well.
- Removing or Replacing the SLOG:
- Access the TrueNAS web interface.
- Go to Storage > Pools > Status.
- Find your pool and select the SLOG device you wish to replace or remove.
- To remove:
- Select the SLOG device and choose the option to remove it.
- To replace:
- Select the SLOG device and choose the option to replace it.
- Follow the prompts to add the new SLOG device.
Post-Operations Checks:
- Verify Pool Status: After replacing the SLOG, run
zpool status
again to ensure the pool is healthy.
- Monitor Performance: Check if the performance meets your expectations.
- Data Integrity Check: Consider running a scrub to verify data integrity:
zpool scrub <pool_name>
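For anyone who wants to script the shell part of this fix, here is a rough sketch of the import-then-check sequence. The function name recover_missing_slog and the ZPOOL override variable are my own placeholders, not anything TrueNAS ships; ZPOOL is only there so the commands can be previewed without touching a pool.

```shell
#!/bin/sh
# Sketch only: force-import a pool whose SLOG is missing, then show its status.
# ZPOOL is a hypothetical override; set ZPOOL="echo zpool" for a dry run.
ZPOOL="${ZPOOL:-zpool}"

recover_missing_slog() {
    pool="$1"
    if [ -z "$pool" ]; then
        echo "usage: recover_missing_slog <pool_name>" >&2
        return 2
    fi
    # -m: allow the import even though a log device is missing
    # -f: force the import if the pool looks like it is still in use
    $ZPOOL import -m -f "$pool" || return 1
    # The pool is expected to show up as degraded at this point.
    $ZPOOL status "$pool"
}
```

Running it with ZPOOL="echo zpool" just prints the two commands, which is a cheap way to double-check the pool name before doing the real import.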
Test 2: Removal of SLOG device while system is powered on.
Description:
In this test I will be removing the SLOG device completely from the system while the system is powered on. No other system components will be changed during this test. The only variable is the SLOG device.
Results:
Disks:
All disks are listed except the SLOG device.
Pool Status:
The pool shows Degraded.
ZPool Status in shell:
The test pool shows degraded, with only the SLOG device missing.
Conclusion:
Removing the SLOG device from a running TrueNAS system changed the ZFS pool status immediately. Unlike the first test, where the system was powered off, the pool was marked as "Degraded". This state reflects the absence of the SLOG device but also indicates that the pool is still functional, albeit without the benefits provided by the SLOG.
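If you want to tell the two outcomes apart from a script rather than by eye, a minimal sketch is to read the "state:" line of zpool status. The helper names pool_state and describe_state are made up for illustration, and the wording of the messages is mine.

```shell
#!/bin/sh
# Sketch only: classify a pool from the text of `zpool status <pool_name>`.

# Reads zpool status output on stdin and prints the first "state:" value.
pool_state() {
    awk '$1 == "state:" { print $2; exit }'
}

# Turns a state value into a short message (hypothetical wording).
describe_state() {
    case "$1" in
        ONLINE)   echo "healthy" ;;
        DEGRADED) echo "degraded but still functional" ;;
        "")       echo "no state found, pool may not be imported" ;;
        *)        echo "needs attention: $1" ;;
    esac
}
```

Typical use would be something like describe_state "$(zpool status <pool_name> | pool_state)". In the Test 1 situation, where only the boot pool is imported, the state comes back empty.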
Fix:
- Access the TrueNAS web interface.
- Go to Storage > Pools > Status.
- Find your pool and select the SLOG device you wish to replace or remove.
- To remove:
- Select the SLOG device and choose the option to remove it.
- To replace:
- Select the SLOG device and choose the option to replace it.
- Follow the prompts to add the new SLOG device.
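I did the remove/replace through the GUI, but for reference the same operations exist on the command line as zpool remove and zpool replace. The sketch below is untested on this system and is only meant to show the shape of the commands. The function names and the ZPOOL override are placeholders, and the device arguments are whatever zpool status lists for the log device (for a missing device that may be a numeric ID rather than a device name).

```shell
#!/bin/sh
# Sketch only: CLI counterparts of the GUI remove/replace steps.
# ZPOOL is a hypothetical override; set ZPOOL="echo zpool" for a dry run.
ZPOOL="${ZPOOL:-zpool}"

# Remove the log device from the pool.
remove_slog() {
    # $1 = pool name, $2 = log device as shown by zpool status
    $ZPOOL remove "$1" "$2"
}

# Replace the old log device with a new one.
replace_slog() {
    # $1 = pool name, $2 = old log device, $3 = new device
    $ZPOOL replace "$1" "$2" "$3"
}
```

On TrueNAS I would still use the GUI where possible, since that is the path this test actually exercised.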
Post-Operations Checks:
- Verify Pool Status: After replacing the SLOG, run
zpool status
again to ensure the pool is healthy.
- Monitor Performance: Check if the performance meets your expectations.
- Data Integrity Check: Consider running a scrub to verify data integrity:
zpool scrub <pool_name>
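The post-operation checks can be sketched the same way: start the scrub, then pull the "scan:" line out of zpool status to see how it is going. The names start_scrub and scan_line and the ZPOOL override are placeholders of mine.

```shell
#!/bin/sh
# Sketch only: start a scrub and extract the scan summary from zpool status.
# ZPOOL is a hypothetical override; set ZPOOL="echo zpool" for a dry run.
ZPOOL="${ZPOOL:-zpool}"

# Kick off a scrub on the given pool.
start_scrub() {
    $ZPOOL scrub "$1"
}

# Reads zpool status output on stdin and prints the text after "scan:".
scan_line() {
    awk '$1 == "scan:" { sub(/^ *scan: */, ""); print; exit }'
}
```

After start_scrub <pool_name>, something like zpool status <pool_name> | scan_line prints just the first line of the scrub summary, which is handy for polling from a script.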
Test 3: Removal of SLOG device while system is powered off, with DIP switch 8 in the up position (away from the motherboard).
Description:
In this test I will be removing the SLOG device completely from the system, with DIP switch 8 in the up position (away from the motherboard). The system is powered down but still has AC power to the board. No other system components will be changed during this test. The only variable is the SLOG device. After removing the SLOG device, I will power on the system.
Results:
Disks:
All disks are listed except the SLOG device.
Pool Status:
The pool shows as offline, not degraded.
ZPool Status in shell:
Only the boot pool shows up.
Conclusion:
Removing the SLOG device while the system is powered off, with DIP switch 8 in the up position (away from the motherboard), takes the ZFS pool offline, the same as in Test 1. This is a "hard down" state: the pool is inaccessible through the normal GUI methods until corrective action is taken. Recovery requires shell access to force-import the pool, followed by adjustments in the TrueNAS interface, after which the pool can be restored to a healthy state. The pool itself survives the loss of the SLOG, but recovery needs administrative intervention.
Fix:
See the Fix for Test 1.