NASTru
February 24, 2025, 5:23pm
1
Hello,
I wanted to restore the pool from backup, but I can no longer import the pool on the backup drive. I am using TrueNAS CORE 13.0-U6.7:
# zpool import
pool: usb_bkp
id: 16136177666929402786
state: UNAVAIL
status: One or more devices contains corrupted data.
action: The pool cannot be imported due to damaged devices or data.
see: https://openzfs.github.io/openzfs-docs/msg/ZFS-8000-5E
config:
usb_bkp UNAVAIL insufficient replicas
ada0 UNAVAIL invalid label
Dumping the label:
# zdb -ll /dev/ada0
failed to unpack label 0
failed to unpack label 1
------------------------------------
LABEL 2 (Bad label cksum)
------------------------------------
version: 5000
name: 'usb_bkp'
state: 1
txg: 2796501
pool_guid: 16136177666929402786
errata: 0
hostid: 677086199
hostname: 'jt-nas.local'
top_guid: 13417891300543037129
guid: 13417891300543037129
vdev_children: 1
vdev_tree:
type: 'disk'
id: 0
guid: 13417891300543037129
path: '/dev/gptid/64e94f32-52ed-11ed-a081-94de80a78ddd'
metaslab_array: 128
metaslab_shift: 34
ashift: 12
asize: 19998435966976
is_log: 0
DTL: 84970
create_txg: 4
features_for_read:
com.delphix:hole_birth
com.delphix:embedded_data
labels = 2
ZFS Label NVList Config Stats:
1068 bytes used, 113580 bytes free (using 0.9%)
integers: 18 660 bytes (61.80%)
strings: 4 192 bytes (17.98%)
booleans: 2 92 bytes ( 8.61%)
nvlists: 3 124 bytes (11.61%)
failed to unpack label 3
Please help…
Your drive should be partitioned.
What does this reveal?
zdb -ll /dev/ada0p2
NASTru
February 26, 2025, 9:41am
3
The drive holds about 14 TB of data, which I copied with a replication task. The drive was originally connected via USB, and the backup took a week to complete.
Here is output:
# zdb -ll /dev/ada0p2
cannot open '/dev/ada0p2': No such file or directory
# zdb -ll /dev/ada0p0
cannot open '/dev/ada0p0': No such file or directory
# zdb -ll /dev/ada0p1
cannot open '/dev/ada0p1': No such file or directory
You might want to run a short or long SMART selftest on the drive. If it’s failing, and that’s the reason for the bad label, then it could explain what you’re seeing.
How long have you been using the USB drive?
If the drive was set up within TrueNAS, I would expect to see partitions. If it’s using the whole disk, that seems unusual.
Can you show the result of gpart list ada0?
NASTru
February 26, 2025, 10:00pm
6
Yeah, the drive was set up within TrueNAS.
# gpart list ada0
Geom name: ada0
modified: false
state: OK
fwheads: 16
fwsectors: 63
last: 4294967294
first: 63
entries: 4
scheme: MBR
Consumers:
1. Name: ada0
Mediasize: 20000588955648 (18T)
Sectorsize: 512
Stripesize: 4096
Stripeoffset: 0
Mode: r0w0e0
NASTru
February 26, 2025, 10:16pm
7
The drive was in a USB enclosure for about two months. It was not actively used, only for a one-time backup.
I have now connected the drive to a SATA port. Here is the SMART output:
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 083 064 044 Pre-fail Always - 205488793
3 Spin_Up_Time 0x0003 090 089 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 106
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 080 060 045 Pre-fail Always - 109323439
9 Power_On_Hours 0x0032 089 089 000 Old_age Always - 9931
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 44
18 Unknown_Attribute 0x000b 100 100 050 Pre-fail Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 099 000 Old_age Always - 8590065666
190 Airflow_Temperature_Cel 0x0022 069 048 000 Old_age Always - 31 (Min/Max 24/38)
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 21
193 Load_Cycle_Count 0x0032 070 070 000 Old_age Always - 61889
194 Temperature_Celsius 0x0022 031 051 000 Old_age Always - 31 (0 19 0 0 0)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
200 Multi_Zone_Error_Rate 0x0023 100 100 001 Pre-fail Always - 0
240 Head_Flying_Hours 0x0000 100 100 000 Old_age Offline - 3335 (80 242 0)
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 31094970992
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 270653652407
Odd, I thought that even the newest CORE was still using slices/partitions. For it to use the whole disk is unusual in my mind.
If we can back up your labels by dd-ing them to a separate file first, we might be able to zhack label repair this one - but without a backup of your backup I’m a bit hesitant to suggest that.
dd if=/dev/ada0 bs=1M count=32 of=first32M.bin
dd if=/dev/ada0 bs=1M count=32 skip=20000555401216 iflag=skip_bytes of=last32M.bin
This should copy the first and last 32M of your disk, respectively, to those files in your current working directory. Then you might be able to take a shot at letting zhack rebuild a correct checksum on that disk.
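For context on why the two ends of the device are the parts worth saving: ZFS keeps four copies of the vdev label, each 256 KiB, two at the front of the vdev and two at the back. A minimal sketch of where they would sit, assuming ZFS was given the whole 20000588955648-byte device reported for ada0 (if the pool actually lives inside a partition, the same layout applies relative to that partition, not to the disk). The variable names are made up for illustration:

```shell
#!/bin/sh
# Assumption: the vdev is the whole device; size is the Mediasize of ada0.
dev_bytes=20000588955648
label_bytes=$((256 * 1024))          # each ZFS vdev label is 256 KiB

# The usable size is aligned down to a multiple of the label size.
aligned=$((dev_bytes / label_bytes * label_bytes))

label0=0
label1=$label_bytes
label2=$((aligned - 2 * label_bytes))
label3=$((aligned - label_bytes))

echo "label 0 at byte $label0"
echo "label 1 at byte $label1"
echo "label 2 at byte $label2"
echo "label 3 at byte $label3"
```

Under that assumption both trailing labels fall within the last 512 KiB, so a 32 MiB dump of each end covers all four copies with room to spare.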
NASTru
February 27, 2025, 4:29pm
10
Could not back up the last 32M; I got this error:
# dd if=/dev/ada0 bs=1M count=32 seek=20000555401216 of=last32M.bin
dd: seek offsets cannot be larger than 9223372036854775807
Is this the right seek point?
20000588955648 - 32*2048 = 20000588890112 blocks
But even this gives the same error:
# dd if=/dev/ada0 bs=1M count=32 seek=20000588890112 of=last32M.bin
dd: seek offsets cannot be larger than 9223372036854775807
With bs=1M, dd interprets each block as 1 MiB in size. This affects the seek value as well.
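The "seek offsets cannot be larger than 9223372036854775807" message can be checked on paper without touching the disk. This is a rough sketch, assuming dd multiplies the block count by bs and rejects anything whose byte offset would pass the limit printed in the message. The variable names are illustrative only:

```shell
#!/bin/sh
limit=9223372036854775807      # limit quoted in the dd error message
bs=$((1024 * 1024))            # bs=1M
requested=20000555401216       # value passed to dd, intended as bytes

# Largest block count whose byte offset still fits under the limit.
max_blocks=$((limit / bs))

if [ "$requested" -gt "$max_blocks" ]; then
    verdict="too large"
else
    verdict="ok"
fi
echo "max blocks at bs=1M: $max_blocks, requested: $requested ($verdict)"
```

Read as a count of 1 MiB blocks, the requested value is more than twice the largest count that fits, which would explain the refusal.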
I thought seek would just go by bytes. Apparently it’s something other than that.
Okay, I got it - we need skip, not seek.
dd if=/dev/ada0 bs=1M count=32 skip=20000588890112 iflag=skip_bytes of=last32M.bin
This should make it behave as intended.
I approve of the above command.
I totally did not edit my post after looking like a fool because I am infallible.
I totally never wrote this.
I believe skip is also dictated by the block size (bs).
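If skip does count bs-sized blocks, one way forward is to convert the byte offset to blocks up front. A sketch, assuming the Mediasize from gpart list ada0 (20000588955648 bytes) is the size to work from. It only prints the resulting command rather than running it:

```shell
#!/bin/sh
disk_bytes=20000588955648      # Mediasize from `gpart list ada0`
bs=$((1024 * 1024))            # bs=1M
tail_blocks=32                 # we want the last 32 MiB

total_blocks=$((disk_bytes / bs))
leftover=$((disk_bytes % bs))  # bytes past the last whole block, if any
skip_blocks=$((total_blocks - tail_blocks))

echo "total blocks: $total_blocks (leftover bytes: $leftover)"
echo "dd if=/dev/ada0 bs=1M skip=$skip_blocks count=$tail_blocks of=last32M.bin"
```

This disk happens to be an exact multiple of 1 MiB, so skipping 19074016 blocks lands exactly 32 MiB before the end, the same position as the byte offset 20000555401216 suggested earlier.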
Once you get them both dumped, pick into them a bit with
hexdump -C first32M.bin | head -n 40
and look for the telltale ZFS headers and labels:
root@truenas[/home/truenas_admin]# hexdump -C first32M.bin | head -n 40
00000000 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 30 |0000000000000000|
*
00002000 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00003fd0 00 00 00 00 00 00 00 00 11 7a 0c b1 7a da 10 02 |.........z..z...|
00003fe0 3f 2a 6e 7f 80 8f f4 97 fc ce aa 58 16 9f 90 af |?*n........X....|
00003ff0 8b b4 6d ff 57 ea d1 cb ab 5f 46 0d db 92 c6 6e |..m.W...._F....n|
00004000 01 01 00 00 00 00 00 00 00 00 00 01 00 00 00 24 |...............$|
00004010 00 00 00 20 00 00 00 07 76 65 72 73 69 6f 6e 00 |... ....version.|
00004020 00 00 00 08 00 00 00 01 00 00 00 00 00 00 13 88 |................|
00004030 00 00 00 20 00 00 00 20 00 00 00 04 6e 61 6d 65 |... ... ....name|
00004040 00 00 00 09 00 00 00 01 00 00 00 04 68 75 72 72 |............hurr|
00004050 00 00 00 24 00 00 00 20 00 00 00 05 73 74 61 74 |...$... ....stat|
00004060 65 00 00 00 00 00 00 08 00 00 00 01 00 00 00 00 |e...............|
00004070 00 00 00 00 00 00 00 20 00 00 00 20 00 00 00 03 |....... ... ....|
00004080 74 78 67 00 00 00 00 08 00 00 00 01 00 00 00 00 |txg.............|
00004090 00 08 8f 43 00 00 00 28 00 00 00 28 00 00 00 09 |...C...(...(....|
000040a0 70 6f 6f 6c 5f 67 75 69 64 00 00 00 00 00 00 08 |pool_guid.......|
Your error is about a bad label checksum, which is what zhack label repair is designed to rebuild, but I obviously want to ensure we back up what’s there first so we can revert if need be.
NASTru
February 27, 2025, 5:05pm
16
# dd if=/dev/ada0 bs=1M count=32 skip=20000588890112 iflag=skip_bytes of=last32M.bin
dd: unknown iflag skip_bytes
Looking at the man page, the only options for iflag are fullblock and direct.
Might be a FreeBSD vs Linux thing?
NASTru
February 27, 2025, 5:08pm
18
Here is the output of hexdump:
# hexdump -C first32M.bin | head -n 40
00000000 fc 31 c0 8e c0 8e d8 8e d0 bc 00 0e be 1a 7c bf |.1............|.|
00000010 1a 06 b9 e6 01 f3 a4 e9 00 8a be 2d 06 eb 07 bb |...........-....|
00000020 07 00 b4 0e cd 10 ac 84 c0 75 f4 eb fe 54 68 69 |.........u...Thi|
00000030 73 20 69 73 20 61 20 46 72 65 65 4e 41 53 20 64 |s is a FreeNAS d|
00000040 61 74 61 20 64 69 73 6b 20 61 6e 64 20 63 61 6e |ata disk and can|
00000050 20 6e 6f 74 20 62 6f 6f 74 20 73 79 73 74 65 6d | not boot system|
00000060 2e 20 20 53 79 73 74 65 6d 20 68 61 6c 74 65 64 |. System halted|
00000070 2e 00 9d 6b bd 83 41 7f dc 11 be 0b 00 15 60 b8 |...k..A.......`.|
00000080 4f 0f 90 90 90 90 90 90 90 90 90 90 90 90 90 90 |O...............|
00000090 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 |................|
*
000001b0 90 90 90 90 90 90 90 90 00 00 00 00 00 00 00 00 |................|
000001c0 02 00 ee ff ff ff 01 00 00 00 ff ff ff ff 00 00 |................|
000001d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
00000200 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00001000 45 46 49 20 50 41 52 54 00 00 01 00 5c 00 00 00 |EFI PART....\...|
00001010 14 44 47 ba 00 00 00 00 01 00 00 00 00 00 00 00 |.DG.............|
00001020 fe ff 0b 23 01 00 00 00 06 00 00 00 00 00 00 00 |...#............|
00001030 f9 ff 0b 23 01 00 00 00 d9 7b 92 64 ed 52 ed 11 |...#.....{.d.R..|
00001040 a0 81 94 de 80 a7 8d dd 02 00 00 00 00 00 00 00 |................|
00001050 80 00 00 00 80 00 00 00 bf de 7e 5d 00 00 00 00 |..........~]....|
00001060 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00002000 b5 7c 6e 51 cf 6e d6 11 8f f8 00 02 2d 09 71 2b |.|nQ.n......-.q+|
00002010 ce d4 ad 64 ed 52 ed 11 a0 81 94 de 80 a7 8d dd |...d.R..........|
00002020 80 00 00 00 00 00 00 00 7f 00 08 00 00 00 00 00 |................|
00002030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00002080 ba 7c 6e 51 cf 6e d6 11 8f f8 00 02 2d 09 71 2b |.|nQ.n......-.q+|
00002090 32 4f e9 64 ed 52 ed 11 a0 81 94 de 80 a7 8d dd |2O.d.R..........|
000020a0 80 00 08 00 00 00 00 00 f9 ff 0b 23 01 00 00 00 |...........#....|
000020b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
02000000
That looks like the beginning of the drive.
EDIT: I need coffee. I keep missing key words in people’s posts.
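Reading a little further into that dump: a GPT header sits at LBA 1, and here its "EFI PART" signature appears at byte offset 0x1000 rather than 0x200. A minimal sketch that decodes a few fields by hand, with the byte values copied from the hexdump above. The le64 helper is a made-up name, and the idea that the table was written with 4096-byte sectors is an inference from these numbers, not something confirmed in the thread:

```shell
#!/bin/sh
# le64: combine hex bytes, given least-significant first, into one integer.
le64() {
    val=0
    shift_bits=0
    for b in "$@"; do
        val=$((val | (0x$b << shift_bits)))
        shift_bits=$((shift_bits + 8))
    done
    echo "$val"
}

# The header is at LBA 1, so its byte offset equals the sector size in use
# when the table was written (assumption: 0x1000 is that header).
sector_bytes=$((0x1000))

# Second partition entry (at 0x2080): first LBA at 0x20a0, last LBA at 0x20a8.
first_lba=$(le64 80 00 08 00 00 00 00 00)
last_lba=$(le64 f9 ff 0b 23 01 00 00 00)

start_bytes=$((first_lba * sector_bytes))
size_bytes=$(((last_lba - first_lba + 1) * sector_bytes))

echo "sector size: $sector_bytes"
echo "partition 2 starts at LBA $first_lba (byte $start_bytes)"
echo "partition 2 size: $size_bytes bytes"
```

If that reading is right, the second partition would begin about 2 GiB into the disk, well past the first 32 MiB, and a system addressing the disk in 512-byte sectors would not find the table at LBA 1, which would fit gpart reporting only an MBR scheme. The unique GUID bytes of that entry also read as 64e94f32-52ed-11ed-a081-94de80a78ddd, the same gptid that appears in the path field of the zdb label. Treat this as a lead to verify, not a diagnosis.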
NASTru
February 27, 2025, 5:50pm
20
Without the iflag I got the following error:
# dd if=/dev/ada0 bs=1M count=32 skip=20000588890112 of=last32M.bin
dd: seek offsets cannot be larger than 18446744073709551615