I realized that some photos in a specific folder have been corrupted.
There are at least 15-20 photos that I can’t open at all, plus a lot more that render with artifacts.
I don’t know why or when this happened, or how it went undetected by scrubs… For now, the most important thing for me is that I should have backups to clean up this mess, at least for those specific photos, which are quite old.
What would be the best strategy for finding the affected files, short of manually checking every folder (24k photos…)? Is there any software or app that can do that job?
Could this have been caused by an app (the photos are uploaded by Nextcloud and indexed by PhotoPrism)? I noticed the problem while looking at the PhotoPrism logs:
2024-11-30 17:41:22 ERRO index: failed to generate thumbnails for 'nextcloud/celeste/files/InstantUpload/WhatsApp Images/IMG-20201022-WA0001.jpg' (VipsJpeg: Corrupt JPEG data: premature end of data segment
VipsJpeg: Unsupported marker type 0x5a
Stack: goroutine 235 [running]:
runtime/debug.Stack()
/usr/local/go/src/runtime/debug/stack.go:24 +0x5e
github.com/davidbyttow/govips/v2/vips.handleVipsError()
Not finding any “ready to use” solution, I literally built my own scripts to achieve this with the least pain possible (and to avoid a mass restore).
This post gave me good hints, and Pillow did the job really well! I’m sharing the experience (and the scripts too, if anyone wants them).
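For anyone wanting to try the same idea, here is a minimal sketch of a Pillow-based scan. It is not the actual script from this post, just one way such a check could look. The function names, the suffix list, and the JSON report layout are assumptions made for illustration. It forces a full decode with `load()`, which raises on truncated files; images that decode fully but show artifacts may not be caught this way.

```python
import json
from pathlib import Path

from PIL import Image  # third-party: pip install Pillow

# Assumption: only these extensions are treated as photos.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def is_decodable(path):
    """Return True if Pillow can fully decode the image at `path`."""
    try:
        with Image.open(path) as img:
            img.load()  # forces a full decode; truncated files raise here
        return True
    except Exception:
        # Broad on purpose: damaged files fail in many different ways.
        return False


def find_corrupt_images(root):
    """Return sorted paths (relative to `root`) of images that fail to decode."""
    root = Path(root)
    return sorted(
        str(p.relative_to(root))
        for p in root.rglob("*")
        if p.is_file()
        and p.suffix.lower() in IMAGE_SUFFIXES
        and not is_decodable(p)
    )


def write_report(root, report_path):
    """Scan `root` and save the list of bad relative paths as JSON."""
    bad = find_corrupt_images(root)
    Path(report_path).write_text(json.dumps(bad, indent=2))
    return bad
```

Storing relative paths makes it easier to look for the same file under a different root later, for example in a backup.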
Apart from some false positives (on files that aren’t real photos, like phone screenshots, etc.), I managed to detect about 150 corrupted photos in bulk, both ones with errors that prevent opening and ones with artifacts. I saved their paths to a JSON file; then, with another script, I searched for the same photo in a specific backup source → copied the good one to a relative path → bulk-transferred all the good copies in a single operation.
This is also good because I can just rescan the folders and see whether this keeps happening or was an isolated case.
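The restore step described above could be sketched roughly like this. Again, this is an illustration under assumptions, not the script actually used: it assumes the JSON file holds a list of paths relative to the photo root, that the backup mirrors the same folder layout, and that `stage_good_copies` and the optional `is_good` callback are hypothetical names.

```python
import json
import shutil
from pathlib import Path


def stage_good_copies(report_path, backup_root, staging_root, is_good=None):
    """Copy backup versions of the reported files into a staging tree.

    `is_good` is an optional callable(path) -> bool used to validate the
    backup copy before staging it (for example a Pillow decode check).
    Returns (staged, missing) lists of relative paths.
    """
    backup_root = Path(backup_root)
    staging_root = Path(staging_root)
    staged, missing = [], []
    for rel in json.loads(Path(report_path).read_text()):
        candidate = backup_root / rel
        if candidate.is_file() and (is_good is None or is_good(candidate)):
            target = staging_root / rel
            # Keep the same relative layout so one bulk copy can restore it.
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(candidate, target)
            staged.append(rel)
        else:
            missing.append(rel)
    return staged, missing
```

Staging the good copies first, instead of overwriting in place, leaves room to look at the `missing` list and the staged files before the final bulk transfer.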
Here’s a thread, in which the culprit was the network card.
Here’s another thread of a similar issue, in which SMB was the culprit.
This can also happen with interrupted or incomplete downloads. You’ll see that the image can be viewed from the top, until it hits a point in the JPEG where there is no more image data.
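A cheap first pass for this kind of truncation, as a sketch: a complete JPEG normally starts with the SOI marker (`FF D8`) and ends with the EOI marker (`FF D9`). The function name below is made up for illustration. Treat a failure as a hint rather than proof, since some files carry extra bytes after EOI, and a file can pass this check and still be damaged in the middle.

```python
import os

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def looks_complete_jpeg(path):
    """Heuristic: True if the file starts with SOI and ends with EOI."""
    if os.path.getsize(path) < 4:
        return False  # too small to hold both markers
    with open(path, "rb") as f:
        head = f.read(2)
        f.seek(-2, os.SEEK_END)  # jump to the last two bytes
        tail = f.read(2)
    return head == SOI and tail == EOI
```

Since it only reads four bytes per file, it is fast enough to run over a large library before doing a full decode on the suspects.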
ZFS is actually working correctly. The image, as it “corruptly” exists with its blocks on the storage drives, was checksummed and then written by ZFS. The “corrupt” image file is the “correct” data, according to ZFS or any data integrity tools that confirm what has been initially written to disk.
ZFS can only protect your data against corruption after it has already been written to storage. There’s nothing it can do beforehand if the data is ruined by a bad NIC, bad RAM, or buggy software, such as SMB.
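To make that concrete, here is a toy illustration. It is not how ZFS works internally (ZFS checksums blocks, not whole files, and uses its own checksum algorithms); SHA-256 over a byte string is only a stand-in, and the data is made up. The point is that a checksum taken at write time faithfully protects whatever bytes arrived, damaged or not.

```python
import hashlib


def checksum(data: bytes) -> str:
    """Stand-in for a storage-level checksum."""
    return hashlib.sha256(data).hexdigest()


# The photo as it existed on the phone (made-up bytes).
original = b"\xff\xd8" + b"image data" * 100 + b"\xff\xd9"

# The transfer was cut short BEFORE the data reached the pool.
received = original[: len(original) // 2]

# The checksum is computed over what was actually written.
stored_checksum = checksum(received)

# A later "scrub" re-reads the stored bytes and compares: all good.
scrub_ok = checksum(received) == stored_checksum

# But the stored bytes never matched the original photo.
matches_original = checksum(received) == checksum(original)

print(scrub_ok, matches_original)
```

The scrub comparison succeeds while the comparison against the original fails, which is why a clean scrub says nothing about damage that happened before the write.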
From what I can see after scanning all the photo media, no recent photos are involved, only ones coming from my GF’s old phone (2019/2020); and those photos were not uploaded by Nextcloud but are just sitting there (giving her the feeling of having everything in one place). So it makes a lot of sense that something went wrong during the initial transfer.
Only one more photo was found in another pool, but from the same source.
The annoying thing now is that, going deeper into the file checks, I discovered that some videos are corrupted too, with the same origin as the photos. I don’t think I can use Python to check those in bulk; I’ll have to find a solution.
What scares me instead is the possibility that a corrupted file could get backed up over a good one.
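One possible approach for the videos, as a sketch: Python can drive `ffmpeg` to decode each file to a null output and collect whatever errors it prints. This assumes `ffmpeg` is installed and on the PATH; the helper names are made up. A full decode of every video is slow, and not every message ffmpeg reports means the file is visibly broken, so the output needs a human look.

```python
import shutil
import subprocess


def ffmpeg_check_command(path):
    """Build an ffmpeg command that decodes `path` and discards the output."""
    # -v error : print only errors
    # -f null - : decode everything, write nothing
    return ["ffmpeg", "-v", "error", "-i", str(path), "-f", "null", "-"]


def video_decode_errors(path):
    """Return ffmpeg's error output for `path` (empty string if none)."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    result = subprocess.run(
        ffmpeg_check_command(path),
        capture_output=True,
        text=True,
        errors="replace",  # do not crash on odd bytes in the messages
    )
    return result.stderr.strip()
```

Looping `video_decode_errors` over the video files and saving the non-empty results to JSON would give the same kind of report as for the photos.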
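One way to reduce that risk in a home-made copy script, as a rough sketch: validate the source before letting it replace an existing backup copy. The `backup_if_good` name and the `is_good` callback are hypothetical; the callback could be a Pillow decode check or any other test. This does not replace versioned backups or snapshots, which keep older copies around regardless.

```python
import shutil
from pathlib import Path


def backup_if_good(src, dst, is_good):
    """Copy `src` to `dst`, but never overwrite an existing backup
    with a source file that fails the `is_good(path) -> bool` check.

    Returns True if the copy happened, False if it was skipped.
    """
    src, dst = Path(src), Path(dst)
    if dst.exists() and not is_good(src):
        # Keep the existing backup; the source looks damaged.
        return False
    # If no backup exists yet, copy anyway: there is nothing to lose.
    dst.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(src, dst)
    return True
```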