More efficient one-way up-sync from TrueNAS to Proton Drive

I’m new to TrueNAS and got great help from this forum with various issues, so I figured I’d share my solution for anyone interested.

Mostly for backup and secure remote availability of my files (without exposing my NAS to the outside world), I wanted my document files synced to Proton Drive, but one way only: from my NAS to Proton. Running rclone against my entire dataset of personal files, which is quite a lot (2 TB), is very time consuming (hours), so I came up with this approach:

  1. keep a local DB of all my files
  2. on a schedule (I run it every morning at 6 am), take a snapshot of the local files
  3. compare the snapshot to the database records by name and modified time
  4. compile lists of new/modified and deleted files
  5. upload only the new/modified files to Proton Drive and remove the deleted files
  6. record the snapshot as the current state of files in the database
  7. keep a log of the process

This entire process completes in seconds, or in minutes if the modified/new list is large, instead of the hours rclone spends comparing local files to the remote.

So, I have the script below saved as a .sh file in my home folder with the required execute permission, and I schedule it via System -> Advanced -> Cron Jobs.
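For reference, saving the script and scheduling it comes down to something like this (the path is a placeholder; point it at wherever you saved the file):

chmod +x /mnt/YOUR_DATAPOOL/home/YOUR_USERNAME/pd_file_sync_script.sh
# equivalent cron schedule for "every morning at 6 am":
0 6 * * * /mnt/YOUR_DATAPOOL/home/YOUR_USERNAME/pd_file_sync_script.sh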

The prerequisite is that you must run rclone config and set up a new remote for your Proton Drive - it walks you through it step by step.
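If you haven’t done that yet, it goes roughly like this (the remote name proton_drive is just what this script uses; pick the same name when you set it up):

rclone config            # answer "n" for a new remote, choose the protondrive backend, follow the prompts
rclone lsd proton_drive: # quick sanity check: lists the top-level folders on your Proton Drive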

I hope it helps someone. The script is below.

#!/bin/bash

USER_HOME="/mnt/home/YOUR_USERNAME"
SOURCE_DIR="/mnt/YOUR_DATAPOOL/MyFiles" #NAS location of your files
DB_DIR="/mnt/YOUR_DATAPOOL/home/YOUR_USERNAME/proton_file_tracker"
DB_FILE="$DB_DIR/file_tracker.db"
LOG_FILE="/mnt/YOUR_DATAPOOL/rclone_pd.log" #LOCATION OF THE LOG FILE

# Ensure the directory exists for storing DB and temp files
mkdir -p "$DB_DIR"

# Create the database if it doesn't exist
if [ ! -f "$DB_FILE" ]; then
    sqlite3 "$DB_FILE" <<SQL
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
fi

# Snapshot the current file info in bulk: one line per file as path|mtime|size
find "$SOURCE_DIR" -type f -printf "%p|%T@|%s\n" > "$DB_DIR/current_files.txt"

# Track current timestamp
CURRENT_TIMESTAMP=$(date +%s)

# Before the SQLite command, prepare variables for the SQL queries
# that trim file paths from absolute to relative
RELATIVE_SQL_CHANGED="SELECT substr(f.path, length('$SOURCE_DIR') + 1) AS path FROM files f LEFT JOIN files_last_run flr ON f.path = flr.path WHERE flr.path IS NULL OR f.mtime > COALESCE(flr.mtime, 0);"
RELATIVE_SQL_DELETED="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM files_last_run flr WHERE flr.path NOT IN (SELECT path FROM files);"
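# e.g. with SOURCE_DIR=/mnt/pool/docs, substr() turns /mnt/pool/docs/a/b.txt into /a/b.txt
# (rclone ignores the leading slash in --files-from entries)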

# Performance tuning and database operations
sqlite3 "$DB_FILE" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;

-- Backup the current files table as files_last_run
DROP TABLE IF EXISTS files_last_run;
ALTER TABLE files RENAME TO files_last_run;

-- Recreate files table and create indexes
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE INDEX IF NOT EXISTS idx_files_path ON files(path);
CREATE INDEX IF NOT EXISTS idx_files_last_run_path ON files_last_run(path);

-- Import new file data
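-- sqlite3 starts in list mode with '|' as the separator, which matches the find output above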
.import $DB_DIR/current_files.txt files

-- Identify changes
.output $DB_DIR/changed_files.txt
$RELATIVE_SQL_CHANGED

-- Identify deletions
.output $DB_DIR/deleted_files.txt
$RELATIVE_SQL_DELETED

-- Update last run time after all changes have been identified
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);

COMMIT;
SQL

# Error handling for rclone operations
# Sync changed files
if [ -s "$DB_DIR/changed_files.txt" ]; then
    if ! rclone sync "$SOURCE_DIR" "proton_drive:Home-NAS/Docs" --files-from "$DB_DIR/changed_files.txt" --update --local-no-check-updated --protondrive-replace-existing-draft=true --log-file="$LOG_FILE" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO; then
        echo "Error syncing files. Details in $LOG_FILE" >> "$LOG_FILE"
        exit 1
    fi
fi

# Handle deletions
if [ -s "$DB_DIR/deleted_files.txt" ]; then
    if ! rclone delete "proton_drive:Home-NAS/Docs" --include-from "$DB_DIR/deleted_files.txt" --protondrive-replace-existing-draft=true --log-file="$LOG_FILE" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO; then
        echo "Error deleting files. Details in $LOG_FILE" >> "$LOG_FILE"
        exit 1
    fi
fi

# Clean up temporary files
rm -f "$DB_DIR/changed_files.txt" "$DB_DIR/deleted_files.txt" "$DB_DIR/current_files.txt"

echo "############################################################" >> "$LOG_FILE"
echo "Sync completed successfully." >> "$LOG_FILE"
echo "############################################################" >> "$LOG_FILE"

@weekends-rule
Thanks a lot for sharing! That helped me!

How did you solve the problem of having to enter the 2FA code?
Or did you simply not secure your Proton account with 2FA?

I do have 2FA, but when you first create the destination with rclone config, it asks you for the 2FA code; once you enter it, rclone keeps working by renewing the token automatically.
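If you ever want to confirm the saved session still works without waiting for the next scheduled run, a quick check like this should do it (assuming your remote is named proton_drive):

rclone about proton_drive:   # shows quota/usage; fails if the token can no longer be renewed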

Oh man, saving this thread. Thank you, kind friend.

Oh, yep, I realized it myself after having it run for three nights in a row now.
Thanks!

Please note that directory tracking was not important to me because I designed this project as another backup option and as a way to make my data securely available for infrequent online access. I did not implement any monitoring for directories.

What I mean is this: you create a directory “My Files” on TrueNAS and this script will sync it to Proton. Then, if you delete “My Files” on your NAS, this script will remove all files from the “My Files” folder on Proton but will not delete the folder itself. Also, if you rename “My Files” on your NAS to “My Files New”, this script will create a new folder “My Files New” on Proton, sync files to it, and remove all files from “My Files” on Proton, but it will not delete the folder itself.

I don’t have any plans yet to add directory monitoring, but if it becomes a nuisance, I will.

Aight! While I still don’t need the folder tracking, I couldn’t stand the fact that this was not “complete”, so I finished it. This version leaves the file sync logic/mechanics unchanged but now has folder tracking.

The folder tracking is not done by watching folders in the file system but by deriving the folder list from what was uploaded to Proton Drive. This script now properly cleans up folders that were renamed or deleted - no more orphaned (empty) folders. As a safeguard against mass deletion, if the previous state of folders is missing (for instance, the first time you run it), the script simply records the list of online folders and only removes online copies once their deletion is confirmed by comparison against an existing previous state.

No folder deletions or any other file/folder operations are ever performed on the local file system.

#!/bin/bash
# v2 - adding folder tracking

USER_HOME="/mnt/home/USERNAME"
SOURCE_DIR="/mnt/Datapool/Docs"
DB_DIR="/mnt/Datapool/home/USERNAME/proton_file_tracker"
DB_FILE="$DB_DIR/file_tracker.db"
LOG_FILE="/mnt/Datapool/home/USERNAME/LOGNAME.log"

mkdir -p "$(dirname "$LOG_FILE")"

log_debug() {
    echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
}

echo "" >> "$LOG_FILE"
printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
log_debug "Sync operation started"
printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"

log_footer() {
    printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
    log_debug "Sync completed successfully"
    log_debug "New/changed files: $NEW_CHANGED_COUNT"
    log_debug "Deleted files:     $DELETED_COUNT"
    log_debug "Deleted folders:   $DELETED_FOLDER_COUNT"
    printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
    echo "" >> "$LOG_FILE"
}

mkdir -p "$DB_DIR"

if [ ! -f "$DB_FILE" ]; then
    sqlite3 "$DB_FILE" <<SQL
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
fi

find "$SOURCE_DIR" -type f -printf "%p|%T@|%s\n" > "$DB_DIR/current_files.txt"

CURRENT_TIMESTAMP=$(date +%s)

RELATIVE_SQL_CHANGED="SELECT substr(f.path, length('$SOURCE_DIR') + 1) AS path FROM files f LEFT JOIN files_last_run flr ON f.path = flr.path WHERE flr.path IS NULL OR f.mtime > COALESCE(flr.mtime, 0);"
RELATIVE_SQL_DELETED="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM files_last_run flr WHERE flr.path NOT IN (SELECT path FROM files);"

sqlite3 "$DB_FILE" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;
DROP TABLE IF EXISTS files_last_run;
ALTER TABLE files RENAME TO files_last_run;
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE INDEX IF NOT EXISTS idx_files_path ON files(path);
CREATE INDEX IF NOT EXISTS idx_files_last_run_path ON files_last_run(path);
.import $DB_DIR/current_files.txt files
.output $DB_DIR/changed_files.txt
$RELATIVE_SQL_CHANGED
.output $DB_DIR/deleted_files.txt
$RELATIVE_SQL_DELETED
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);
COMMIT;
SQL

if [ -s "$DB_DIR/changed_files.txt" ]; then
    if ! rclone sync "$SOURCE_DIR" "proton_drive:Home-NAS/Docs" --files-from "$DB_DIR/changed_files.txt" --update --local-no-check-updated --protondrive-replace-existing-draft=true --log-file="$LOG_FILE" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO; then
        echo "Error syncing files. Details in $LOG_FILE" >> "$LOG_FILE"
        exit 1
    fi
fi

if [ -s "$DB_DIR/deleted_files.txt" ]; then
    if ! rclone delete "proton_drive:Home-NAS/Docs" --include-from "$DB_DIR/deleted_files.txt" --protondrive-replace-existing-draft=true --log-file="$LOG_FILE" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO; then
        echo "Error deleting files. Details in $LOG_FILE" >> "$LOG_FILE"
        exit 1
    fi
fi

FOLDER_DB_FILE="$DB_DIR/folder_tracker.db"
DELETED_FOLDERS_FILE="$DB_DIR/deleted_folders.txt"
CURRENT_FOLDERS_FILE="$DB_DIR/current_folders.txt"

FIRST_RUN=0
if [ ! -f "$FOLDER_DB_FILE" ]; then
    sqlite3 "$FOLDER_DB_FILE" <<SQL
CREATE TABLE folders (path TEXT PRIMARY KEY);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
    FIRST_RUN=1
    log_debug "Folder tracking initializing: creating baseline state, no deletions will occur"
fi

# Derive the folder list from the file snapshot: strip the filename from each path, then dedupe
awk -F'|' '{sub(/[^/]*$/, "", $1); print $1}' "$DB_DIR/current_files.txt" | sort -u > "$CURRENT_FOLDERS_FILE"

RELATIVE_SQL_DELETED_FOLDERS="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM folders_last_run flr WHERE flr.path NOT IN (SELECT path FROM folders);"

sqlite3 "$FOLDER_DB_FILE" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;
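-- the shell substitutions below expand to nothing on the first run; on later runs they rotate
-- the folders table into folders_last_run and emit the deleted-folders diff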
$( [ "$FIRST_RUN" -eq 0 ] && echo "DROP TABLE IF EXISTS folders_last_run; ALTER TABLE folders RENAME TO folders_last_run;" )
CREATE TABLE folders (path TEXT PRIMARY KEY);
.import --csv $CURRENT_FOLDERS_FILE folders
$( [ "$FIRST_RUN" -eq 0 ] && echo ".output $DELETED_FOLDERS_FILE" )
$( [ "$FIRST_RUN" -eq 0 ] && echo "$RELATIVE_SQL_DELETED_FOLDERS" )
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);
COMMIT;
SQL

if [ "$FIRST_RUN" -eq 0 ] && [ -s "$DELETED_FOLDERS_FILE" ]; then
    while IFS= read -r folder; do
        if ! rclone rmdir "proton_drive:Home-NAS/Docs/$folder" --log-file="$LOG_FILE" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO; then
            log_debug "Error deleting folder '$folder'. Details in $LOG_FILE"
        fi
    done < "$DELETED_FOLDERS_FILE"
fi

NEW_CHANGED_COUNT=$(wc -l < "$DB_DIR/changed_files.txt")
DELETED_COUNT=$(wc -l < "$DB_DIR/deleted_files.txt")
DELETED_FOLDER_COUNT=0
if [ "$FIRST_RUN" -eq 0 ]; then
    DELETED_FOLDER_COUNT=$(wc -l < "$DELETED_FOLDERS_FILE" 2>/dev/null || echo 0)
fi

rm -f "$DB_DIR/changed_files.txt" "$DB_DIR/deleted_files.txt" "$DB_DIR/current_files.txt" "$DELETED_FOLDERS_FILE" "$CURRENT_FOLDERS_FILE"

log_footer

I did some log cleanup to filter out a noisy rclone log message that shows up because the overwrite is intentional:

A file or folder with that name already exists (Code=2500, Status=422)

So now rclone logs to a temp log file, which is then passed to the log writer; it filters out the noise and keeps the operation’s log very neat. I also added some comments for readability.

#!/bin/bash
# /mnt/Datapool/home/st0retron/proton_file_tracker/pd_file_sync_script.sh
# v2 - adding folder tracking
USER_HOME="/mnt/home/USERNAME"
SOURCE_DIR="/mnt/Datapool/Docs"
DB_DIR="/mnt/Datapool/home/USERNAME/proton_file_tracker"  # Shared directory for file and folder tracking DBs and temp files
DB_FILE="$DB_DIR/file_tracker.db"
LOG_FILE="/mnt/PATH TO LOG FOLDER/LOGNAME.log"

mkdir -p "$(dirname "$LOG_FILE")"

write_log() {
    if [ -f "$1" ]; then
        # Process rclone temp log file, filtering out 422 errors
        while IFS= read -r line; do
            if ! echo "$line" | grep -q "A file or folder with that name already exists (Code=2500, Status=422)"; then
                echo "$(date '+%Y-%m-%d %H:%M:%S') - $line" >> "$LOG_FILE"
            fi
        done < "$1"
    else
        # Regular log message
        echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
    fi
}

echo "" >> "$LOG_FILE"
printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
write_log "Sync operation started"
printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"

log_footer() {
    printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
    write_log "Sync completed successfully"
    write_log "New/changed files: $NEW_CHANGED_COUNT"
    write_log "Deleted files:     $DELETED_COUNT"
    write_log "Deleted folders:   $DELETED_FOLDER_COUNT"
    printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
    echo "" >> "$LOG_FILE"
}

mkdir -p "$DB_DIR"

if [ ! -f "$DB_FILE" ]; then
    sqlite3 "$DB_FILE" <<SQL
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
fi

find "$SOURCE_DIR" -type f -printf "%p|%T@|%s\n" > "$DB_DIR/current_files.txt"

CURRENT_TIMESTAMP=$(date +%s)

RELATIVE_SQL_CHANGED="SELECT substr(f.path, length('$SOURCE_DIR') + 1) AS path FROM files f LEFT JOIN files_last_run flr ON f.path = flr.path WHERE flr.path IS NULL OR f.mtime > COALESCE(flr.mtime, 0);"
RELATIVE_SQL_DELETED="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM files_last_run flr WHERE flr.path NOT IN (SELECT path FROM files);"

sqlite3 "$DB_FILE" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;
DROP TABLE IF EXISTS files_last_run;
ALTER TABLE files RENAME TO files_last_run;
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE INDEX IF NOT EXISTS idx_files_path ON files(path);
CREATE INDEX IF NOT EXISTS idx_files_last_run_path ON files_last_run(path);
.import $DB_DIR/current_files.txt files
.output $DB_DIR/changed_files.txt
$RELATIVE_SQL_CHANGED
.output $DB_DIR/deleted_files.txt
$RELATIVE_SQL_DELETED
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);
COMMIT;
SQL

if [ -s "$DB_DIR/changed_files.txt" ]; then
    TEMP_LOG="$LOG_FILE.tmp"
    if ! rclone sync "$SOURCE_DIR" "proton_drive:Home-NAS/Docs" --files-from "$DB_DIR/changed_files.txt" --update --local-no-check-updated --protondrive-replace-existing-draft=true --log-file="$TEMP_LOG" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO --stats 0; then
        echo "Error syncing files. Details in $TEMP_LOG" >> "$LOG_FILE"
        exit 1
    fi
    write_log "$TEMP_LOG"
    rm -f "$TEMP_LOG"
fi

if [ -s "$DB_DIR/deleted_files.txt" ]; then
    TEMP_LOG="$LOG_FILE.tmp"
    if ! rclone delete "proton_drive:Home-NAS/Docs" --include-from "$DB_DIR/deleted_files.txt" --protondrive-replace-existing-draft=true --log-file="$TEMP_LOG" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO --stats 0; then
        echo "Error deleting files. Details in $TEMP_LOG" >> "$LOG_FILE"
        exit 1
    fi
    write_log "$TEMP_LOG"
    rm -f "$TEMP_LOG"
fi

# Folder tracking section using the same DB_DIR
FOLDER_DB_FILE="$DB_DIR/folder_tracker.db"
DELETED_FOLDERS_FILE="$DB_DIR/deleted_folders.txt"
CURRENT_FOLDERS_FILE="$DB_DIR/current_folders.txt"

FIRST_RUN=0
if [ ! -f "$FOLDER_DB_FILE" ]; then
    sqlite3 "$FOLDER_DB_FILE" <<SQL
CREATE TABLE folders (path TEXT PRIMARY KEY);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
    FIRST_RUN=1
    write_log "Folder tracking initializing: creating baseline state, no deletions will occur"
fi

awk -F'|' '{sub(/[^/]*$/, "", $1); print $1}' "$DB_DIR/current_files.txt" | sort -u > "$CURRENT_FOLDERS_FILE"

RELATIVE_SQL_DELETED_FOLDERS="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM folders_last_run flr WHERE flr.path NOT IN (SELECT path FROM folders) ORDER BY path DESC;"

sqlite3 "$FOLDER_DB_FILE" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;
$( [ "$FIRST_RUN" -eq 0 ] && echo "DROP TABLE IF EXISTS folders_last_run; ALTER TABLE folders RENAME TO folders_last_run;" )
CREATE TABLE folders (path TEXT PRIMARY KEY);
.import --csv $CURRENT_FOLDERS_FILE folders
$( [ "$FIRST_RUN" -eq 0 ] && echo ".output $DELETED_FOLDERS_FILE" )
$( [ "$FIRST_RUN" -eq 0 ] && echo "$RELATIVE_SQL_DELETED_FOLDERS" )
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);
COMMIT;
SQL

if [ "$FIRST_RUN" -eq 0 ] && [ -s "$DELETED_FOLDERS_FILE" ]; then
    TEMP_LOG="$LOG_FILE.tmp"
    while IFS= read -r folder; do
        if ! rclone rmdir "proton_drive:Home-NAS/Docs/$folder" --log-file="$TEMP_LOG" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO --stats 0; then
            write_log "Error deleting folder '$folder'. Details in $TEMP_LOG"
        fi
        write_log "$TEMP_LOG"
        rm -f "$TEMP_LOG"
    done < "$DELETED_FOLDERS_FILE"
fi

NEW_CHANGED_COUNT=$(wc -l < "$DB_DIR/changed_files.txt")
DELETED_COUNT=$(wc -l < "$DB_DIR/deleted_files.txt")
DELETED_FOLDER_COUNT=0
if [ "$FIRST_RUN" -eq 0 ]; then
    DELETED_FOLDER_COUNT=$(wc -l < "$DELETED_FOLDERS_FILE" 2>/dev/null || echo 0)
fi

rm -f "$DB_DIR/changed_files.txt" "$DB_DIR/deleted_files.txt" "$DB_DIR/current_files.txt" "$DELETED_FOLDERS_FILE" "$CURRENT_FOLDERS_FILE"

log_footer

One more addition: running with the --resetstate parameter will clear out both databases (file and folder tracking) and record the current state of files. This is useful when you manually add files or folders to Proton Drive and want to avoid the script uploading them again. So, basically, --resetstate catalogs all files in the SOURCE, considers them already online, and only syncs future changes.
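Usage stays the same; for example (script name/path as in the header comment below):

./pd_file_sync_script.sh --resetstate   # rebuild the baseline only; file uploads/deletes are skipped this run
./pd_file_sync_script.sh                # normal run: sync changes since the recorded state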

#!/bin/bash
# /mnt/Datapool/home/USERNAME/proton_file_tracker/pd_file_sync_script.sh
# v2 - added folder tracking
# v3 - added reset state mechanism: takes a snapshot of the SOURCE_DIR state and considers those files already online. Useful when manually uploading files or renaming folders that would otherwise trigger large deletes and re-uploads
# v.3.3
# there is a log upload command at the very end of this script; it is only needed to keep the uploaded copy of the log current when the log file is stored within SOURCE_DIR
# if the log file is stored somewhere else and is not uploaded to Proton Drive, the very last rclone sync command is not needed -
# comment it out

SOURCE_DIR="/mnt/Datapool/SOURCE_DIR"	#source dir to sync up; don't include a trailing forward slash: something/something/source
REMOTE_DIR="proton_drive:PROTON_DESTINATION_DIR" #destination on Proton Drive for synced files; don't include a trailing forward slash
LOG_FILE="/mnt/Datapool/Docs/PATH_TO_LOG/LOGFILENAME.log" #full path to log file
DB_DIR="/mnt/Datapool/home/USERNAME/proton_file_tracker" #dir where databases and temp files are stored

DB_FILES="$DB_DIR/file_tracker.db" #db holding files state for tracking new/changed/deleted files
CURRENT_FILES="$DB_DIR/current_files.txt" #temp files
CHANGED_FILES="$DB_DIR/changed_files.txt"
DELETED_FILES="$DB_DIR/deleted_files.txt"

DB_FOLDERS="$DB_DIR/folder_tracker.db" #db holding folders state to track deleted folders
CURRENT_FOLDERS="$DB_DIR/current_folders.txt" #temp files
DELETED_FOLDERS="$DB_DIR/deleted_folders.txt"

#flags to keep track of a state reset and of the first run of folder tracking
RESET_STATE=0
FIRST_RUN=0

# Define write_log function early
write_log() {
    if [ -f "$1" ]; then
        while IFS= read -r line; do
            if ! echo "$line" | grep -q "A file or folder with that name already exists (Code=2500, Status=422)"; then
                if ! echo "$line" | grep -qE "Transferred:|Errors:|Checks:|Deleted:|Elapsed time:"; then
                    echo "$(date '+%Y-%m-%d %H:%M:%S') - $line" >> "$LOG_FILE"
                fi
            fi
        done < "$1"
    else
        echo "$(date '+%Y-%m-%d %H:%M:%S') - $1" >> "$LOG_FILE"
    fi
}

# Initial log header
mkdir -p "$(dirname "$LOG_FILE")"
echo "" >> "$LOG_FILE"
printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
write_log "Sync operation started"
printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"

# Check for --resetstate parameter (case-insensitive)
for arg in "$@"; do
    if [[ "${arg,,}" == "--resetstate" ]]; then
        RESET_STATE=1
        write_log "Resetting state with --resetstate parameter"
        rm -f "$DB_FILES" "$DB_FOLDERS"
        break
    fi
done

mkdir -p "$DB_DIR"

log_footer() {
    printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
    write_log "Sync completed successfully"
    write_log "New/changed files: $NEW_CHANGED_COUNT"
    write_log "Deleted files:     $DELETED_COUNT"
    write_log "Deleted folders:   $DELETED_FOLDER_COUNT"
    printf "#%.0s" {1..80} >> "$LOG_FILE"; echo >> "$LOG_FILE"
    echo "" >> "$LOG_FILE"
}

if [ ! -f "$DB_FILES" ]; then
    sqlite3 "$DB_FILES" <<SQL
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
fi

find "$SOURCE_DIR" -type f -printf "%p|%T@|%s\n" > "$CURRENT_FILES"

CURRENT_TIMESTAMP=$(date +%s)

RELATIVE_SQL_CHANGED="SELECT substr(f.path, length('$SOURCE_DIR') + 1) AS path FROM files f LEFT JOIN files_last_run flr ON f.path = flr.path WHERE flr.path IS NULL OR f.mtime > COALESCE(flr.mtime, 0);"
RELATIVE_SQL_DELETED="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM files_last_run flr WHERE flr.path NOT IN (SELECT path FROM files);"

sqlite3 "$DB_FILES" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;
DROP TABLE IF EXISTS files_last_run;
ALTER TABLE files RENAME TO files_last_run;
CREATE TABLE files (path TEXT PRIMARY KEY, mtime INTEGER, size INTEGER);
CREATE INDEX IF NOT EXISTS idx_files_path ON files(path);
CREATE INDEX IF NOT EXISTS idx_files_last_run_path ON files_last_run(path);
.import $CURRENT_FILES files
.output $CHANGED_FILES
$RELATIVE_SQL_CHANGED
.output $DELETED_FILES
$RELATIVE_SQL_DELETED
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);
COMMIT;
SQL

if [ -s "$CHANGED_FILES" ] && [ "$RESET_STATE" -eq 0 ]; then
    TEMP_LOG="$LOG_FILE.tmp"
    if ! rclone sync "$SOURCE_DIR" "$REMOTE_DIR" --files-from "$CHANGED_FILES" --update --local-no-check-updated --protondrive-replace-existing-draft=true --log-file="$TEMP_LOG" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO --stats 0; then
        echo "Error syncing files. Details in $TEMP_LOG" >> "$LOG_FILE"
        exit 1
    fi
    write_log "$TEMP_LOG"
    rm -f "$TEMP_LOG"
fi

if [ -s "$DELETED_FILES" ] && [ "$RESET_STATE" -eq 0 ]; then
    TEMP_LOG="$LOG_FILE.tmp"
    if ! rclone delete "$REMOTE_DIR" --include-from "$DELETED_FILES" --protondrive-replace-existing-draft=true --log-file="$TEMP_LOG" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO --stats 0; then
        echo "Error deleting files. Details in $TEMP_LOG" >> "$LOG_FILE"
        exit 1
    fi
    write_log "$TEMP_LOG"
    rm -f "$TEMP_LOG"
fi


if [ ! -f "$DB_FOLDERS" ]; then
    sqlite3 "$DB_FOLDERS" <<SQL
CREATE TABLE folders (path TEXT PRIMARY KEY);
CREATE TABLE metadata (key TEXT PRIMARY KEY, value INTEGER);
INSERT INTO metadata (key, value) VALUES ('last_run_time', 0);
SQL
    FIRST_RUN=1
    write_log "Creating folder tracking initial state"
fi

awk -F'|' '{sub(/[^/]*$/, "", $1); print $1}' "$CURRENT_FILES" | sort -u > "$CURRENT_FOLDERS"

RELATIVE_SQL_DELETED_FOLDERS="SELECT substr(flr.path, length('$SOURCE_DIR') + 1) AS path FROM folders_last_run flr WHERE flr.path NOT IN (SELECT path FROM folders) ORDER BY path DESC;"

sqlite3 "$DB_FOLDERS" <<SQL > /dev/null 2>&1
PRAGMA synchronous=OFF;
PRAGMA journal_mode=WAL;
BEGIN TRANSACTION;
$( [ "$FIRST_RUN" -eq 0 ] && echo "DROP TABLE IF EXISTS folders_last_run; ALTER TABLE folders RENAME TO folders_last_run;" )
CREATE TABLE folders (path TEXT PRIMARY KEY);
.import --csv $CURRENT_FOLDERS folders
$( [ "$FIRST_RUN" -eq 0 ] && echo ".output $DELETED_FOLDERS" )
$( [ "$FIRST_RUN" -eq 0 ] && echo "$RELATIVE_SQL_DELETED_FOLDERS" )
INSERT OR REPLACE INTO metadata (key, value) VALUES ('last_run_time', $CURRENT_TIMESTAMP);
COMMIT;
SQL

if [ "$FIRST_RUN" -eq 0 ] && [ -s "$DELETED_FOLDERS" ]; then
    TEMP_LOG="$LOG_FILE.tmp"
    while IFS= read -r folder; do
        if ! rclone rmdir "$REMOTE_DIR${folder}" --log-file="$TEMP_LOG" --log-format "date:'2006-01-02 15:04:05',level,msg" --log-level INFO --stats 0; then
            write_log "Error deleting folder '$folder'. Details in $TEMP_LOG"
        fi
        write_log "$TEMP_LOG"
        rm -f "$TEMP_LOG"
    done < "$DELETED_FOLDERS"
fi

NEW_CHANGED_COUNT=$(wc -l < "$CHANGED_FILES")
DELETED_COUNT=$(wc -l < "$DELETED_FILES")
DELETED_FOLDER_COUNT=0
if [ "$FIRST_RUN" -eq 0 ]; then
    DELETED_FOLDER_COUNT=$(wc -l < "$DELETED_FOLDERS" 2>/dev/null || echo 0)
fi

rm -f "$CHANGED_FILES" "$DELETED_FILES" "$CURRENT_FILES" "$DELETED_FOLDERS" "$CURRENT_FOLDERS"

log_footer

# final log push: the log file was already partially uploaded during the changed-files sync, but
# it has more entries by now, so sync it again to capture the full log; otherwise the online log file
# would only contain entries up to the last sync operation
LOG_DIR=$(dirname "$LOG_FILE")
rclone sync "$LOG_FILE" "$REMOTE_DIR/${LOG_DIR#$SOURCE_DIR/}" --update --local-no-check-updated --protondrive-replace-existing-draft=true --stats 0 > /dev/null 2>&1


This looks really awesome! Thank you for sharing your work!
How will this react if the upload gets interrupted (either a loss of connection or an issue with the server)? On the next run, will it know that some new/changed files were not uploaded?

Thanks for this guide and script. I will give it a shot on my server.