I wrote a python script to backup my home directory (codeberg.org)
from waspentalive@lemmy.one to selfhosted@lemmy.world on 29 Mar 14:26
https://lemmy.one/post/26390388

Loci is a Python script that can back up a directory to a server using rsync. It keeps track of the backups that have been done, and multiple backups may be kept. Rsync handles the transfers, so only the needful is copied, and single files can be recovered from the backup if needed.

loci -b tag : Back up under the given tag (I used days of the week)

loci -l : List backups, showing unused tags, backups that are needed, and backups that have been run more than 5 times. I refresh these.

loci -r tag : Refresh a tag’s backup - delete the files under that tag and backuplog entries to prepare for a fresh backup using loci -b

~/.backuplog : A file in .csv format that keeps track of backups done.

~/.config/loci/settings : Settings file. Fully commented.
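
For readers who want a feel for the moving parts, here is a minimal Python sketch of a tagged rsync backup with a CSV log. It is not Loci's actual code: the helper names `build_rsync_argv` and `log_backup`, the two-column log layout, and the example settings values are all assumptions made for illustration. The remote path is written relative to the login directory, which rsync resolves against the remote user's home.

```python
import csv
from datetime import datetime


def build_rsync_argv(settings, tag):
    """Build the rsync argument list for one tagged backup (hypothetical helper)."""
    # A remote path without a leading slash is relative to the remote user's home.
    target = "{user}@{server}:{backup_root}/{tag}".format(tag=tag, **settings)
    # -a keeps permissions and times, -v is verbose, -h prints human-readable sizes.
    return ["rsync", "-avh", settings["source_dir"], target]


def log_backup(log_file, tag, when):
    """Append one row (tag, timestamp) to a CSV log; the column layout is an assumption."""
    csv.writer(log_file).writerow([tag, when.strftime("%Y-%m-%d %H:%M")])


if __name__ == "__main__":
    # Example values only; a real run would read them from the settings file.
    example = {
        "server": "nas",
        "user": "me",
        "backup_root": "backups",
        "source_dir": "/home/me/",
    }
    print(" ".join(build_rsync_argv(example, "mon")))
    # To actually transfer files, pass the list to subprocess.run(argv, check=True).
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when a path contains spaces.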

#selfhosted


demeaning_casually@infosec.pub on 29 Mar 15:13

A hilariously unnecessary Python script that could have easily been done in bash since it’s literally just a wrapper around rsync. 😅

When you’ve only got a Python-sized hammer in your toolbox, everything looks like a Python nail, I guess.

#!/bin/bash

# Function to read settings
# Settings file format:
# ~/.config/loci/settings
# [backup]
#
# server = <<Name of server>>
# user = <<server user login>>
# backup_root = <<Directory off user's home Directory>>
# taglist = mon tue wed thu fri sat sun spc
# exclude_files = <<not implemented yet - leave blank>>
# source_dir = <<the local directory we are backing up>>
read_settings() {
  settings_file="$HOME/.config/loci/settings"
  if [[ -f "$settings_file" ]]; then
    while IFS='=' read -r key value || [[ -n "$key" ]]; do
      if [[ ! -z "$key" && ! "$key" =~ ^# && ! "$key" =~ ^\[ ]]; then
        key=$(echo "$key" | xargs)
        value=$(echo "$value" | xargs)
        declare -g "$key"="$value"
      fi
    done < "$settings_file"
  else
    echo "Settings file not found: $settings_file"
    exit 1
  fi
}

# Function to perform the backup
backup() {
  local tag="$1"
  read_settings
  
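  # Create the backup directory on the server if it doesn't exist
  # (a plain mkdir here would only create a local directory)
  backup_dest="/home/$user/$backup_root/$tag"
  ssh "$user@$server" "mkdir -p '$backup_dest'" 2>/dev/null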
  
  # Rsync command for backup
  target="$user@$server:/home/$user/$backup_root/$tag"
  rsync_cmd="rsync -avh $source_dir $target"
  # If exclude_files is defined and not empty, add it to rsync command
  if [[ -n "$exclude_files" ]]; then
    rsync_cmd="rsync -avh --exclude='$exclude_files' $source_dir $target"
  fi
  echo "Command:$rsync_cmd"
  eval "$rsync_cmd"
  
  # Log the backup information
  log_path="$HOME/.backuplog"
  timestamp=$(date +"%Y-%m-%d %H:%M")
  echo "\"$tag\",$timestamp,$rsync_cmd,$timestamp" >> "$log_path"
  
  echo "Backup for '$tag' completed and logged."
}

# Function to remove the backup
remove_backup() {
  local tag="$1"
  read_settings
  
  # Rsync remove command
  rmfile="/home/$user/$backup_root/$tag"
  rm_cmd="ssh $user@$server rm -rf $rmfile"
  eval "$rm_cmd"
  
  # Remove log entr
newthrowaway20@lemmy.world on 29 Mar 15:18

Do you wanna share a bash script, then?

waspentalive@lemmy.one on 30 Mar 16:34

Especially one that lets you know how long it’s been since you took time to run a backup, keeps track of which set of backups could be updated, and which should be refreshed, and keeps a log file up to date and in .csv format so you can mess with it in a spreadsheet?
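
The bookkeeping described here can be sketched in a few lines of Python. This is an illustration under assumptions, not Loci's real implementation: the function name `summarize`, the two-column log layout (tag, timestamp) and the timestamp format are all hypothetical. The threshold of 5 runs comes from the `loci -l` description in the original post.

```python
import csv
from collections import Counter
from datetime import datetime


def summarize(log_file, taglist, now, max_runs=5):
    """Summarize a CSV backup log with assumed columns: tag, "YYYY-MM-DD HH:MM"."""
    runs = Counter()   # how many times each tag has been backed up
    last = {}          # most recent backup time per tag
    for row in csv.reader(log_file):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        tag, stamp = row[0], row[1]
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M")
        runs[tag] += 1
        if tag not in last or when > last[tag]:
            last[tag] = when
    unused = [t for t in taglist if t not in runs]
    age_days = {t: (now - w).days for t, w in last.items()}
    needs_refresh = [t for t in taglist if runs[t] > max_runs]
    return unused, age_days, needs_refresh
```

Because the log is plain CSV, the same file can still be opened in a spreadsheet.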

demeaning_casually@infosec.pub on 30 Mar 21:44

#!/bin/bash
read_settings() {
  settings_file="$HOME/.config/loci/settings"
  if [[ -f "$settings_file" ]]; then
    while IFS='=' read -r key value || [[ -n "$key" ]]; do
      if [[ ! -z "$key" && ! "$key" =~ ^# && ! "$key" =~ ^\[ ]]; then
        key=$(echo "$key" | xargs)
        value=$(echo "$value" | xargs)
        declare -g "$key"="$value"
      fi
    done < "$settings_file"
  else
    echo "Settings file not found: $settings_file"
    exit 1
  fi
}

# Function to perform the backup
backup() {
  local tag="$1"
  read_settings
  
  log_path="$HOME/.backuplog"
  
  # Check if header exists in log file, if not, create it
  if [[ ! -f "$log_path" ]]; then
    echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$log_path"
  elif [[ $(head -1 "$log_path") != "\"tag\",\"timestamp\",\"command\",\"completion_time\"" ]]; then
    # Add header if it doesn't exist
    temp_file=$(mktemp)
    echo "\"tag\",\"timestamp\",\"command\",\"completion_time\"" > "$temp_file"
    cat "$log_path" >> "$temp_file"
    mv "$temp_file" "$log_path"
  fi
  
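  # Create the backup directory on the server if it doesn't exist
  # (a plain mkdir here would only create a local directory)
  backup_dest="/home/$user/$backup_root/$tag"
  ssh "$user@$server" "mkdir -p '$backup_dest'" 2>/dev/null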
  
  # Rsync command for backup
  target="$user@$server:/home/$user/$backup_root/$tag"
  rsync_cmd="rsync -avh $source_dir $target"
  # If exclude_files is defined and not empty, add it to rsync command
  if [[ -n "$exclude_files" ]]; then
    rsync_cmd="rsync -avh --exclude='$exclude_files' $source_dir $target"
  fi
  
  echo "Starting backup for tag '$tag' at $(date '+%Y-%m-%d %H:%M:%S')"
  echo "Command: $rsync_cmd"
  
  # Record start time
  start_timestamp=$(date +"%Y-%m-%d %H:%M:%S")
  
  # Execute the backup
  eval "$rsync_cmd"
  backup_status=$?
  
  # Record completion time
  completion_timestamp=$(date +"%Y-%m-%d %H:%M:%S")
  
  # Calculate duration
  start_seconds=$(date -d "$start_timestamp" +%s)
  end_seconds=$(date -d "$completion_timestamp" +%s)
  duration=$((end_seconds - start_seconds))
  
  # Format d
waspentalive@lemmy.one on 31 Mar 03:37

Ah, Improvements!

waspentalive@lemmy.one on 29 Mar 15:18

It’s also to help me learn python. And it works for me. : ^ )

newthrowaway20@lemmy.world on 29 Mar 15:20

Don’t mind him. Any time someone shares code, there’s always someone else who did nothing talking about how much better your code could have been. Just noise from the peanut gallery.

waspentalive@lemmy.one on 29 Mar 15:32

Yeah, no problem… I started out with just bare rsync, but I did the backup infrequently and needed my notes to know the command. Then I wrote a simple shell script to run the rsync for me. Then I decided I needed more than one backup; redundancy is good. Then I wanted to keep track of the backups, so I had it write to .backuplog. Then that file started getting dated (every time I run a “sun” backup, the record of the previous one is useless), so finally, TaDa! loci is born.

rc__buggy@sh.itjust.works on 29 Mar 16:25

lol, you’re braver than me. No one ever sees the “code” I’ve written.

waspentalive@lemmy.one on 30 Mar 00:39

That’s OK. Like any landing you can walk away from, any code that runs to spec is good; much could be better.

ChapulinColorado@lemmy.world on 31 Mar 01:00

Bash does seem like a better fit for this kind of script since it is a lot more portable.

I.e., it comes by default with many Linux distributions. For Windows, a Git Bash install will get you most of the utilities needed for large, reliable scripts (grep, scp, find, sort, uniq, cat, tr, ls, etc.).

With that said, you should write it in whatever language you want, especially if it is for learning purposes. That’s where the fun comes from :)

blaidd@jlai.lu on 29 Mar 23:00

No need to be a dick

Artyom@lemm.ee on 30 Mar 09:22

Can you please articulate why Python and Bash are so different in your eyes?

demeaning_casually@infosec.pub on 30 Mar 21:29

One needs to be compiled installed and the other is literally the de facto scripting language installed everywhere and intended for exactly this purpose.

Artyom@lemm.ee on 31 Mar 00:45

Python does not need to be compiled, have you ever used it?

waspentalive@lemmy.one on 31 Mar 03:39

My system came with Python3 installed. Debian 12.

waspentalive@lemmy.one on 31 Mar 03:33

Looks like a line-by-line translation from the Python. Will you use it to back up your home directory?

demeaning_casually@infosec.pub on 31 Mar 04:02

No.

It doesn’t really do anything I particularly need.

droolio@feddit.uk on 29 Mar 15:29

Multiple backups may be kept.

Nice work, but if I may suggest - it lacks hardlink support, so it’s quite wasteful in terms of disk space - the number of ‘tags’ (snapshots) will be extremely limited.

At least two robust solutions that use rsync+hardlinks already exist: rsnapshot.org and dirvish.org (both written in Perl). There’s definitely room for backup tools that produce plain copies, instead of packed chunk data like restic and Duplicacy, and a Python or even bash-based tool might be nice, so keep at it.
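
As a rough idea of what hardlink support could look like in a Python wrapper, the sketch below builds an rsync command that uses the `--link-dest` option, which tells rsync to hard-link files that are unchanged relative to a previous snapshot instead of copying them again. The function name and the example paths are hypothetical, and this is not how rsnapshot or dirvish are implemented internally.

```python
def build_linked_rsync_argv(source_dir, snapshot_dir, previous_snapshot=None):
    """Build an rsync command for a hard-linked snapshot (illustrative helper)."""
    # -a preserves metadata; --delete makes the snapshot mirror the source.
    argv = ["rsync", "-a", "--delete"]
    if previous_snapshot is not None:
        # Unchanged files are hard-linked to the copy in the previous snapshot.
        # Use an absolute path: a relative one is resolved against the destination.
        argv.append("--link-dest=" + previous_snapshot)
    argv += [source_dir, snapshot_dir]
    return argv


if __name__ == "__main__":
    # Example paths only.
    print(" ".join(build_linked_rsync_argv("/home/me/", "/backups/tue", "/backups/mon")))
```

With this approach each snapshot directory still looks like a full plain copy, but unchanged files occupy disk space only once.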

However, I liken backup software to encryption - extreme care must be taken when rolling and using your own. Whatever tool you use, test test test the backups. :)

waspentalive@lemmy.one on 30 Mar 17:47

@droolio@feddit.uk I see what you’re asking. You’re wondering if, instead of storing a duplicate file when another backup set already contains it, I could use a hardlink to point to the file already stored in that other set?

I have a system where I create a backup set for each day of the week. When I do a backup for that day, I update the set, or if it’s out of date, I replace it entirely with a fresh backup image (After 7 backups to that set). But if the backup sets became inter-dependent, removing or updating one set could lead to problems with others that rely on files in the first set.

Does that make sense? I am asking because I am not familiar with the utilities you mentioned and may be taking your post wrong.
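
On the inter-dependence worry, a small Python experiment may help. On filesystems that support hard links, every link is an equal name for the same data, so deleting the file in one backup set does not remove it from another set that links to it. The directory and file names below are made up for the demonstration, and it runs entirely in a temporary directory.

```python
import os
import tempfile


def demo_hardlink_independence():
    """Show that removing one hard link leaves the data reachable via the other."""
    with tempfile.TemporaryDirectory() as root:
        mon = os.path.join(root, "mon")
        tue = os.path.join(root, "tue")
        os.mkdir(mon)
        os.mkdir(tue)

        original = os.path.join(mon, "notes.txt")
        with open(original, "w") as f:
            f.write("important data")

        # "tue" gets a hard link to the same file instead of a second copy.
        linked = os.path.join(tue, "notes.txt")
        os.link(original, linked)
        links_before = os.stat(linked).st_nlink

        # Deleting the "mon" copy only removes one name for the data.
        os.remove(original)
        with open(linked) as f:
            content = f.read()
        links_after = os.stat(linked).st_nlink
        return links_before, content, links_after


if __name__ == "__main__":
    print(demo_hardlink_independence())
```

The caveat is that editing a file in place through one link changes it in every set that shares it. By default rsync writes a new file and renames it over the old one rather than modifying in place, so that case normally only arises if an option such as `--inplace` is used.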

Cobrachicken@lemmy.world on 29 Mar 18:39

Saved for trying out later, ty!