Re: rsync --delete



On 2020-10-16 15:09, Mike McClain wrote:
I've been using rsync to backup to a flash drive but it's not
performing exactly as I expected.

The man page says:
     --delete                delete extraneous files from dest dirs
A section of the backup script is so:
Params=(-a --inplace --delete);

I try to use all lower case letters for variable names and all upper case letters for constants.


'$Params' -- Variable, function, media, source, destination, etc., names are critical to understanding. I don't like 'Params' -- it's too generic. I would prefer 'rsync_opts'.


I use braces whenever evaluating a variable -- '${Params}'. I forget the details why, but I do recall that this practice is important.
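
One reason braces matter, as far as I can tell: they mark where the variable name ends when it is followed by characters that are also valid in a name. A minimal sketch, with hypothetical variable names that are not from the original script:

```sh
backup='/mnt/backup'
backup_dir='something else'

# Without braces the shell reads the longest valid name, "backup_dir".
without_braces="$backup_dir"

# With braces the name stops at "backup" and "_dir" is literal text.
with_braces="${backup}_dir"

echo "${without_braces}"   # prints: something else
echo "${with_braces}"      # prints: /mnt/backup_dir
```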


I don't use Bash arrays, and I barely understand how the shell interpolates lists and preserves items containing whitespace. When I can't figure it out, I switch to Perl.
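
For what it is worth, here is a small sketch of how bash expands an array, assuming the backup script runs under bash (other shells, such as zsh, behave differently). In bash, a bare '$Params' refers only to element 0 of the array, so with the definition quoted above only '-a' would reach rsync(1) and '--delete' would never be passed. The helper 'count_args' is hypothetical and only counts its arguments:

```bash
#!/usr/bin/env bash
Params=(-a --inplace --delete)

# Hypothetical helper: print how many arguments were received.
count_args() { echo "$#"; }

# A bare $Params is the same as ${Params[0]} in bash.
first_only="$Params"
unquoted_count=$(count_args $Params)

# "${Params[@]}" expands to every element, one word per element,
# and preserves elements that contain whitespace.
all_count=$(count_args "${Params[@]}")

echo "${first_only}"       # prints: -a
echo "${unquoted_count}"   # prints: 1
echo "${all_count}"        # prints: 3
```

If that matches the shell in use, writing "${Params[@]}" in the rsync command line would pass all three options.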


-a -- I use this option for backups.


--inplace -- If your backup media cannot hold the last backup with enough room for the next backup, get bigger backup media.


--delete -- I use this option for backups.


Additional options I use in my rsync(1) backup script:

--one-file-system -- I use this option for backups. The bottom-level backup script is driven by a higher level script that backs up multiple filesystems on multiple hosts via configuration files.

-e ${SSH} -- where:

    SSH=/usr/bin/ssh


(To be pedantic, I should put single parentheses around the RHS?)


When doing rsync(1) backups over SSH, I seem to recall that rsync(1) will use whatever login shell is specified for that account. Some of my machines are FreeBSD, and the root login shell is tcsh(1). Strange things were happening before I discovered the '-e' / '--rsh' option.


Flash=/sda/rpi4b

Mixed case variable name -- as above.


Flash -- I would prefer 'backup_media'.


Is /sda the mount point for your backup media? If so, that is confusing -- 'sda' implies '/dev/sda', which should be your system drive (e.g. root). I would label the backup filesystem 'backup-rpi4b' and mount it at '/mnt/backup-rpi4b' or '/media/backup-rpi4b' (your desktop might be able to do this for you).


cd /home/mike

I prefer scripts that I can run from anywhere. The script should know where to do its work, either from built-in variables, environment variables, resource script, configuration file, command-line arguments/options, etc. Sophisticated scripts can draw this information from multiple resources, with some hierarchy of precedence. (I would switch to Perl if I wanted to get fancy with options.)
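
A simple hierarchy of precedence can be sketched with nested parameter defaults. This is only an illustration: the function name 'resolve_src' and the environment variable 'BACKUP_SRC' are made up for the example, and the built-in default is the directory from the original script.

```sh
# Pick the backup source directory, in order of precedence:
#   1. first command-line argument
#   2. BACKUP_SRC environment variable (hypothetical name)
#   3. built-in default
resolve_src() {
    default_src='/home/mike'
    printf '%s\n' "${1:-${BACKUP_SRC:-${default_src}}}"
}
```

A script could then start with something like 'src="$(resolve_src "$@")"' and never need a bare 'cd'.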


If the script must change the working directory, I would display that -- 'set -x', 'cd ...', and 'set +x'.


[ ! -d $Flash/mike ] && mkdir $Flash/mike;

I would set a variable for the destination directory.  Something like:

    dst="${Flash}/mike"


I would do an old-school 'if' block and display that a directory is being created -- 'set -x', 'mkdir ...', 'set +x'.
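
Roughly like this. It is a sketch only: a temporary directory stands in for the real backup mount point so that it is safe to run, and 'backup_media' is my preferred name rather than anything in the original script.

```sh
# Stand-in for the real mount point (e.g. /mnt/backup-rpi4b).
backup_media="$(mktemp -d)"
dst="${backup_media}/mike"

# Create the destination only if it is missing, and show the command.
if [ ! -d "${dst}" ]; then
    set -x
    mkdir "${dst}"
    set +x
fi
```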


#   exclude compressed files and the contents of most of the .* directories
/mc/bin/mk_rsync_exclude.sh

What is /mc?


mk_rsync_exclude.sh creates '.rsync_exclude' in the current working directory?


A problem with dynamically generated code/arguments/options is: how do you know what code was actually run in the past? Now you need a log file. I prefer to use static configuration files and static scripts, and check everything into a version control system. My previous backup scripts generated log files, but my current rewrite does not have that feature (yet).


I prefer to keep my backup scripts, include files, etc., outside of the backup source and destination paths (except when backing up the filesystem that contains the backup stuff). This reduces confusion, and might prevent a chicken-and-egg situation.


echo /usr/bin/rsync $Params --exclude-from=/home/mike/.rsync_exclude . $Flash/mike
/usr/bin/rsync $Params --exclude-from=/home/mike/.rsync_exclude . $Flash/mike ||
     echo rsync $Params --exclude-from=/home/mike/.rsync_exclude . $Flash/mike    Failed $? ;


You cut and pasted the following code three times:

/usr/bin/rsync $Params --exclude-from=/home/mike/.rsync_exclude . $Flash/mike


DRY:

    https://en.wikipedia.org/wiki/Don%27t_repeat_yourself


I prefer 'set -x', 'command ...', and 'set +x' when I want to see what the shell is actually doing (which might not be the same output as 'echo ...').
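
One way to get both, sketched under the assumption of a POSIX shell: write the command line once and hand it to a small wrapper that traces it and keeps its exit status. The name 'run_traced' is hypothetical.

```sh
# Run the given command once with tracing on, then turn tracing off
# and return the command's own exit status.
run_traced() {
    set -x
    "$@"
    status=$?
    set +x
    return "${status}"
}
```

The rsync line would then appear a single time, for example 'run_traced /usr/bin/rsync -a --delete /src/ /dst/ || echo "rsync failed: $?"', where the two paths are placeholders.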


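I use 'set -e' at the top of my scripts so that the shell will stop when a script command fails (the failing command normally prints its own error message). Note that 'set -e' does not trigger for a command on the left of '||', as in your rsync line.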


/usr/bin/rsync -- I also use absolute paths for tools. But, I put them into upper-case variables at the top of my script.


--exclude-from -- It is too easy to screw up exclude specifications and exclude a file you need. Therefore, I back up entire filesystems.


When invoking rsync(1), I make sure that SRC and DEST are directories, that their paths are absolute, and that their paths end with '/'. This prevents confusion and works as I expect.
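
Those checks could look something like the sketch below. The function name 'normalize_dir' is hypothetical; it only validates and prints a path, and does not call rsync(1) itself.

```sh
# Print the path ending with '/', or fail if it is not an
# absolute path to an existing directory.
normalize_dir() {
    case "$1" in
        /*) ;;    # absolute path: carry on
        *)  echo "not an absolute path: $1" >&2; return 1 ;;
    esac
    if [ ! -d "$1" ]; then
        echo "not a directory: $1" >&2
        return 1
    fi
    case "$1" in
        */) printf '%s\n' "$1" ;;     # already ends with '/'
        *)  printf '%s/\n' "$1" ;;    # append the trailing '/'
    esac
}
```

With a trailing '/' on SRC, rsync(1) copies the contents of the directory rather than the directory itself, which is what I want for a backup into an existing destination directory.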


     If I delete a file from my home directory then backup over last
week's copy the deleted file stays in the backup directory and these
build up over time.

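That is another good reason not to use 'exclude' options: with '--delete', rsync(1) does not delete files on the destination that match an exclude pattern unless '--delete-excluded' is also given.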


     Am I misusing rsync or am I just not understanding how it works?

rsync(1) is a very flexible tool. I use only a small subset of its features, and I do so in a consistent manner. Thus, I can obtain the results I need and I have confidence that they are correct.


David

