Local Snapshots + Offline Backup to an External Disk
This config example shows how we can use zrepl to take periodic snapshots of our local workstation and back it up to a zpool on an external disk that we connect only occasionally.
The local snapshots should be taken every 15 minutes for pain-free recovery from CLI disasters (`rm -rf /` and the like).
However, we do not want to keep the snapshots around for very long because our workstation is a little tight on disk space.
Thus, we only keep one hour's worth of high-resolution snapshots, then fade them out to one per hour for a day (24 hours), then one per day for 14 days.
At the end of each work day, we connect the external disk that serves as our workstation's local offline backup. We want zrepl to inspect the filesystems and snapshots on the external pool, figure out which snapshots were created since the last time we connected the disk, and use incremental replication to efficiently mirror our workstation to the backup disk. Afterwards, we want zrepl to clean up old snapshots on the backup pool: keep all snapshots younger than one hour, then one snapshot per hour for the first 24 hours, then one per day for 360 days.
A few additional requirements:

- Snapshot creation and pruning on our workstation should happen in the background, without interaction from our side.
- However, we want to explicitly trigger replication via the command line (see the example session after this list).
- We want to use OpenZFS native encryption to protect our data on the external disk. It is absolutely critical that only encrypted data leaves our workstation: zrepl should provide an easy config knob for this and refuse to replicate unencrypted datasets to the external disk.
- We want to be able to put off the backups for more than three weeks, i.e., longer than the lifetime of the automatically created snapshots on our workstation. zrepl should use bookmarks and holds to achieve this goal.
- If we yank out the drive during replication and then go on a long vacation, we do not want the partially replicated snapshot to stick around, since it would pin an ever-growing amount of disk space. We therefore want zrepl to deviate from its default behavior and sacrifice resumability, while retaining the ability to do incremental replication once we return. zrepl should provide an easy config knob to disable step holds for incremental replication.
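For illustration, an end-of-day backup session could look like the following sketch. The pool name `backuppool` and the job name `push_to_drive` match the config below; the surrounding zpool/zfs commands are our assumed workflow for attaching and detaching the disk, not something zrepl runs for us.

# import the pool on the freshly connected external disk
zpool import backuppool
# trigger the push job defined in the config below
zrepl signal wakeup push_to_drive
# watch replication and pruning progress interactively
zrepl status
# export the pool cleanly before unplugging the disk
zpool export backuppool
# optional: inspect the bookmarks zrepl keeps on the workstation for future incremental sends
zfs list -t bookmark -r system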
The following config snippet implements the setup described above. You will likely want to customize some aspects mentioned in the top comment in the file.
# This config serves as an example for a local zrepl installation that
# backs up the entire zpool `system` to `backuppool/zrepl/sink`.
#
# The requirements covered by this setup are described in the zrepl documentation's
# quick start section, which inlines this example.
#
# CUSTOMIZATIONS YOU WILL LIKELY WANT TO APPLY:
# - adjust the name of the production pool `system` in the `filesystems` filter of jobs `snapjob` and `push_to_drive`
# - adjust the name of the backup pool `backuppool` in the `backuppool_sink` job
# - adjust the occurrences of `myhostname` to the name of the system you are backing up (cannot be easily changed once you start replicating)
# - make sure the `zrepl_` prefix is not used by any other zfs tools you might have installed (it likely isn't)

jobs:

# this job takes care of snapshot creation + pruning
- name: snapjob
  type: snap
  filesystems: {
    "system<": true,
  }
  # create snapshots with prefix `zrepl_` every 15 minutes
  snapshotting:
    type: periodic
    interval: 15m
    prefix: zrepl_
  pruning:
    keep:
      # fade-out scheme for snapshots starting with `zrepl_`:
      # - keep all snapshots created in the last hour
      # - then destroy snapshots such that we keep 24, each 1 hour apart
      # - then destroy snapshots such that we keep 14, each 1 day apart
      # - then destroy all older snapshots
      - type: grid
        grid: 1x1h(keep=all) | 24x1h | 14x1d
        regex: "^zrepl_.*"
      # keep all snapshots that don't have the `zrepl_` prefix
      - type: regex
        negate: true
        regex: "^zrepl_.*"

# This job pushes to the local sink defined in job `backuppool_sink`.
# We trigger replication manually from the command line / udev rules using
# `zrepl signal wakeup push_to_drive`.
- type: push
  name: push_to_drive
  connect:
    type: local
    listener_name: backuppool_sink
    client_identity: myhostname
  filesystems: {
    "system<": true
  }
  send:
    # send encrypted datasets raw; refuse to replicate unencrypted datasets
    encrypted: true
  replication:
    protection:
      initial: guarantee_resumability
      # Downgrade the protection of incremental steps to `guarantee_incremental`,
      # which uses zfs bookmarks instead of zfs holds.
      # Thus, when we yank out the backup drive during replication:
      # - we might not be able to resume the interrupted replication step, because the
      #   partially received `to` snapshot of a `from`->`to` step may be pruned at any time
      # - but in exchange, we get back the disk space allocated by `to` when we prune it
      # - and because we still have the bookmarks created by `guarantee_incremental`,
      #   we can still do an incremental `from`->`to2` replication in the future
      incremental: guarantee_incremental
  snapshotting:
    type: manual
  pruning:
    # no-op prune rule on the sender (keep all snapshots); job `snapjob` takes care of pruning
    keep_sender:
      - type: regex
        regex: ".*"
    keep_receiver:
      # longer retention on the backup drive, we have more space there
      - type: grid
        grid: 1x1h(keep=all) | 24x1h | 360x1d
        regex: "^zrepl_.*"
      # retain all non-zrepl snapshots on the backup drive
      - type: regex
        negate: true
        regex: "^zrepl_.*"

# This job receives from job `push_to_drive` into `backuppool/zrepl/sink/myhostname`
- type: sink
  name: backuppool_sink
  root_fs: "backuppool/zrepl/sink"
  serve:
    type: local
    listener_name: backuppool_sink
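Before the first replication run, the backup pool and the `root_fs` of the sink job need to exist. The following is a minimal setup sketch, assuming a systemd-based system; substitute your disk's device path, and note that keeping the sink hierarchy unmounted is our choice here, not a zrepl requirement.

# create the backup pool on the external disk (substitute your device path)
zpool create backuppool /dev/disk/by-id/<your-disk>
# create the hierarchy that `root_fs` of job `backuppool_sink` points to
zfs create -o mountpoint=none backuppool/zrepl
zfs create -o mountpoint=none backuppool/zrepl/sink
# validate the config file, then restart the daemon to pick it up
zrepl configcheck
systemctl restart zrepl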
Offline Backups with Two (or More) External Disks
It can be desirable to have multiple disk-based backups of the same machine. To accomplish this, create one zpool per external disk, each with a unique name, and define a pair of `push` and `sink` jobs for each of these zpools, each pair with a unique `name`, `listener_name`, and `root_fs`. The unique names ensure that the jobs don't step on each other's toes when managing zrepl's ZFS abstractions (the bookmarks and holds mentioned above).
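As a sketch, a second disk with a pool we'll call `backuppool2` (a hypothetical name) would get the following additional pair of jobs in the same `jobs` list, mirroring `push_to_drive` and `backuppool_sink` above:

# push job for the hypothetical second disk `backuppool2`
- type: push
  name: push_to_drive2
  connect:
    type: local
    listener_name: backuppool2_sink
    client_identity: myhostname
  filesystems: {
    "system<": true
  }
  send:
    encrypted: true
  snapshotting:
    type: manual
  pruning:
    keep_sender:
      - type: regex
        regex: ".*"
    keep_receiver:
      - type: grid
        grid: 1x1h(keep=all) | 24x1h | 360x1d
        regex: "^zrepl_.*"
      - type: regex
        negate: true
        regex: "^zrepl_.*"

# sink job for the hypothetical second disk
- type: sink
  name: backuppool2_sink
  root_fs: "backuppool2/zrepl/sink"
  serve:
    type: local
    listener_name: backuppool2_sink

The `replication.protection` settings from `push_to_drive` can be copied over unchanged if you want the same yank-out-the-drive tolerance for the second disk.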