Warning: This document is for an old version of zrepl. The main version is master.

Job Types

A job is the unit of activity tracked by the zrepl daemon and configured in the main configuration file. Every job has a unique name, a type and type-dependent fields, which are documented on this page. Check out the Tutorial and config/samples/ for examples of how job types are actually used.

Attention

Currently, zrepl does not replicate filesystem properties. When a filesystem is received, it is never mounted (-u flag) and mountpoint=none is set. This is temporary and being worked on in issue #24.

Source Job

Example: config/samples/pullbackup/productionhost.yml

Parameter        Comment
type             = source
name             unique name of the job
serve            serve transport
filesystems      filter for filesystems to expose to client
snapshot_prefix  prefix for ZFS snapshots taken by this job
interval         snapshotting interval
prune            prune policy for snapshots with prefix snapshot_prefix on filesystems matched by filesystems

  • Snapshotting Task (every interval, patient)
    • A snapshot of filesystems matched by filesystems is taken every interval with prefix snapshot_prefix.
    • The prune policy is triggered on filesystems matched by filesystems with snapshots matched by snapshot_prefix.
  • Serve Task
    • Wait for connections from a pull job using the serve transport.

A source job is the counterpart to a Pull Job.

Note that the prune policy determines the maximum replication lag. A pull job may stop replicating due to link failure, misconfiguration or administrative action. If the interruption lasts long enough, the source prune policy will eventually destroy the last common snapshot between source and pull job, and a full replication is then required. Make sure you read the prune policy documentation.
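
The relationship between pruning and replication lag can be illustrated with a minimal sketch. It is not zrepl code: keep_last is a hypothetical stand-in for a prune policy (the real policies are described in the prune policy documentation), and the snapshot names are made up for the example.

```python
from typing import List, Optional


def last_common_snapshot(source: List[str], target: List[str]) -> Optional[str]:
    """Return the newest snapshot name present on both sides, or None.

    `source` is assumed to be ordered oldest to newest.
    """
    on_target = set(target)
    for name in reversed(source):
        if name in on_target:
            return name
    return None


def keep_last(snapshots: List[str], n: int) -> List[str]:
    """Hypothetical stand-in prune policy: keep only the n newest snapshots."""
    return snapshots[-n:] if n > 0 else []


# The pull job stopped after receiving snapshot 3 (e.g. link failure) ...
pulled = ["zrepl_1", "zrepl_2", "zrepl_3"]
# ... while the source job kept snapshotting every interval.
taken = [f"zrepl_{i}" for i in range(1, 8)]

# 5 snapshots retained on the source: zrepl_3 survives,
# so incremental replication can resume from it.
still_ok = last_common_snapshot(keep_last(taken, 5), pulled)

# Only 4 snapshots retained: zrepl_3 has been destroyed,
# there is no common snapshot and full replication is required.
too_late = last_common_snapshot(keep_last(taken, 4), pulled)
```

In this toy model the tolerable lag is simply the number of snapshots the source retains, which is why the source prune policy should be chosen with the longest expected interruption in mind.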

Pull Job

Example: config/samples/pullbackup/backuphost.yml

Parameter            Comment
type                 = pull
name                 unique name of the job
connect              connect transport
interval             interval between pull attempts
mapping              mapping for remote to local filesystems
initial_repl_policy  initial replication policy (default = most_recent)
snapshot_prefix      prefix snapshots must match to be considered for replication & pruning
prune                prune policy for local filesystems reachable by mapping

  • Main Task (every interval, patient)
    1. A connection to the remote source job is established using the strategy in connect
    2. mapping maps filesystems presented by the remote side to local target filesystems
    3. Those remote filesystems with a local target filesystem are replicated
      1. Only snapshots with prefix snapshot_prefix are replicated.
      2. If possible, incremental replication takes place.
      3. If the local target filesystem does not exist, initial_repl_policy is used.
      4. On conflicts, an error is logged but replication of the other filesystems that have a mapping continues.
    4. The prune policy is triggered for all target filesystems

A pull job is the counterpart to a Source Job.
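
The per-filesystem decisions of the main task can be sketched as follows. This is a simplified illustration, not zrepl's implementation: the mapping is modelled as a plain dictionary, the function names and action labels are hypothetical, and a "conflict" is assumed here to mean that the target exists but shares no snapshot with the remote side.

```python
import logging
from typing import Dict, List, Optional

log = logging.getLogger("pull_sketch")


def decide(remote: List[str], local: Optional[List[str]], prefix: str) -> str:
    """Pick an action for one filesystem; snapshot lists are oldest first.

    `local` is None if the local target filesystem does not exist.
    """
    # Step 3.1: only snapshots with the configured prefix are considered.
    remote = [s for s in remote if s.startswith(prefix)]
    if not remote:
        return "nothing-to-do"
    if local is None:
        return "initial"  # step 3.3: initial_repl_policy applies
    local_set = {s for s in local if s.startswith(prefix)}
    common = [s for s in remote if s in local_set]
    if not common:
        return "conflict"  # assumption: no common snapshot counts as a conflict
    if common[-1] == remote[-1]:
        return "up-to-date"
    return "incremental"  # step 3.2


def pull_once(remote: Dict[str, List[str]], local: Dict[str, List[str]],
              mapping: Dict[str, str], prefix: str) -> Dict[str, str]:
    """Return {local target filesystem: action} for one invocation."""
    actions = {}
    for fs, snaps in remote.items():
        target = mapping.get(fs)
        if target is None:
            continue  # step 2: no local target, so not replicated
        action = decide(snaps, local.get(target), prefix)
        if action == "conflict":
            # Step 3.4: log the error, keep going with the other filesystems.
            log.error("conflict replicating %s to %s", fs, target)
        actions[target] = action
    return actions


remote = {
    "pool/a": ["zrepl_1", "zrepl_2", "manual_x"],
    "pool/b": ["zrepl_1"],
    "pool/c": ["zrepl_9"],
    "pool/d": ["zrepl_1"],  # not covered by the mapping
}
local = {"backup/a": ["zrepl_1"], "backup/c": ["zrepl_3"]}
mapping = {"pool/a": "backup/a", "pool/b": "backup/b", "pool/c": "backup/c"}

result = pull_once(remote, local, mapping, "zrepl_")
```

With this example data, backup/a is replicated incrementally, backup/b needs an initial replication, and backup/c is reported as a conflict without preventing the other two from being handled.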

Local Job

Example: config/samples/localbackup/host1.yml

Parameter            Comment
type                 = local
name                 unique name of the job
mapping              mapping from source to target filesystem (both local)
snapshot_prefix      prefix for ZFS snapshots taken by this job
interval             snapshotting & replication interval
initial_repl_policy  initial replication policy (default = most_recent)
prune_lhs            pruning policy on left-hand-side (source)
prune_rhs            pruning policy on right-hand-side (target)

  • Main Task (every interval, patient)
    1. Evaluate mapping for local filesystems; those with a target filesystem are called mapped filesystems.
    2. Snapshot mapped filesystems with snapshot_prefix.
    3. Replicate mapped filesystems to their respective target filesystems:
      1. Only snapshots with prefix snapshot_prefix are replicated.
      2. If possible, incremental replication takes place.
      3. If the target filesystem does not exist, initial_repl_policy is used.
      4. On conflicts, an error is logged but replication of other mapped filesystems continues.
    4. The prune_lhs policy is triggered for all mapped filesystems
    5. The prune_rhs policy is triggered for all target filesystems

A local job is a combination of a source job and a pull job executed on the same machine.

Terminology

task

A job consists of one or more tasks and a task consists of one or more steps. Some tasks may be periodic while others wait for an event to occur.

patient task

A patient task is a task that is supposed to run once every interval. We call each start of the task an invocation.

  • If the task completes in less than interval, the task is restarted at last_invocation + interval.
  • Otherwise, a patient task
    • logs a warning as soon as a task exceeds its configured interval
    • waits for the last invocation to finish
    • logs a warning with the effective task duration
    • immediately starts a new invocation of the task
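
The scheduling rule for patient tasks can be written down as a short sketch. The function names are hypothetical and times are plain numbers (e.g. seconds); the sketch only computes start times and does not model the logging.

```python
from typing import List, Tuple


def next_invocation(last_invocation: float, finished_at: float,
                    interval: float) -> Tuple[float, bool]:
    """Return (start time of the next invocation, interval exceeded?)."""
    deadline = last_invocation + interval
    if finished_at <= deadline:
        # Task finished in time: wait until last_invocation + interval.
        return deadline, False
    # Task overran its interval: start again as soon as it has finished.
    return finished_at, True


def schedule(durations: List[float], interval: float,
             start: float = 0.0) -> List[float]:
    """Return the start time of each invocation for the given durations."""
    starts = []
    t = start
    for d in durations:
        starts.append(t)
        t, _ = next_invocation(t, t + d, interval)
    return starts


# interval = 10: the second invocation takes 13 and delays the third one.
starts = schedule([4.0, 13.0, 2.0], interval=10.0)  # [0.0, 10.0, 23.0]
```

Note that after an overrun the schedule is not realigned to the original grid: later invocations are counted from the delayed start.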