Warning: This document is for the development version of zrepl. The main version is stable.

Changelog

The changelog summarizes bugfixes that are deemed relevant for users and package maintainers. Developers should consult the git commit log or GitHub issue tracker.

We use the following annotations for classifying changes:

  • [CONFIG] Change that breaks the config. As a package maintainer, make sure to warn your users about config breakage somehow.
  • [BREAK] Change that breaks interoperability or persistent state representation with previous releases. As a package maintainer, make sure to warn your users about the breakage somehow. Note that even updating the package on both sides might not be sufficient, e.g. if persistent state needs to be migrated to a new format.
  • [MIGRATION] Migration that must be run by the user.
  • [FEATURE] Change that introduces new functionality.
  • [BUG] Change that fixes a bug, no regressions or incompatibilities expected.
  • [DOCS] Change to the documentation.
  • [MAINT] Maintenance changes.

0.4.0

  • [FEATURE] support setting zfs send / recv flags in the config (send: -wLcepbS, recv: -ox). Config docs here and here.
  • [FEATURE] parallel replication is now configurable (disabled by default, config docs here).
  • [FEATURE] New zrepl status UI:
    • Interactive job selection.
    • Interactively zrepl signal jobs.
    • Filter filesystems in the job view by name.
    • An approximation of the old UI is still included as --mode legacy but will be removed in a future release of zrepl.
  • [BUG] Actually use concurrency when listing zrepl abstractions & doing size estimation. These operations were accidentally made sequential in zrepl 0.3.
  • [BUG] Job hang-up during second replication attempt.
  • [BUG] Data race conditions in the dataconn rpc stack.
  • [MAINT] Update to protobuf v1.25 and grpc 1.35.

For users who skipped the 0.3.1 update: please make sure your pruning grid config is correct. The following bugfix in 0.3.1 caused problems for some users:

  • [BUG] pruning: grid: add all snapshots that do not match the regex to the rule’s destroy list.

Note

zrepl is a spare-time project primarily developed by Christian Schwarz.
You can support maintenance and feature development through one of the following services:
Donate via Patreon, GitHub Sponsors, Liberapay, or PayPal.
Note that PayPal processing fees are relatively high for small donations.
For SEPA wire transfer and commercial support, please contact Christian directly.

0.3.1

Mostly a bugfix release for zrepl 0.3.

  • [FEATURE] pruning: add optional regex field to last_n rule
  • [DOCS] pruning: grid: improve documentation and add an example
  • [BUG] pruning: grid: add all snapshots that do not match the regex to the rule’s destroy list. This brings the implementation in line with the docs.
  • [BUG] easyrsa script in docs
  • [BUG] platformtest: fix skipping encryption-only tests on systems that don’t support encryption
  • [BUG] replication: report AttemptDone if no filesystems are replicated
  • [FEATURE] status + replication: warning if replication succeeded without any filesystem being replicated
  • [DOCS] update multi-job & multi-host setup section
  • RPM Packaging
  • CI infrastructure rework
  • Continuous deployment of that new stable branch to zrepl.github.io.

0.3

This is a big one! Headlining features:

  • Resumable Send & Recv Support: No knobs required, automatically used where supported.
  • Encrypted Send & Recv Support: For OpenZFS native encryption, configurable at the job level, i.e., for all filesystems a job is responsible for.
  • Replication Guarantees: Automatic use of ZFS holds and bookmarks to protect a replicated filesystem from losing synchronization between sender and receiver. By default, zrepl guarantees that incremental replication will always be possible and interrupted steps will always be resumable.

Tip

We highly recommend studying the updated overview section of the configuration chapter to understand how replication works.

Tip

Go 1.15 changed the default TLS validation policy to require Subject Alternative Names (SAN) in certificates. The openssl commands we provided in the quick-start guides up to and including the zrepl 0.3 docs seem not to work properly. If you encounter certificate validation errors regarding SAN and wish to continue to use your old certificates, start the zrepl daemon with the environment variable GODEBUG=x509ignoreCN=0. Alternatively, generate new certificates with SANs (see both options in the TLS transport docs).
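
As a rough illustration of the GODEBUG workaround described above, the following Python sketch prepares an environment for launching the daemon. The helper name daemon_env is hypothetical, and the commented-out launch line assumes a zrepl binary on PATH; only the x509ignoreCN=0 value is taken from the text above. GODEBUG is a comma-separated list of name=value pairs, so the sketch extends an existing value instead of overwriting it.

```python
import os


def daemon_env(base_env=None):
    """Return a copy of the environment with x509ignoreCN=0 set in GODEBUG.

    Hypothetical helper: an existing GODEBUG value is extended, not replaced.
    """
    env = dict(os.environ if base_env is None else base_env)
    settings = [s for s in env.get("GODEBUG", "").split(",") if s]
    # Drop any earlier x509ignoreCN setting so the workaround value wins.
    settings = [s for s in settings if not s.startswith("x509ignoreCN=")]
    settings.append("x509ignoreCN=0")
    env["GODEBUG"] = ",".join(settings)
    return env


# To launch the daemon with this environment (assumes zrepl is on PATH):
#   import subprocess
#   subprocess.run(["zrepl", "daemon"], env=daemon_env())
```

Note that this is a stopgap: Go 1.17 removed the x509ignoreCN=0 setting, so binaries built with newer Go versions need certificates with SANs regardless.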

Quick-start guides:

Additional changelog:

  • [BREAK] Go 1.15 TLS changes mentioned above.
  • [BREAK] [CONFIG] more restrictive job names than in prior zrepl versions. Starting with this version, job names are going to be embedded into ZFS holds and bookmark names (see this section for details). Therefore you might need to adjust your job names. Note that jobs cannot be renamed easily once you start using zrepl 0.3.
  • [BREAK] [MIGRATION] replication cursor representation changed
    • zrepl now manages the replication cursor bookmark per job-filesystem tuple instead of a single replication cursor per filesystem. In the future, this will permit multiple sending jobs to send from the same filesystems.
    • ZFS does not allow bookmark renaming, thus we cannot migrate the old replication cursors.
    • zrepl 0.3 will automatically create cursors in the new format for new replications, and warn if it still finds ones in the old format.
    • Run zrepl migrate replication-cursor:v1-v2 to safely destroy old-format cursors. The migration ensures that only old-format cursors that have been superseded by new-format cursors are destroyed.
  • [FEATURE] New option listen_freebind (tcp, tls, prometheus listener)
  • [FEATURE] issue #341 Prometheus metric for failing replications + corresponding Grafana panel
  • [FEATURE] issue #265 transport/tcp: support for CIDR masks in client IP whitelist
  • [FEATURE] documented subcommand to generate bash and zsh completions
  • [FEATURE] issue #307 chrome://trace-compatible activity tracing of zrepl daemon activity
  • [FEATURE] logging: trace IDs for better log entry correlation with concurrent replication jobs
  • [FEATURE] experimental environment variable for parallel replication (see issue #306)
  • [BUG] missing logger context vars in control connection handlers
  • [BUG] improved error messages on zfs send errors
  • [BUG] [DOCS] snapshotting: clarify sync-up behavior and warn about filesystems that will not be snapshotted until the sync-up phase is over
  • [BUG] transport/ssh: do not leak zombie ssh process on connection failures
  • [DOCS] Installation: FreeBSD jail with iocage
  • [DOCS] Document new replication features in the config overview and replication/design.md.
  • [MAINTAINER NOTICE] New platform tests in this version, please make sure you run them for your distro!
  • [MAINTAINER NOTICE] Please add the shell completions to the zrepl packages.
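
To make the CIDR whitelist feature from issue #265 (listed above) more concrete, here is a minimal Python sketch of the matching idea, using the standard ipaddress module. It is not zrepl's implementation (zrepl is written in Go), and it does not reproduce zrepl's config syntax; the function name client_allowed and the example entries are hypothetical.

```python
import ipaddress


def client_allowed(client_ip, allowed_entries):
    """Return True if client_ip matches any whitelist entry.

    Entries may be single addresses ("192.168.1.10") or CIDR networks
    ("10.0.0.0/8"). A single address is treated as a one-host network.
    """
    addr = ipaddress.ip_address(client_ip)
    for entry in allowed_entries:
        network = ipaddress.ip_network(entry, strict=False)
        # An IPv4 address never matches an IPv6 network, and vice versa.
        if addr.version == network.version and addr in network:
            return True
    return False
```

With the hypothetical whitelist ["10.0.0.0/8", "192.168.1.10"], a client connecting from 10.1.2.3 is accepted through the network entry, while 11.0.0.1 matches neither entry and is rejected.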

0.2.1

  • [FEATURE] Illumos (and Solaris) compatibility and binary builds (thanks, MNX.io)
  • [FEATURE] 32bit binaries for Linux and FreeBSD (untested, though)
  • [BUG] better error messages in ssh+stdinserver transport
  • [BUG] systemd + ssh+stdinserver: automatically create /var/run/zrepl/stdinserver
  • [BUG] crash if Prometheus listening socket cannot be opened
  • [MAINTAINER NOTICE] Makefile refactoring, see commit 080f2c0

0.2

  • [FEATURE] Pre- and Post-Snapshot Hooks with built-in support for MySQL and Postgres checkpointing as well as custom scripts (thanks, @overhacked!)
  • [FEATURE] Use zfs destroy pool/fs@snap1,snap2,... CLI feature if available
  • [FEATURE] Linux ARM64 Docker build support & binary builds
  • [FEATURE] zrepl status now displays snapshotting reports
  • [FEATURE] zrepl status --job <JOBNAME> filter flag
  • [BUG] i386 build
  • [BUG] early validation of host:port tuples in config
  • [BUG] zrepl status now supports TERM=screen (tmux on FreeBSD / FreeNAS)
  • [BUG] ignore connection reset by peer errors when shutting down connections
  • [BUG] correct error messages when receive-side pool or root_fs dataset is not imported
  • [BUG] fail fast for misconfigured local transport
  • [BUG] race condition in replication report generation would crash the daemon when running zrepl status
  • [BUG] rpc goroutine leak in push mode if zfs recv fails on the sink side
  • [MAINTAINER NOTICE] Go modules for dependency management both inside and outside of GOPATH (lazy.sh and Makefile force GO111MODULE=on)
  • [MAINTAINER NOTICE] make platformtest target to check zrepl’s ZFS abstractions (screen scraping, etc.). These tests only work on a system with ZFS installed, and must be run as root because they create a file-backed pool for each test case. The pool name zreplplatformtest is reserved for this use case. Only run make platformtest on test systems, e.g. a FreeBSD VM image.
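
The batched zfs destroy pool/fs@snap1,snap2,... form mentioned in the feature list above takes one filesystem followed by a comma-separated list of snapshot names. The Python sketch below only shows how such an argument can be assembled; the function name batch_destroy_command is hypothetical, and the sketch builds the command without executing it.

```python
def batch_destroy_command(filesystem, snapshot_names):
    """Build a single `zfs destroy` argv that names several snapshots.

    snapshot_names are bare names (the part after "@"), all belonging
    to the same filesystem.
    """
    if not snapshot_names:
        raise ValueError("at least one snapshot name is required")
    for name in snapshot_names:
        if not name or "@" in name or "," in name:
            raise ValueError(f"expected a bare snapshot name, got {name!r}")
    joined = ",".join(snapshot_names)
    return ["zfs", "destroy", f"{filesystem}@{joined}"]
```

For example, batch_destroy_command("pool/fs", ["snap1", "snap2"]) yields the argument list for zfs destroy pool/fs@snap1,snap2, replacing two separate invocations with one.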

0.1.1

  • [BUG] issue #162 commit d6304f4: fix I/O timeout errors on variable receive rate
    • A significant reduction or sudden stall of the receive rate (e.g. recv pool has other I/O to do) would cause a writev I/O timeout error after approximately ten seconds.

0.1

This release is a milestone for zrepl and required significant refactoring, if not rewrites, of substantial parts of the application. It breaks both configuration and transport format, and thus requires manual intervention and updates on both sides of a replication setup.

Danger

The changes in the pruning system for this release require you to explicitly define keep rules: for any snapshot that you want to keep, at least one rule must match. This is different from previous releases where pruning only affected snapshots with the configured snapshotting prefix. Make sure that snapshots to be kept or ignored by zrepl are covered, e.g. by using the regex keep rule. Learn more in the config docs…
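
The "at least one rule must match" semantics can be sketched in a few lines of Python. This is an illustrative model, not zrepl's pruning code: the helpers keep_regex and keep_last_n are hypothetical stand-ins for keep rules, the regex is assumed to be matched unanchored against the snapshot name, and the snapshot names are made up.

```python
import re


def plan_pruning(snapshots, keep_rules):
    """Split snapshots into (keep, destroy) lists.

    A snapshot is kept if at least one rule returns True for it;
    a snapshot that no rule matches ends up in the destroy list.
    """
    keep, destroy = [], []
    for snap in snapshots:
        if any(rule(snap) for rule in keep_rules):
            keep.append(snap)
        else:
            destroy.append(snap)
    return keep, destroy


def keep_regex(pattern):
    """Hypothetical stand-in for a regex keep rule (unanchored match)."""
    compiled = re.compile(pattern)
    return lambda snap: compiled.search(snap) is not None


def keep_last_n(ordered_snapshots, n):
    """Hypothetical stand-in for a rule keeping the n newest snapshots.

    Assumes ordered_snapshots is sorted oldest to newest.
    """
    newest = set(ordered_snapshots[-n:]) if n > 0 else set()
    return lambda snap: snap in newest
```

Given the snapshots zrepl_0001, zrepl_0002, zrepl_0003 and manual_before_upgrade, a single keep_last_n rule over the three zrepl_ snapshots with n=1 keeps only zrepl_0003 and puts manual_before_upgrade on the destroy list. Adding keep_regex("^manual_") as a second rule is what protects the manually created snapshot.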

Notes to Package Maintainers

  • Notify users about config changes and migrations (see changes attributed with [BREAK] and [MIGRATION] below)
  • If the daemon crashes, the stack trace produced by the Go runtime and possibly diagnostic output of zrepl will be written to stderr. This behavior is independent of the stdout outlet type. Please make sure the stderr output of the daemon is captured somewhere. To conserve precious stack traces, make sure that multiple service restarts do not directly discard previous stderr output.
  • Make it obvious for users how to set the GOTRACEBACK environment variable to GOTRACEBACK=crash. This setting causes SIGABRT on panics and can be used to capture a coredump of the panicking process. To that end, make sure that your package build system, your OS’s coredump collection, and the Go delve debugger work together. Use your build system to package the Go program in this tutorial on Go coredumps and the delve debugger, and make sure that symbol resolution etc. works on coredumps captured from the binary produced by your build system. (Pay special attention to symbol stripping, etc.)
  • Consider using the zrepl configcheck subcommand in startup scripts to abort a restart that would fail due to an invalid config.
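
One possible shape for such a startup guard, sketched in Python: run zrepl configcheck first and only restart the service if the check succeeds. The sketch assumes that configcheck exits with a non-zero status on an invalid config and that the service is a systemd unit named zrepl; both the unit name and the function name restart_if_config_valid are assumptions, so adapt them to your init system.

```python
import subprocess


def restart_if_config_valid(run=subprocess.run,
                            restart_cmd=("systemctl", "restart", "zrepl")):
    """Run `zrepl configcheck` and restart the service only on success.

    Returns True if the restart command was issued, False if the config
    check failed and the restart was skipped. The `run` parameter exists
    so the function can be exercised without touching a real system.
    """
    check = run(["zrepl", "configcheck"])
    if check.returncode != 0:
        # Leave the currently running daemon untouched.
        return False
    run(list(restart_cmd), check=True)
    return True
```

The point of the ordering is that a broken config never takes down a daemon that is currently working: the old process keeps running until the new config validates.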

Changes

  • [BREAK] [MIGRATION] Placeholder property representation changed
    • The placeholder property now uses on|off as values instead of hashes of the dataset path. This permits renames of the sink filesystem without updating all placeholder properties.
    • Relevant for 0.0.X-0.1-rc* to 0.1 migrations
    • Make sure your config is valid with zrepl configcheck
    • Run zrepl migrate 0.0.X:0.1:placeholder
  • [FEATURE] issue #55: Push replication (see push job and sink job)
  • [FEATURE] TCP Transport
  • [FEATURE] TCP + TLS client authentication transport
  • [FEATURE] issue #111: RPC protocol rewrite
    • [BREAK] Protocol breakage; Update and restart of all zrepl daemons is required.
    • Use gRPC for control RPCs and a custom protocol for bulk data transfer.
    • Automatic retries for network-temporary errors
      • Limited to errors during replication for this release. Addresses the common problem of ISP-forced reconnection at night, but will become way more useful with resumable send & recv support. Pruning errors are handled per FS, i.e., a prune RPC is attempted at least once per FS.
  • [FEATURE] Proper timeout handling for the SSH transport
    • [BREAK] Requires Go 1.11 or later.
  • [BREAK] [CONFIG]: mappings are no longer supported
    • Receiving sides (pull and sink job) specify a single root_fs. Received filesystems are then stored per client in ${root_fs}/${client_identity}. See Jobs & How They Work Together for details.
  • [FEATURE] [BREAK] [CONFIG] Manual snapshotting + triggering of replication
    • [FEATURE] issue #69: include manually created snapshots in replication
    • [CONFIG] manual and periodic snapshotting types
    • [FEATURE] zrepl signal wakeup JOB subcommand to trigger replication + pruning
    • [FEATURE] zrepl signal reset JOB subcommand to abort current replication + pruning
  • [FEATURE] [BREAK] [CONFIG] New pruning system
    • The active side of a replication (pull or push) decides what to prune for both sender and receiver. The RPC protocol is used to execute the destroy operations on the remote side.
    • New pruning policies (see configuration documentation)
      • The decision about which snapshots are pruned is now based on keep rules
      • [FEATURE] issue #68: keep rule not_replicated prevents divergence of sender and receiver
    • [FEATURE] [BREAK] Bookmark pruning is no longer necessary
      • Per filesystem, zrepl creates a single bookmark (#zrepl_replication_cursor) and moves it forward with the most recent successfully replicated snapshot on the receiving side.
      • Old bookmarks created by prior versions of zrepl (named like their corresponding snapshot) must be deleted manually.
      • [CONFIG] keep_bookmarks parameter of the grid keep rule has been removed
  • [FEATURE] zrepl status for live-updating replication progress (it’s really cool!)
  • [FEATURE] Snapshot- & pruning-only job type (for local snapshot management)
  • [FEATURE] issue #67: Expose Prometheus metrics via HTTP (config docs)
    • Compatible Grafana dashboard shipping in dist/grafana
  • [CONFIG] Logging outlet types must be specified using the type key instead of the outlet key
  • [BREAK] issue #53: CLI: zrepl control * subcommands have been made direct subcommands of zrepl *
  • [BUG] Goroutine leak on ssh transport connection timeouts
  • [BUG] issue #81 issue #77: handle failed accepts correctly (source job)
  • [BUG] issue #100: fix incompatibility with ZoL 0.8
  • [FEATURE] issue #115: logging: configurable syslog facility
  • [FEATURE] Systemd unit file in dist/systemd
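
The ${root_fs}/${client_identity} layout that replaces mappings (see the [CONFIG] entry above) can be illustrated with a small Python sketch. The prefix composition follows the changelog; appending the sender-side dataset path below that prefix is an assumption of this sketch, and the function name receive_path and all dataset names are hypothetical.

```python
def receive_path(root_fs, client_identity, sender_fs=""):
    """Compose the dataset under which a client's filesystems are stored.

    The root_fs/client_identity prefix is taken from the changelog.
    Placing the sender's dataset path below it is an assumption.
    """
    parts = [root_fs.strip("/"), client_identity]
    if sender_fs:
        parts.append(sender_fs.strip("/"))
    if any(not part or "//" in part for part in parts):
        raise ValueError("empty dataset path component")
    return "/".join(parts)
```

For a receiving job with root_fs storage/zrepl/sink and a client identified as prod1, the per-client prefix is storage/zrepl/sink/prod1. Because each client gets its own subtree, two clients that both send a dataset named zroot/var/db do not collide on the receiving side.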

Previous Releases

Note

Due to limitations in our documentation system, we only show the changelog between the last release and the time this documentation is built. For the changelog of previous releases, use the version selection in the hosted version of these docs at zrepl.github.io.