Restic Repository Maintenance: Prune, Check, Forget

Your Repo Is 800 GB and `check` Just Timed Out

It starts innocently. You set up restic, pointed it at Backblaze, wrote a cron job, and forgot about it. Exactly as the gods of backup intended.

Then one day you check on it, either because something feels wrong and you’re a responsible sysadmin or because your bill went up, and you’re staring at a repo that has ballooned to 800 GB. You run restic check to make sure things are healthy before you touch anything. Four hours later, your SSH session times out and you still don’t have an answer.

Welcome to restic repository maintenance. It’s not glamorous, but neither is restoring from a corrupt backup at 2 AM.

Let’s talk about what actually needs to happen, in what order, and how to automate it so future-you doesn’t end up here again.

The Lifecycle: What Actually Happens to Your Data

Restic stores backup data in pack files inside the repo. When you run restic backup, it:

Chunks your files using content-defined chunking (CDC)
Deduplicates chunks that already exist in the repo
Writes new chunks to pack files
Creates a snapshot: basically a JSON manifest pointing to all the chunks that make up your backup at that point in time
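To make the deduplication step concrete, here is a toy sketch. It is not restic's algorithm: it uses fixed-size chunks and cksum where restic uses content-defined chunking and SHA-256, and count_chunks is a hypothetical helper name. It only shows why repeated content costs no extra storage.

```shell
# count_chunks FILE SIZE
# Toy model of deduplication: split FILE into SIZE-byte chunks, checksum each
# one, and print "<total chunks> <unique chunks>".
# Assumption: fixed-size chunks and cksum stand in for restic's
# content-defined chunking and SHA-256 hashing.
count_chunks() {
  dir=$(mktemp -d) || return 1
  split -b "$2" "$1" "$dir/chunk." || { rm -rf "$dir"; return 1; }
  # One checksum line per chunk file.
  sums=$(for f in "$dir"/chunk.*; do cksum < "$f"; done)
  total=$(printf '%s\n' "$sums" | wc -l)
  unique=$(printf '%s\n' "$sums" | sort -u | wc -l)
  rm -rf "$dir"
  # Arithmetic expansion strips any padding that wc adds on some systems.
  echo "$((total)) $((unique))"
}
```

For a 12-byte file containing AAAABBBBAAAA and a chunk size of 4, this prints 3 2: three chunks, of which only two would need to be stored.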

The problem is restic never deletes anything automatically. Every snapshot you’ve ever taken is still there. Every chunk those snapshots reference is still stored. Your repo only grows.

This is where the trio comes in:

forget: removes the snapshot records that fall outside your retention policy. Doesn’t actually delete any data yet: the chunks those snapshots referenced stay in the repo until prune runs.
prune: goes through the pack files, figures out which chunks are no longer referenced by any snapshot, and removes them.
check: verifies repo integrity. Reads the index, checks that all referenced chunks exist, optionally reads and verifies actual data.

The order matters: forget, then prune, then check. Running check before prune is fine for sanity checks, but don’t expect it to tell you what prune will clean up.
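As a minimal sketch of that ordering, the three steps can be chained so that a failure stops the sequence. The maintain function name is hypothetical, and the sketch assumes RESTIC_REPOSITORY and a password source are already exported.

```shell
# maintain: run the three maintenance steps in order, stopping at the first
# failure so that check never runs after a failed prune.
# Assumes RESTIC_REPOSITORY and a password source are already exported.
maintain() {
  restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --keep-yearly 2 &&
  restic prune &&
  restic check
}
```

Call maintain from a script or an interactive shell; its exit status is that of the first step that failed, or zero if all three succeeded.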

`forget`: Teaching Restic What to Keep

restic forget without any flags does absolutely nothing useful: it just lists snapshots. You need to tell it your retention policy:

restic forget \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 6 \
  --keep-yearly 2 \
  --prune

This keeps:

One snapshot per day for the last 7 days
One per week for the last 4 weeks
One per month for the last 6 months
One per year for the last 2 years

The --prune flag tells forget to immediately run prune after marking snapshots. Convenient, but you lose some control. I’ll explain why you might want to keep them separate in a minute.

Before you commit to a policy, run it in dry-run mode:

restic forget \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 6 \
  --keep-yearly 2 \
  --dry-run

You’ll see exactly which snapshots would be removed. Look at this output carefully. If you set --keep-daily 7 but you only run backups weekly, restic will happily keep your last 7 weekly snapshots, which span seven weeks, not 7 days’ worth. Each --keep-* flag counts periods that actually contain a snapshot, not calendar time elapsed. This trips people up constantly.
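The counting rule is easier to see on a toy model. The sketch below is a simplification, not restic's implementation: the hypothetical keep_daily helper reads snapshot timestamps and keeps the newest snapshot from each of the N most recent days that have one.

```shell
# keep_daily N
# Read "YYYY-MM-DD HH:MM" snapshot timestamps on stdin and print the ones a
# --keep-daily N style rule would keep: the newest snapshot of each of the
# N most recent days that have at least one snapshot.
# Simplified model for illustration only.
keep_daily() {
  # Newest first; ISO timestamps sort correctly as plain text.
  sort -r | awk -v n="$1" '
    # First time we see a date it is that day'"'"'s newest snapshot.
    !seen[$1]++ { if (++days <= n) print }
  '
}
```

Fed the timestamps 2024-03-03 02:00, 2024-03-10 02:00, 2024-03-17 02:00 and 2024-03-17 14:00 with N=2, it keeps 2024-03-17 14:00 and 2024-03-10 02:00. The two kept snapshots are a week apart, because empty days do not use up the count.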

Tags and Hosts

If you’re backing up multiple machines or datasets to the same repo, scope your forget with --tag or --host:

restic forget \
  --host homelab-nas \
  --keep-daily 7 \
  --keep-weekly 4 \
  --prune

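Without this, the policy is applied to every host in the repo (forget groups snapshots by host and paths by default and applies the policy to each group), so it might eat snapshots from hosts you didn’t intend to touch. Check restic snapshots to see what tags and hosts you’re working with.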
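One way to keep the scope explicit is to loop over the hosts you mean to touch. This is a sketch under assumptions: forget_for_hosts is a hypothetical helper, and the policy values are placeholders.

```shell
# forget_for_hosts HOST...
# Preview the same retention policy for each named host separately.
# Remove --dry-run once the output looks right.
forget_for_hosts() {
  for host in "$@"; do
    restic forget --host "$host" --keep-daily 7 --keep-weekly 4 --dry-run || return 1
  done
}
```

For example, forget_for_hosts homelab-nas previews the policy for that one machine and leaves every other host's snapshots alone.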

`prune`: The Part That Actually Frees Space

Once forget has removed snapshots, prune cleans up the orphaned data. This is the expensive operation: it has to read the index, figure out what’s referenced, and rewrite pack files that are only partially used.

Basic prune:

restic prune

Modern restic (v0.12+) supports --max-unused to control how aggressive the cleanup is:

restic prune --max-unused 5%

This tells restic to leave up to 5% of the repo as unused data (unreferenced chunks sitting in pack files that aren’t worth rewriting yet). Lower threshold = more thorough cleanup = more time and bandwidth. The default is 5%, which is sane for most setups.
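As a rough worked example of what that threshold means in practice, the arithmetic looks like this. The max_unused_gb helper is hypothetical, and restic measures the percentage against the repo size after pruning, so treat the result as an estimate.

```shell
# max_unused_gb REPO_GB PERCENT
# Approximate amount of unreferenced data, in whole GB (rounded down), that
# prune may leave in place for a given --max-unused percentage.
max_unused_gb() {
  echo $(( $1 * $2 / 100 ))
}
```

With the 800 GB repo from the introduction, max_unused_gb 800 5 prints 40: prune may leave roughly 40 GB of dead data behind rather than rewrite more packs.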

v0.17/0.18+ Is a Big Deal

If you’re on an older restic release, upgrade. Seriously.

Restic v0.14 added compression (repository format version 2) and, with it, --repack-uncompressed, which rewrites pack files that were created before compression was supported. If you started your repo before that, you’ve got a mix of compressed and uncompressed packs once the repo is upgraded to the new format. This flag fixes it:

restic prune --repack-uncompressed
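For a repo that predates compression, the sequence can be sketched as below. It assumes restic 0.14 or newer and a repo still on format version 1; compress_old_repo is a hypothetical helper name. If the repo is already on version 2, the prune line alone is enough.

```shell
# compress_old_repo
# Upgrade a format-v1 repository to v2, then rewrite the packs that were
# stored uncompressed.
# Assumes restic 0.14+ and that RESTIC_REPOSITORY and a password source are
# already exported.
compress_old_repo() {
  restic migrate upgrade_repo_v2 &&
  restic prune --repack-uncompressed
}
```

Expect this prune to take much longer than usual, since it rewrites every uncompressed pack.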

v0.17 and v0.18 also improved prune performance by reducing the amount of index data that needs to be loaded into memory. If you’ve been avoiding prune because it OOMs your NAS, try again with the latest version.

`check`: The Right Way to Not Destroy Your Weekend

Here’s what most people do wrong: they run restic check --read-data on an 800 GB repo stored on Backblaze B2. This downloads every byte of your backup to verify it. At typical B2 download speeds, this takes forever and costs you money.

Don’t do that.
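To put a number on "forever", here is a back-of-the-envelope estimate. The 50 Mbit/s sustained download rate used below is an assumption for illustration, not a measured B2 figure, and download_hours is a hypothetical helper.

```shell
# download_hours SIZE_GB LINK_MBIT
# Whole hours (rounded down) needed to download SIZE_GB decimal gigabytes
# over a link that sustains LINK_MBIT megabits per second.
download_hours() {
  # GB -> megabits (x 8000), then seconds, then hours.
  echo $(( $1 * 8000 / $2 / 3600 ))
}
```

download_hours 800 50 prints 35: a day and a half of continuous download before check can report anything, which is why a four-hour SSH session never stood a chance.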

Instead, use subset checking:

restic check --read-data-subset=1/10

This reads the first of ten groups of pack files. The n/t selection is deterministic: the same n always picks the same packs, so bump n on each run (1/10, then 2/10, and so on). Run this weekly in your automation, and over 10 weeks you’ve verified the entire repo without ever hammering your bandwidth in a single shot.
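With the n/t form, the same n always selects the same packs, so covering the whole repo means changing n between runs. One way to do that without keeping state is to derive n from the week number. This is a sketch; subset_for_week is a hypothetical helper.

```shell
# subset_for_week WEEK TOTAL
# Map a week number (as printed by `date +%V`, possibly with a leading zero)
# to an "n/TOTAL" subset spec with 1 <= n <= TOTAL.
subset_for_week() {
  week=${1#0}                      # "08" -> "8" so it is not parsed as octal
  echo "$(( week % $2 + 1 ))/$2"
}
```

Used as restic check --read-data-subset="$(subset_for_week "$(date +%V)" 10)", consecutive weekly runs step through all ten groups. The sequence skips slightly at the turn of the year, which only delays a group by a few weeks.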

For the paranoid (and you should be slightly paranoid about backups), there’s also:

restic check --read-data-subset=10%

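Same idea, percentage-based. Here restic picks a random 10% of the pack files on each run, so full coverage over time is likely rather than guaranteed. A size such as 1G works the same way. Pick whichever makes more intuitive sense to you.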

A check without --read-data or --read-data-subset still verifies the index and that all pack files referenced by snapshots actually exist. It just doesn’t verify the contents of those pack files. It’s fast and catches the most common failure modes (missing files, corrupt index). Do this after every prune.

The Remote Is Slow and Prune Is Painful

If your repo lives on a remote backend (Backblaze B2, S3, SFTP, whatever), prune is going to be painful. It has to download pack files, figure out which chunks to keep, then re-upload modified packs and delete the originals. On a slow connection, forget plus prune on a 500 GB repo can take six hours.

A few ways to make this hurt less:

Option 1: REST Server as a Local Cache

Run rest-server locally and use it as your primary backup target. It’s a lightweight HTTP server that implements the restic REST backend protocol. Fast local I/O, then you sync to remote storage separately.

docker run -p 8000:8000 \
  -v /mnt/backup/restic-rest:/data \
  -e DISABLE_AUTHENTICATION=1 \
  -e OPTIONS="--append-only" \
  restic/rest-server

Note the --append-only flag: it enables append-only mode, which means the server refuses to delete or overwrite existing repo data. Even if ransomware gets your backup client and knows your restic password, it can’t delete your backup data through the REST API. You need direct server access to run prune. This is the right tradeoff for most home lab setups.

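Set your repo URL to rest:http://localhost:8000/myrepo and backups go over fast local I/O. Because the server is append-only, run prune on the server host against the repo directory itself (here /mnt/backup/restic-rest/myrepo), where it also runs entirely locally, fast.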
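Because an append-only server refuses deletes, the two roles end up using two different repo addresses. A minimal sketch, assuming the repo is named myrepo and sits directly under the mounted data directory from the docker command (both function names are hypothetical):

```shell
# Client side: back up through the append-only REST server.
backup_via_rest() {
  restic -r rest:http://localhost:8000/myrepo backup "$@"
}

# Server side: prune directly against the directory the server stores data
# in, bypassing the REST API and its append-only restriction.
# Assumption: the repo lives at /mnt/backup/restic-rest/myrepo.
prune_on_server() {
  restic -r /mnt/backup/restic-rest/myrepo prune
}
```

Only the account that runs prune_on_server needs write access to the data directory; backup clients only ever see the REST endpoint.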

Option 2: `rclone serve restic`

If your data is already on cloud storage and you don’t want to move it, rclone can serve a local REST interface backed by your remote:

rclone serve restic \
  --addr localhost:8000 \
  b2:my-backup-bucket/restic

Then point restic at rest:http://localhost:8000/. Restic talks to a local endpoint while rclone handles the transfers to the remote. Every pack still crosses the network, so this is not as fast as true local storage, but it can be much better than raw SFTP or the native B2 API.

Option 3: Rethink Your Prune Frequency

Prune doesn’t need to run after every backup. Run forget (without --prune) after every backup to mark stale snapshots, then run prune once a week or month. The repo will grow slightly between prune runs due to unreferenced data, but your daily backup windows stay short.
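A sketch of that split, keyed on the day of the week. The after_backup name and the choice of Sunday are assumptions; the weekday number comes from date +%u.

```shell
# after_backup DOW
# DOW is the ISO weekday from `date +%u` (1 = Monday ... 7 = Sunday).
# forget runs every time; the expensive prune runs only on Sundays.
after_backup() {
  restic forget --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --keep-yearly 2 || return 1
  if [ "$1" -eq 7 ]; then
    restic prune
  fi
}
```

Call it as after_backup "$(date +%u)" at the end of the daily backup script.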

Cache Directory: The Silent Performance Killer

Restic uses a local cache at ~/.cache/restic/ (or $XDG_CACHE_HOME/restic/) to store index data and pack metadata locally. This cache makes repeated operations on the same repo faster.

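The problem: on remote repos, the cache can get stale, or old cache directories can pile up over time. To remove cache directories that are no longer in use:

restic cache --cleanup

If restic is complaining about mismatched cache data, bypass the cache for a single run with the global --no-cache flag:

restic --no-cache snapshots  # ignore the local cache for this run

To nuke it entirely, delete the cache directory. Restic rebuilds it on the next run.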

When running restic in scripts or containers that don’t have persistent home directories, set RESTIC_CACHE_DIR explicitly:

export RESTIC_CACHE_DIR=/var/cache/restic

Put this in your systemd service environment or cron environment. Without it, restic will rebuild the cache from scratch on every run, which is expensive on remote repos.
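In a wrapper script, the directory has to exist and be writable before restic starts. A minimal sketch (use_cache_dir is a hypothetical helper):

```shell
# use_cache_dir DIR
# Create DIR with owner-only permissions and point restic at it for the
# rest of this shell session.
use_cache_dir() {
  mkdir -p "$1" && chmod 700 "$1" && export RESTIC_CACHE_DIR="$1"
}
```

For example, use_cache_dir /var/cache/restic at the top of a backup script, run as the same user that runs restic.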

Automation: Systemd Timer + Healthchecks

Cron jobs are fine, but systemd timers give you logging in the journal and, with Persistent=true, run a missed job at the next boot instead of silently skipping it.

Here’s a complete example. First, the environment file (keep your restic secrets out of the unit files):

RESTIC_REPOSITORY=b2:my-backup-bucket:/restic
RESTIC_PASSWORD_FILE=/etc/restic/password
# The b2: backend reads B2_ACCOUNT_ID and B2_ACCOUNT_KEY.
# AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are only needed for s3: repositories.
B2_ACCOUNT_ID=your_b2_account_id
B2_ACCOUNT_KEY=your_b2_app_key
HEALTHCHECK_URL=https://hc-ping.com/your-uuid-here

The backup service:

[Unit]
Description=Restic backup
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
EnvironmentFile=/etc/restic/env
ExecStartPre=/bin/sh -c 'curl -fsS -m 10 --retry 5 -o /dev/null "${HEALTHCHECK_URL}/start"'
ExecStart=/usr/local/bin/restic backup \
  --one-file-system \
  --tag systemd \
  /home /etc /var/lib
ExecStartPost=/bin/sh -c 'curl -fsS -m 10 --retry 5 -o /dev/null "${HEALTHCHECK_URL}"'
ExecStopPost=/bin/sh -c 'if [ "$EXIT_STATUS" != "0" ]; then curl -fsS -m 10 --retry 5 -o /dev/null "${HEALTHCHECK_URL}/fail"; fi'

The maintenance service:

[Unit]
Description=Restic maintenance (forget, prune, check)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
EnvironmentFile=/etc/restic/env
ExecStart=/usr/local/bin/restic forget \
  --tag systemd \
  --keep-daily 7 \
  --keep-weekly 4 \
  --keep-monthly 6 \
  --keep-yearly 2
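# In unit files a literal percent sign must be written as %%.
ExecStart=/usr/local/bin/restic prune --max-unused 5%%
# 10%% reads a random tenth of the pack files on each run.
ExecStart=/usr/local/bin/restic check --read-data-subset=10%%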
ExecStartPost=/bin/sh -c 'curl -fsS -m 10 --retry 5 -o /dev/null "${HEALTHCHECK_URL}"'
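If you would rather keep the ping logic in one place than spread it across ExecStartPre, ExecStartPost and ExecStopPost lines, a small wrapper can do the same job. This is a sketch, not part of the units above: run_with_ping is a hypothetical helper, and it relies on the /start and /fail endpoints that healthchecks.io provides.

```shell
# run_with_ping URL CMD [ARGS...]
# Ping URL/start, run the command, then ping URL on success or URL/fail on
# failure. A failed ping never changes the result: the function returns 0
# if the command succeeded and the command's exit status otherwise.
run_with_ping() {
  url=$1
  shift
  curl -fsS -m 10 --retry 5 -o /dev/null "$url/start" || true
  if "$@"; then
    curl -fsS -m 10 --retry 5 -o /dev/null "$url" || true
  else
    status=$?
    curl -fsS -m 10 --retry 5 -o /dev/null "$url/fail" || true
    return "$status"
  fi
}
```

In a plain shell script this would be called as run_with_ping "$HEALTHCHECK_URL" restic prune --max-unused 5%, which also reports failures that the maintenance unit as written stays silent about.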

The timer to run maintenance weekly:

[Unit]
Description=Weekly restic maintenance

[Timer]
OnCalendar=Sun 03:00
RandomizedDelaySec=30min
Persistent=true

[Install]
WantedBy=timers.target

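Enable it all (this assumes the units are saved as restic-backup.service, restic-maintenance.service and restic-maintenance.timer, and that you have also written a restic-backup.timer, built like the maintenance timer but with a daily OnCalendar):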

systemctl daemon-reload
systemctl enable --now restic-backup.timer
systemctl enable --now restic-maintenance.timer

Check your timers are registered:

systemctl list-timers 'restic*'

The RandomizedDelaySec spreads the start time by up to 30 minutes, which helps if you have multiple machines hitting the same backend.

Migrating to a New Backend

Your needs change. Maybe you started on SFTP to a Raspberry Pi in your closet and now you want to move to S3-compatible storage (Backblaze B2 with S3 API, Wasabi, Cloudflare R2). Or you’re consolidating repos.

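The brute-force way to migrate is to mount the old repo and back its contents up into a new one:

# Mount old repo (runs in the background; wait until the mount is ready)
restic -r sftp:oldhost:/backup/restic mount /mnt/restic-old &

# Read everything from the old repo, back it up to the new (already initialized) one
restic -r s3:s3.us-west-004.backblazeb2.com/my-new-bucket backup /mnt/restic-old/snapshots

This re-reads and re-uploads all your data, and the old history ends up as directories inside a single new snapshot, so plan for bandwidth and time accordingly. The better alternative is restic copy, which was introduced in v0.10 (the --from-repo spelling used here needs v0.14 or newer):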

restic -r s3:s3.us-west-004.backblazeb2.com/my-new-bucket \
  copy \
  --from-repo sftp:oldhost:/backup/restic \
  --from-password-file /etc/restic/old-password

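copy transfers snapshots between repos without fully restoring them, which is much more efficient, and it preserves snapshot metadata. For deduplication to keep working between copied and newly backed-up data, initialize the new repo with the same chunker parameters as the old one (restic init --copy-chunker-params). Compressed data stays compressed as long as the destination is a format version 2 repo.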
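Putting the two steps together, a migration could be sketched like this. It assumes restic 0.14 or newer flag names, the example repo addresses used above, and that the destination password is supplied through RESTIC_PASSWORD_FILE or similar; migrate_repo is a hypothetical helper.

```shell
# migrate_repo
# Create the destination repo with the source's chunker parameters, then
# copy all snapshots across.
# Assumes restic 0.14+ (--from-repo, --from-password-file) and that the
# destination password is available in the environment.
migrate_repo() {
  src=sftp:oldhost:/backup/restic
  dst=s3:s3.us-west-004.backblazeb2.com/my-new-bucket
  restic -r "$dst" init --copy-chunker-params \
    --from-repo "$src" --from-password-file /etc/restic/old-password &&
  restic -r "$dst" copy \
    --from-repo "$src" --from-password-file /etc/restic/old-password
}
```

Run a restic check against the new repo before retiring the old one.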

Restic vs Kopia vs Borg: The Maintenance Angle

You’re using restic, but let’s briefly acknowledge the competition since you might be thinking about it.

Kopia handles garbage collection automatically in the background, so you don’t have a manual forget + prune cycle. It also has a UI and does maintenance-in-backup rather than as a separate step. If the restic maintenance ceremony annoys you, Kopia is worth a look. The tradeoff is a more complex mental model and a repo format that’s less widely supported by third-party tooling.

Borg uses borg prune + borg compact (compact was added later to actually free space after prune). Borgmatic, by the way, isn’t a successor: it’s a YAML-driven wrapper that orchestrates plain Borg. Borg has no native S3 backend (you bolt one on with rclone or a FUSE mount). The compact step trips up people who migrate from older Borg versions. Borg is often faster than restic for repos on local disk or reached over SSH. For remote cloud storage, restic generally wins.

For most home lab setups with cloud backends: stick with restic. The tooling is mature, the community is large, and v0.17+ has closed most of the performance gaps that used to make people reach for alternatives.

The Bottom Line

Restic is a great backup tool that will absolutely eat your storage budget if you ignore it for a year. The maintenance loop is not optional.

Run it in this order: forget (with your retention policy), prune (to actually free space), check --read-data-subset (to verify without downloading the world). Automate it with a systemd timer, ping healthchecks.io so you know when it fails, and set --append-only on your rest-server if ransomware protection matters to you (it should).

Upgrade to restic v0.17 or newer if you haven’t: faster prune, --repack-uncompressed, and better memory usage are all worth it.

Your 2 AM self will thank you for the 15 minutes you spent setting this up correctly today.
