The Compression Triangle Nobody Wanted
For most of the 2010s, Linux compression was a depressing trade-off between three bad options.
gzip — Fast enough to not hate your life, but the compression ratio is basically “well, it’s smaller than before.” You’ll get maybe 60-65% of the original size on typical data. The upside: every system on Earth has gunzip. You will never be stranded.
bzip2 — Better ratio than gzip. Also 3-5x slower. A bzip2 backup job on a big dataset is the kind of thing you start before lunch and check on after your afternoon standup. If the ratio gain was worth the wait… honestly, usually it wasn’t.
xz — The nuclear option. Excellent compression ratios, approaching the theoretical ceiling for general-purpose compression. Also painfully, agonizingly slow. We’re talking single-threaded, CPU-maxed, “maybe tomorrow” territory for large datasets. You used it for distribution tarballs and release artifacts — things you compress once and decompress occasionally. Never for anything time-sensitive.
Every choice was a compromise. Then 2016 happened.
Facebook Dropped a Gift on the World
Yann Collet and his team at Facebook released Zstandard (zstd) in 2016, and it basically embarrassed every other algorithm. The claim seemed too good to be true: compression speed matching gzip, with ratios approaching xz.
It wasn’t too good to be true.
Here’s a real benchmark on a 500MB mixed dataset (log files, JSON, some binaries):
| Algorithm | Compress Time | Decompress Time | Size (% of original) |
|---|---|---|---|
| gzip | 8.2s | 2.1s | 64.2% |
| bzip2 | 38.4s | 14.6s | 58.1% |
| xz | 142s | 8.3s | 48.9% |
| zstd -3 | 7.1s | 1.4s | 58.8% |
| zstd -19 | 89s | 1.5s | 49.3% |
Read that again. zstd -3 compresses and decompresses faster than gzip, with a ratio within a point of bzip2’s. And zstd -19 gets you near-xz territory in about 60% of the time.
Decompression speed across all zstd levels is essentially flat. You pay the cost at compress time, not decompress time. That’s a huge deal for anything you compress once and restore under pressure.
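You can see this for yourself with zstd’s built-in benchmark mode: `-bN -eM` benchmarks every level from N through M on a file. A quick sketch, using a throwaway sample file generated on the spot:

```bash
# Generate a small sample, then benchmark levels 3 through 19 on it.
# Compression speed drops as the level climbs; decompression speed barely moves.
seq 1 200000 > /tmp/sample.txt
zstd -b3 -e19 /tmp/sample.txt
```

Each output row shows the level, ratio, compression speed, and decompression speed, so the flat decompression curve is right there in the last column.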
Who Already Made the Switch
You’re probably already using zstd without realizing it.
The Linux kernel supports zstd for module compression, and distros that used to ship .ko.xz modules now ship .ko.zst. Boot times improved. Nobody complained.
Docker and containerd support zstd for image layer compression, and modern registries accept it. Pulls are faster.
Arch Linux ships its packages as .pkg.tar.zst, and Fedora’s RPM payloads and Ubuntu’s .deb packages moved to zstd too. Your package installs got a quiet speedup.
Facebook’s internal infrastructure — where this thing was born — runs zstd pervasively across their data pipeline. When your compression library comes from people dealing with petabyte-scale data, it’s been tested.
Btrfs and ZFS both support zstd as a transparent filesystem compression algorithm, and it outperforms lzo and zlib on basically every workload.
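If you want to try that at the filesystem level yourself, the knobs look roughly like this. All paths, mount points, and pool names below are placeholders for your own setup, and everything needs root:

```bash
# Btrfs: transparent zstd at mount time (level 3 here; path is a placeholder)
sudo mount -o remount,compress=zstd:3 /mnt/data

# Btrfs: per-directory, applies to files written from now on
sudo btrfs property set /mnt/data/logs compression zstd

# ZFS (OpenZFS >= 2.0): per-dataset (pool/dataset name is a placeholder)
sudo zfs set compression=zstd tank/backups
```

Existing files aren’t recompressed automatically; on Btrfs you can force it with a defragment pass if you care.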
Basic Usage — It’s Exactly What You’d Expect
```bash
# Compress a file (creates file.txt.zst, keeps the original)
zstd file.txt

# Decompress
unzstd file.txt.zst
# or: zstd -d file.txt.zst

# Compress and remove the original
zstd --rm file.txt

# Decompress to stdout
zstd -dc file.txt.zst | less
```

The .zst extension. Clean, unambiguous. No legacy baggage.
Level Tuning — This Is Where It Gets Interesting
zstd has 22 compression levels. Default is 3.
```bash
# Default (level 3) — gzip speed, better ratio
zstd file.tar

# Faster than default, slightly worse ratio
zstd -1 file.tar

# Middle ground
zstd -9 file.tar

# High compression (slower)
zstd -19 file.tar

# Ludicrous mode
zstd --ultra -22 file.tar
```

For most backup and transfer use cases, level 3 is the right answer. You’re already beating gzip’s ratio at gzip’s speed. Levels 1-3 are everyday drivers. Levels 10-19 are for “I have time and I want the ratio.” Level 22 is for when you enjoy suffering slightly.
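The honest way to pick a level is to measure it on your own data. A rough sketch, using a throwaway sample file (paths are placeholders):

```bash
# Generate ~1.4 MB of repetitive text, then compare output size per level.
seq 1 200000 > /tmp/sample.txt

for level in 1 3 9 19; do
    zstd -q -f -$level /tmp/sample.txt -o /tmp/sample.$level.zst
    printf 'level %-2s: %s bytes\n' "$level" "$(wc -c < /tmp/sample.$level.zst)"
done
```

Run it against a representative chunk of your real data instead of `seq` output, and the right level for your workload usually becomes obvious.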
Parallel Compression — The Part bzip2 Wishes It Had
This is the one that genuinely changes the math on large datasets:
```bash
# Use all available CPU cores
zstd -T0 bigarchive.tar

# Use 4 threads explicitly
zstd -T4 bigarchive.tar

# High compression + all cores
zstd -19 -T0 massive.tar
```

-T0 detects and uses all logical cores automatically. On an 8-core machine, you’re potentially 6-7x faster on CPU-bound workloads. bzip2’s parallel version (pbzip2) exists, but it’s a separate binary and not universally available; xz eventually grew its own -T flag, and before that you needed third-party tools like pixz. zstd just… does it natively, out of the box.
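A quick way to feel the difference on your own machine. The file paths are throwaways, and the timings will of course vary with your core count:

```bash
# Same input, one thread vs. all threads.
seq 1 3000000 > /tmp/big.txt   # roughly 20 MB of text

time zstd -15 -T1 -f -q /tmp/big.txt -o /tmp/single.zst
time zstd -15 -T0 -f -q /tmp/big.txt -o /tmp/multi.zst

# Both archives decompress to identical data; the multi-threaded one may be
# a few bytes larger because the input is split into independent jobs.
```

The speedup grows with the level: at -1 the work is too cheap to parallelize well, while at -15 or -19 the threads have plenty to chew on.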
Every time I see someone running a bzip2 backup job in 2026 I want to send them this article.
tar + zstd — The Archive You Actually Want
tar got native zstd support in version 1.31 (2019). Most modern Linux systems have it.
```bash
# Create an archive
tar --zstd -cvf archive.tar.zst directory/

# Extract
tar --zstd -xvf archive.tar.zst

# Alternative: pipe to zstd manually (works with older tar)
tar -cvf - directory/ | zstd -T0 > archive.tar.zst

# Extract the manual way
zstd -dc archive.tar.zst | tar -xvf -
```
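If you want to convince yourself the archive survives a full cycle, a throwaway round-trip check looks like this (assumes a tar with --zstd; all paths are placeholders):

```bash
# Build a tiny tree, archive it, extract it elsewhere, and compare.
mkdir -p /tmp/rt/src /tmp/rt/restore
echo "hello zstd" > /tmp/rt/src/a.txt
tar --zstd -cf /tmp/rt/src.tar.zst -C /tmp/rt src
tar --zstd -xf /tmp/rt/src.tar.zst -C /tmp/rt/restore
diff -r /tmp/rt/src /tmp/rt/restore/src && echo "round-trip OK"
```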
```bash
# Check that your tar supports --zstd
tar --version
```

If --zstd isn’t available on your system, the pipe approach works everywhere. The -I flag lets you specify the compression program:
```bash
tar -I zstd -cvf archive.tar.zst directory/
tar -I "zstd -19 -T0" -cvf archive.tar.zst directory/
```

Streaming — Ship Data Faster
This is where zstd’s decompression speed shines. Compress on one end, decompress on the other, minimal CPU overhead at the destination:
```bash
# Copy a large file over SSH with compression
zstd -c bigfile | ssh remotehost "zstd -d > /destination/bigfile"

# Database dump over the wire
pg_dump mydb | zstd | ssh remotehost "zstd -d | psql targetdb"

# Or with tar, streaming a directory
tar -cvf - /data/dir | zstd -T0 | ssh remotehost "zstd -d | tar -xvf - -C /restore/"
```

The fast decompress speed means the receiving end barely notices the CPU cost. You’re essentially getting compression for free on the destination side.
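One more flag worth knowing for these pipelines: zstd’s --adapt mode (available since roughly v1.3.6) adjusts the compression level on the fly based on how fast the pipe is draining, so a fast link gets light compression and a slow one gets more. A purely local sketch of the round trip:

```bash
# --adapt chooses the level dynamically from pipe backpressure.
# Locally this just round-trips; over ssh it tunes itself to the link.
seq 1 1000000 | zstd --adapt -q | zstd -d -q | tail -n 1   # prints 1000000
```

For the SSH examples above, swapping `zstd -T0` for `zstd --adapt -T0` is a reasonable default when you don’t know the network ahead of time.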
When to Still Use the Old Formats
Look, I’m not here to tell you to rewrite all your scripts tonight. Here’s the honest decision table:
| Use Case | Use This | Why |
|---|---|---|
| Everyday backups | zstd -3 | Fast, good ratio, parallel |
| Maximum archival compression | zstd -19 or xz | Both excellent, zstd faster to decompress |
| HTTP transfer encoding | gzip | Universal browser/server support |
| Log files you rotate | zstd | Fast enough to not add cron lag |
| Distribution tarballs | xz | Universal expectation, tooling |
| Sharing with random external systems | gzip | Every system has gunzip, zero risk |
| Filesystem-level compression (Btrfs/ZFS) | zstd | Built in, best speed/ratio balance |
| Docker images | zstd | Supported by modern registries and runtimes |
Keep using gzip when: compatibility with old or external systems is a hard requirement. HTTP Content-Encoding: gzip is the lingua franca of the web. That’s not changing.
Keep using xz when: you’re making a tarball that’ll go on a distro mirror and you need to hit the smallest possible size. One-time compress, rare decompress — xz’s pain point disappears.
Stop using bzip2 when: now. Today. Unless something explicitly requires it. There is no scenario where bzip2 beats zstd on both ratio and speed simultaneously. It doesn’t exist.
One Install, Zero Regrets
```bash
# Debian/Ubuntu
sudo apt install zstd

# RHEL/Fedora/Rocky
sudo dnf install zstd

# Arch
sudo pacman -S zstd

# Check the version
zstd --version
```

Most modern distros already have it. You might already be done.
The compression landscape finally has a sensible default. zstd didn’t win by being flashy — it won by solving the actual problem: nobody wants to choose between speed and ratio. Facebook’s engineers said “what if you didn’t have to?” and then proved it.
Your 2 AM restore job will thank you.