The Existential Crisis of Container Data
Here’s a fun experiment: spin up a Postgres container, create a database, insert some rows, feel good about yourself, then stop the container and run docker rm on it. Congratulations — your data is gone. Evaporated. Reduced to atoms. That table you spent twenty minutes designing? Never existed. Docker containers are like goldfish with amnesia: the moment they’re removed, everything they knew goes with them.
This is by design, of course. Containers are supposed to be ephemeral, disposable, cattle-not-pets. But your data? Your data is very much a pet. You named it. You fed it migrations. You’d be devastated if it disappeared.
So Docker gives you a few ways to make data outlive the container that created it. The problem is, there are multiple ways to do this, they behave differently, and picking the wrong one will have you debugging permissions at 2 AM while questioning every decision that led you to this career.
Let’s sort it out.
The Four Horsemen of Docker Storage
Docker provides four storage mechanisms for getting data in and out of containers:
- Named Volumes — Docker manages the storage, you manage the name
- Anonymous Volumes — Docker manages everything, including the incomprehensible name
- Bind Mounts — You point directly at a folder on your host machine
- tmpfs Mounts — Data lives in memory and vanishes when the container stops
Each has its place, and each has its “gotcha” moments. Let’s break them down.
Named Volumes: The Responsible Adult
Named volumes are Docker’s recommended way to persist data. You give the volume a name, Docker stores the data somewhere on the host filesystem (usually /var/lib/docker/volumes/ on Linux), and you don’t have to think about exactly where. It’s like a storage unit — you rent the space, you get a key, and you don’t need to know the building’s floor plan.
Creating and Using Named Volumes
# Create a named volume
docker volume create my-postgres-data
# Use it in a container
docker run -d \
--name my-db \
-v my-postgres-data:/var/lib/postgresql/data \
-e POSTGRES_PASSWORD=supersecret \
postgres:16
Or in Docker Compose, which is how most sane people do it:
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: supersecret

volumes:
  pgdata:
That volumes: section at the bottom is important. It declares pgdata as a named volume. Without it, Docker Compose would create an anonymous volume, and you’d be back to playing “where did my data go?”
Why Named Volumes Are Great
- Docker manages the lifecycle. You don’t need to worry about host paths or directory structures.
- They work across platforms. Same Compose file works on Linux, Mac, and Windows without path headaches.
- They survive docker compose down. Your data sticks around until you explicitly remove it with docker compose down -v or docker volume rm.
- They can be shared between containers. Multiple containers can mount the same volume simultaneously.
- Permissions are usually handled for you. Docker sets up the volume with the right ownership for the container’s process.
Inspecting Named Volumes
Want to know where Docker actually put your stuff?
# List all volumes
docker volume ls
# Get the details
docker volume inspect pgdata
That’ll spit out something like:
[
  {
    "CreatedAt": "2026-04-01T10:30:00Z",
    "Driver": "local",
    "Labels": {},
    "Mountpoint": "/var/lib/docker/volumes/pgdata/_data",
    "Name": "pgdata",
    "Options": {},
    "Scope": "local"
  }
]
The Mountpoint is where the data physically lives on disk. On Linux, you can ls that path directly (with sudo). On Mac and Windows, it’s inside the Docker Desktop VM, so you can’t just browse to it — which is actually a feature, not a bug, because it stops you from accidentally rm -rf-ing your production database while trying to free up disk space.
Anonymous Volumes: The Mysterious Stranger
Anonymous volumes are what you get when you specify a volume mount without giving it a name, or when a Dockerfile has a VOLUME instruction.
# This creates an anonymous volume
docker run -d -v /var/lib/postgresql/data postgres:16
Docker creates a volume with a 64-character hexadecimal name like a1b2c3d4e5f60718293a4b5c6d7e8f90a1b2c3d4e5f60718293a4b5c6d7e8f90. Good luck finding that in your volume list six months from now.
When Anonymous Volumes Make Sense
Honestly? Almost never on purpose. They exist primarily because:
- Some Dockerfiles declare VOLUME instructions, and Docker dutifully creates anonymous volumes when you run them.
- You’re doing a quick throwaway test and don’t care about the data.
The problem is that anonymous volumes pile up like junk mail. Every time you docker run a container with an anonymous volume and then remove it, the volume stays behind. Your disk fills up. You run docker system df and discover 47 GB of orphaned volumes. It’s the Docker equivalent of never cleaning out your garage.
# See the damage
docker system df
# Clean up dangling volumes (ones not attached to any container)
docker volume prune
# Nuclear option: remove ALL unused volumes
docker volume prune -a
Pro tip: If a Dockerfile you’re using declares VOLUME and you want to override it with a named volume, just explicitly mount a named volume to that same path. Your named volume wins.
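Concretely, that override looks like this in Compose (a sketch — myapp and its VOLUME path /var/lib/myapp are hypothetical stand-ins):

```yaml
services:
  app:
    # Hypothetical image whose Dockerfile declares: VOLUME /var/lib/myapp
    image: myapp:latest
    volumes:
      # Mounting a named volume at the same path overrides the
      # anonymous volume the VOLUME instruction would have created.
      - appdata:/var/lib/myapp

volumes:
  appdata:
```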
Bind Mounts: The Direct Line
Bind mounts are the oldest and most straightforward approach. You pick a directory on your host, you pick a directory in the container, and Docker makes them the same directory. Whatever changes on one side shows up on the other. It’s like a portal between your host and the container.
# Bind mount your current directory into the container
docker run -d \
-v $(pwd)/app:/usr/src/app \
-w /usr/src/app \
node:20 \
npm start
Or in Compose:
services:
  web:
    build: .
    volumes:
      - ./src:/app/src
      - ./config/nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "3000:3000"
Notice the ./ prefix on those paths? That’s the telltale sign of a bind mount in Compose. If it starts with . or /, it’s a bind mount. If it’s just a name like pgdata, it’s a volume.
The Long-Form Syntax
Docker Compose also supports a more explicit syntax that’s easier to read and less ambiguous:
services:
  web:
    build: .
    volumes:
      - type: bind
        source: ./src
        target: /app/src
      - type: bind
        source: ./config/nginx.conf
        target: /etc/nginx/nginx.conf
        read_only: true
More verbose, but zero confusion about what’s a volume and what’s a bind mount. Your future self will thank you.
When Bind Mounts Shine
Development workflows. This is their killer feature. You edit code on your host in VS Code, and the changes instantly appear inside the container. No rebuild needed. Pair this with a file watcher like nodemon or a hot-reload framework, and you’ve got a development experience that’s actually pleasant.
services:
  dev:
    build:
      context: .
      dockerfile: Dockerfile.dev
    volumes:
      - ./src:/app/src
      - ./package.json:/app/package.json
      - /app/node_modules # Anonymous volume trick -- see below
    ports:
      - "3000:3000"
    command: npm run dev
Configuration files. Need to inject a custom nginx.conf or my.cnf without building a whole new image? Bind mount it:
services:
  proxy:
    image: nginx:alpine
    volumes:
      - ./nginx/nginx.conf:/etc/nginx/nginx.conf:ro
      - ./nginx/certs:/etc/nginx/certs:ro
    ports:
      - "443:443"
That :ro at the end means read-only. The container can read the file but can’t modify it. Good practice for config files — you don’t want nginx deciding to rewrite its own config at runtime.
The node_modules Trick
If you’ve ever bind-mounted a Node.js project into a container, you’ve probably hit this: the container installs its own node_modules (built for Linux), then your host’s node_modules (built for Mac) overwrites them through the bind mount, and everything explodes.
The fix is a pattern so common it should have its own name:
volumes:
  - ./:/app            # Bind mount the whole project
  - /app/node_modules  # Anonymous volume masks host's node_modules
That anonymous volume for /app/node_modules acts like a shield. The container’s node_modules live in the anonymous volume, isolated from the host’s version. It’s a hack, but it’s a hack that works.
The Permission Nightmare
Here’s where bind mounts get spicy. When you bind mount a directory, the container sees the files with the same UID/GID as on the host. If your host user is UID 1000 but the container process runs as UID 999, you get permission denied errors that make you want to flip your desk.
Common symptoms:
- Your app can’t write to the mounted directory
- Log files are owned by root on your host after the container writes them
- Database containers refuse to start because they can’t write to their data directory
Common fixes:
# In your Dockerfile, create a user with a matching UID
RUN groupadd -g 1000 appuser && \
    useradd -u 1000 -g appuser appuser
USER appuser
Or, in your Compose file:
services:
  app:
    build: .
    user: "1000:1000"
    volumes:
      - ./data:/app/data
Or, the “I give up on permissions” approach (don’t do this in production):
chmod -R 777 ./data # The permissions equivalent of leaving your front door open
Seriously though, UID mapping is one of the most common sources of Docker frustration, and it’s almost always a bind mount issue. Named volumes handle this gracefully because Docker sets the ownership when the volume is first created.
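If you want the container to run as whoever you are on the host, one minimal sketch (assuming a Linux host; myapp:latest is a placeholder image name) is to capture your UID/GID and pass them to docker run:

```shell
# Capture the host user's UID and GID so files the container writes stay yours.
HOST_UID="$(id -u)"
HOST_GID="$(id -g)"

# Run the container as that user (myapp:latest is a placeholder image).
# The command is echoed here as a dry run; remove `echo` to execute it.
echo docker run --rm \
  --user "${HOST_UID}:${HOST_GID}" \
  -v "$(pwd)/data:/app/data" \
  myapp:latest
```

The same values can be fed into Compose via the user: key shown above.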
tmpfs Mounts: The Ghost Data
tmpfs mounts store data in the host’s memory (RAM). When the container stops, the data vanishes. It’s like writing on a whiteboard — useful while it lasts, gone the moment someone bumps the eraser.
services:
  app:
    image: myapp:latest
    tmpfs:
      - /tmp
      - /run
Or with more control:
services:
  app:
    image: myapp:latest
    volumes:
      - type: tmpfs
        target: /app/cache
        tmpfs:
          size: 100000000 # 100 MB limit
When to Use tmpfs
- Sensitive data that shouldn’t be written to disk (tokens, session data, temporary secrets)
- Cache directories where speed matters and persistence doesn’t
- Scratch space for processing that doesn’t need to survive a restart
- Security-conscious workloads where you don’t want data lingering on disk after the container exits
tmpfs is also slightly faster than writing to a volume or bind mount because it bypasses the disk entirely. If your app does heavy I/O on temporary files, tmpfs can be a nice performance boost.
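You can poke at the same mechanism outside Docker: on most Linux systems /dev/shm is a tmpfs mount, so this quick sketch (Linux-only) shows RAM-backed files in action:

```shell
# /dev/shm is a tmpfs mount on most Linux systems: writes land in RAM, not on disk.
df -t tmpfs /dev/shm

# Files here behave like data in a container's tmpfs mount:
# fast to write, and gone after a reboot.
echo "scratch data" > /dev/shm/demo.txt
cat /dev/shm/demo.txt
rm /dev/shm/demo.txt
```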
The Comparison Table
Let’s put it all together:
| Feature | Named Volume | Anonymous Volume | Bind Mount | tmpfs |
|---|---|---|---|---|
| Data persists after container removal | Yes | Yes (but good luck finding it) | Yes | No |
| Easy to back up | Yes | Not really | Yes | N/A |
| Sharable between containers | Yes | Yes | Yes | No |
| Works in Compose declaratively | Yes | Sort of | Yes | Yes |
| Host path control | No (Docker decides) | No | Yes | N/A |
| Permission handling | Automatic | Automatic | Manual (pain) | Automatic |
| Performance on Mac/Windows | Good (native in VM) | Good | Slower (file sync overhead) | Fast (RAM) |
| Best for | Databases, persistent app data | Masking host directories | Dev workflows, config files | Caches, secrets, scratch space |
Real-World Scenarios
Scenario 1: Production Database
Named volume. Every time. No exceptions.
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD_FILE: /run/secrets/db_password
      POSTGRES_DB: production
    secrets:
      - db_password
    deploy:
      resources:
        limits:
          memory: 2G

volumes:
  pgdata:
    driver: local

secrets:
  db_password:
    file: ./secrets/db_password.txt
Why not a bind mount? Because permissions, because portability, because Docker handles the volume lifecycle, and because you really don’t want to accidentally rm -rf your production data directory while trying to clean up your home folder.
Scenario 2: Development Environment
Bind mounts for your code, named volumes for dependencies.
services:
  dev:
    build:
      context: .
      target: development
    volumes:
      - ./src:/app/src                  # Bind mount: live reload
      - ./tests:/app/tests              # Bind mount: edit tests locally
      - node_modules:/app/node_modules  # Named volume: container-native deps
      - ./docker/dev.env:/app/.env:ro   # Bind mount: config injection
    ports:
      - "3000:3000"
      - "9229:9229" # Debugger port
    command: npm run dev

volumes:
  node_modules:
This gives you the best of both worlds: live code reloading through bind mounts, with native (container-built) dependencies in a named volume that won’t conflict with your host machine.
Scenario 3: Shared Config Across Services
Bind mount the same config file into multiple containers:
services:
  app1:
    image: myapp:latest
    volumes:
      - ./config/shared.yml:/app/config/shared.yml:ro
  app2:
    image: myotherapp:latest
    volumes:
      - ./config/shared.yml:/app/config/shared.yml:ro
  worker:
    image: myworker:latest
    volumes:
      - ./config/shared.yml:/app/config/shared.yml:ro
One config file, three containers, zero drift. Change the config on the host, restart the containers, done.
Scenario 4: Processing Pipeline with Scratch Space
tmpfs for the intermediate files, named volume for the results:
services:
  processor:
    image: data-cruncher:latest
    volumes:
      - type: tmpfs
        target: /tmp/workspace
        tmpfs:
          size: 500000000 # 500 MB scratch space in RAM
      - results:/app/output

volumes:
  results:
The processor churns through data in RAM, writes final results to the named volume. Fast, clean, no leftover temporary files cluttering your disk.
Backup Strategies
This is the part everyone skips until it’s too late. Don’t be that person.
Backing Up Named Volumes
Docker doesn’t have a built-in docker volume backup command (would it kill them to add one?), but the pattern is well-established:
# Back up a named volume to a tar file
docker run --rm \
-v pgdata:/source:ro \
-v $(pwd)/backups:/backup \
alpine \
tar czf /backup/pgdata-$(date +%Y%m%d-%H%M%S).tar.gz -C /source .
This spins up a tiny Alpine container, mounts the volume read-only, and tars it to your host. Simple, reliable, scriptable.
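Wrapped in a small script, the same pattern works for any volume. This is a sketch following the article's tar approach — backup_volume.sh and the helper names are our own invention, and the backup directory should be an absolute path, since docker run -v needs one:

```shell
#!/bin/sh
# backup_volume.sh -- tar up a named Docker volume (a sketch, not an official tool).
# Usage: ./backup_volume.sh VOLUME_NAME /absolute/path/to/backups

# Build a timestamped archive name like pgdata-20260401-103000.tar.gz
backup_name() {
  printf '%s-%s.tar.gz' "$1" "$(date +%Y%m%d-%H%M%S)"
}

# Mount the volume read-only, mount the backup dir, and tar one into the other.
backup_volume() {
  volume="$1"
  backup_dir="$2"
  mkdir -p "$backup_dir"
  docker run --rm \
    -v "$volume:/source:ro" \
    -v "$backup_dir:/backup" \
    alpine \
    tar czf "/backup/$(backup_name "$volume")" -C /source .
}

# Only act when invoked with both arguments, so the functions can be sourced.
if [ "$#" -eq 2 ]; then
  backup_volume "$1" "$2"
fi
```

Drop it in cron and you have poor-man's scheduled backups without touching your Compose file.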
Restoring Named Volumes
# Create a fresh volume
docker volume create pgdata-restored
# Restore from backup
docker run --rm \
-v pgdata-restored:/target \
-v $(pwd)/backups:/backup:ro \
alpine \
sh -c "cd /target && tar xzf /backup/pgdata-20260401-103000.tar.gz"
Automating Backups with a Sidecar
For production, you probably want automated backups. Here’s a Compose service that backs up your database volume daily:
services:
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data

  db-backup:
    image: alpine:latest
    volumes:
      - pgdata:/source:ro
      - ./backups:/backup
    entrypoint: /bin/sh
    command: >
      -c 'while true; do
        tar czf /backup/pgdata-$$(date +%Y%m%d-%H%M%S).tar.gz -C /source .;
        echo "Backup completed at $$(date)";
        sleep 86400;
      done'
    restart: unless-stopped

volumes:
  pgdata:
For databases specifically, you’ll want to use the database’s native dump tool (pg_dump, mysqldump, etc.) instead of raw volume backups — that way you get consistent, point-in-time snapshots instead of potentially corrupted files from backing up a live database.
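As a sketch of that advice, here is the sidecar rewritten around pg_dump. The credentials and names mirror the earlier examples (a plain-text password is used purely to keep the sketch short — use secrets in real life), and running the sidecar from the same postgres:16 image keeps the pg_dump client in step with the server:

```yaml
services:
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: supersecret
      POSTGRES_DB: production
    volumes:
      - pgdata:/var/lib/postgresql/data

  db-dump:
    image: postgres:16 # same image, so pg_dump matches the server version
    volumes:
      - ./backups:/backup
    environment:
      PGPASSWORD: supersecret # libpq reads this; prefer secrets in production
    entrypoint: /bin/sh
    command: >
      -c 'while true; do
        pg_dump -h db -U app production > /backup/production-$$(date +%Y%m%d-%H%M%S).sql;
        sleep 86400;
      done'
    restart: unless-stopped

volumes:
  pgdata:
```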
Backing Up Bind Mounts
This is just… backing up files. Use rsync, use cp, use whatever backup solution you already have. The data is right there on your filesystem. That’s the whole point of bind mounts.
rsync -avz ./config/ /backup/config/
NFS Mounts: Sharing Volumes Across Hosts
When you need multiple Docker hosts to access the same volume — say, in a Swarm cluster or just across a few servers — NFS mounts are your friend. Docker’s volume driver system lets you mount an NFS share as a named volume.
volumes:
  shared-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.100,rw,nfsvers=4
      device: ":/exports/docker-data"
Then use shared-data in your services just like any other named volume:
services:
  web:
    image: myapp:latest
    volumes:
      - shared-data:/app/data

volumes:
  shared-data:
    driver: local
    driver_opts:
      type: nfs
      o: addr=192.168.1.100,rw,nfsvers=4
      device: ":/exports/docker-data"
NFS Gotchas
- Performance. NFS over a network will always be slower than local storage. Don’t put your database on NFS unless you enjoy watching queries crawl.
- Permissions. NFS has its own UID/GID mapping. If the NFS server and your Docker containers don’t agree on who UID 1000 is, you’re in for a bad time. Use no_root_squash cautiously, or better yet, set up proper UID mapping.
- Availability. If the NFS server goes down, every container mounted to it hangs. Make sure your NFS server is more reliable than the things depending on it.
- Security. By default, NFS traffic is unencrypted. If you’re going across untrusted networks, use NFSv4 with Kerberos, or tunnel through a VPN.
Dev vs Prod: A Tale of Two Configs
Here’s the golden rule: bind mounts in development, named volumes in production. Everything else is just details.
In development, you want fast feedback loops. Edit a file, see the change. Bind mounts give you that direct connection between your editor and the container.
In production, you want reliability, portability, and security. Named volumes give you all three. Your production server doesn’t need (and shouldn’t have) your source code bind-mounted into containers. It should be baked into the image.
Here’s a pattern using Compose overrides to manage this cleanly:
docker-compose.yml (base, used by both):
services:
  web:
    image: myapp:latest
    ports:
      - "3000:3000"
  db:
    image: postgres:16
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:
docker-compose.override.yml (automatically loaded in dev):
services:
  web:
    build: .
    volumes:
      - ./src:/app/src
      - ./package.json:/app/package.json
    environment:
      NODE_ENV: development
    command: npm run dev
docker-compose.prod.yml (explicitly loaded in production):
services:
  web:
    image: registry.example.com/myapp:${VERSION}
    environment:
      NODE_ENV: production
    deploy:
      replicas: 3
      resources:
        limits:
          memory: 512M
In dev, docker compose up automatically picks up both the base file and the override. In production:
docker compose -f docker-compose.yml -f docker-compose.prod.yml up -d
Clean separation. Same services. Different storage strategies.
Common Mistakes (and How to Avoid Them)
Mistake 1: Using docker compose down -v When You Didn’t Mean To
The -v flag removes named volumes. One stray flag and your database is toast. If you’re in the habit of typing docker compose down -v to “clean up,” stop. Use docker compose down (no -v) as your default, and only add -v when you genuinely want to nuke the data.
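If you want a guard rail, a small shell wrapper can make the destructive path opt-in. This is our own sketch, not a Docker feature, and REALLY_DELETE_VOLUMES is an arbitrary name:

```shell
# A guard rail for your shell profile: refuse `down -v` unless explicitly confirmed.
compose_down() {
  for arg in "$@"; do
    if [ "$arg" = "-v" ] || [ "$arg" = "--volumes" ]; then
      if [ "${REALLY_DELETE_VOLUMES:-}" != "yes" ]; then
        echo "Refusing to delete volumes. Set REALLY_DELETE_VOLUMES=yes to confirm." >&2
        return 1
      fi
    fi
  done
  docker compose down "$@"
}
```

Now a stray compose_down -v bounces off harmlessly, and you have to mean it to lose data.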
Mistake 2: Bind Mounting Over Important Directories
If you bind mount an empty host directory into a container path that has existing files, the host directory wins. The container’s files at that path just… disappear. This is a great way to accidentally empty out /etc/nginx/conf.d/ and spend an hour wondering why nginx won’t start.
Mistake 3: Ignoring Volume Driver Options
Named volumes aren’t limited to local storage. You can use volume drivers for cloud storage, distributed filesystems, and encrypted volumes. But the default local driver is fine for single-host setups.
Mistake 4: Not Labeling Your Volumes
In Compose, you can add labels to volumes for easier management:
volumes:
  pgdata:
    labels:
      com.example.project: "myapp"
      com.example.environment: "production"
      com.example.backup: "daily"
Future you will appreciate being able to filter volumes by label when you have forty of them and can’t remember which project a1b2c3d4_pgdata belongs to.
Mistake 5: Running Databases on Bind Mounts in Production
Just don’t. Named volumes exist for a reason. Bind mounts add permission complexity, path dependency, and one more thing to get wrong. For databases in production, named volumes are the way.
The TL;DR Decision Tree
Still not sure what to use? Here’s the cheat sheet:
- “I need to persist database data” — Named volume
- “I want live code reloading in dev” — Bind mount
- “I need to inject a config file” — Bind mount (read-only)
- “I have temporary data that shouldn’t hit disk” — tmpfs
- “I need to share data across Docker hosts” — Named volume with NFS driver
- “I need to back up container data” — Named volume (then tar it out)
- “I’m not sure” — Named volume. When in doubt, named volume.
Docker’s storage model is one of those things that’s simple on the surface but has enough depth to keep you learning for years. The good news is that for 90% of use cases, you only need to remember two things: named volumes for persistence, bind mounts for development. Everything else is just seasoning.
Now go forth and stop losing your data.