Docker Healthcheck Patterns That Actually Work

By SumGuy 5 min read

Your container’s serving requests, but the health check thinks it’s dead. So your orchestrator kills it. Then it restarts. Then it checks again. Restart. Repeat.

Health checks matter. Let’s do them right.

The HEALTHCHECK Instruction

Every Dockerfile can define a health check:

HEALTHCHECK --interval=30s --timeout=3s --start-period=5s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1

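What each flag does:

--interval=30s: time between checks (default 30s).
--timeout=3s: how long a single check may run before it counts as a failure (default 30s).
--start-period=5s: startup grace period during which failures don’t count toward --retries (default 0s).
--retries=3: consecutive failures needed to mark the container unhealthy (default 3).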

After --retries consecutive failures, Docker marks the container as unhealthy. Note: Docker doesn’t automatically kill unhealthy containers. Swarm replaces unhealthy tasks; Kubernetes ignores HEALTHCHECK entirely and relies on its own liveness and readiness probes. For single containers, unhealthy just means “the status says so.”
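To make the status transitions concrete, here is a minimal Python sketch of the bookkeeping described above. It is a simplified model, not Docker's actual code: the `HealthState` class and its `record` method are hypothetical names, and the caller says whether the probe ran inside the start period.

```python
from dataclasses import dataclass


@dataclass
class HealthState:
    """Simplified model of a container's health status (hypothetical helper)."""

    retries: int = 3
    status: str = "starting"
    failing_streak: int = 0

    def record(self, exit_code: int, in_start_period: bool = False) -> str:
        """Apply one probe result and return the resulting status."""
        if exit_code == 0:
            # Any success resets the streak and marks the container healthy.
            self.failing_streak = 0
            self.status = "healthy"
        elif in_start_period and self.status == "starting":
            # Failures before the first success, inside the grace period,
            # do not count toward retries.
            pass
        else:
            self.failing_streak += 1
            if self.failing_streak >= self.retries:
                self.status = "unhealthy"
        return self.status
```

Feeding it one ignored startup failure, a success, and then three failures in a row walks the status from starting to healthy to unhealthy.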

Check Types: What Works

curl (Most Common)

HEALTHCHECK --interval=10s --timeout=2s --retries=2 \
CMD curl -f http://localhost:8080/health || exit 1

The -f flag exits nonzero if HTTP status is >= 400. Simple and effective.

Gotcha: curl must be installed in the image. Stock Alpine and most slim images (python:3-slim, node:20-alpine) don’t ship it, so either install it or use one of the options below.

wget (Lightweight)

HEALTHCHECK --interval=10s --timeout=2s --retries=2 \
CMD wget --quiet --tries=1 --spider http://localhost:8080/health || exit 1

BusyBox provides wget, so Alpine-based images have it without installing anything. --spider doesn’t download the body, just checks the status.

Native Tools (Best)

If your app binary has a built-in health-check subcommand, use it:

HEALTHCHECK --interval=10s --timeout=2s --retries=2 \
CMD ["/app/server", "health-check"]

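Or use the language runtime that is already in the image:

# Python: exit 0 only on HTTP 200; urlopen raises (nonzero exit) on 4xx/5xx and connection errors
HEALTHCHECK --interval=10s --timeout=2s --retries=2 \
CMD python -c "import sys, urllib.request; sys.exit(0 if urllib.request.urlopen('http://localhost:8080/health', timeout=2).getcode() == 200 else 1)"
# Node.js: exit explicitly so an unread response can't keep the process alive
HEALTHCHECK --interval=10s --timeout=2s --retries=2 \
CMD node -e "require('http').get('http://localhost:8080/health', (r) => process.exit(r.statusCode === 200 ? 0 : 1)).on('error', () => process.exit(1))"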
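One-liners get hard to read fast. If Python is already in the image, the same check can live in a small script instead. This is a sketch under assumptions: the `probe` function name and the idea of a dedicated health check script are illustrative, not something Docker requires.

```python
import http.client
import urllib.error
import urllib.request

# Ignore HTTP_PROXY-style environment variables: a health check should
# talk to the local app directly, not through a proxy.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def probe(url: str, timeout: float = 2.0) -> int:
    """Return 0 if `url` answers with HTTP 200, otherwise 1."""
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return 0 if response.status == 200 else 1
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # 4xx/5xx responses raise HTTPError (a URLError subclass); refused
        # connections, DNS failures, and timeouts land here as well.
        return 1
```

A script built on this would end with `sys.exit(probe("http://localhost:8080/health"))` and be wired up with an exec-form `HEALTHCHECK CMD ["python", "/path/to/script.py"]`, where the path is whatever your image uses.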

nc (netcat) for TCP

Just checking if a port is open:

HEALTHCHECK --interval=10s --timeout=2s --retries=2 \
CMD nc -z localhost 8080 || exit 1

Works but doesn’t verify the app actually works, just that the port is listening.
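If nc isn't in the image but Python is, the same port-level check is a few lines of standard library. A minimal sketch, with `port_is_open` as a hypothetical helper name; like nc -z, it only proves something is accepting connections.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        # create_connection performs the TCP handshake; closing the socket
        # right away mirrors what `nc -z` does.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out.
        return False
```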

Good vs Bad Health Checks

Bad: Checking too much

# Don't do this
HEALTHCHECK CMD curl http://localhost/users && curl http://localhost/products && curl http://localhost/orders

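If one endpoint is slow, the whole check times out and the container is marked unhealthy. Without -f, curl also exits 0 on a 500 response, so this check misses real failures too. Overkill.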

Good: Simple and focused

HEALTHCHECK --interval=10s --timeout=2s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1

One lightweight endpoint that returns quickly.

Bad: Checking external dependencies

# Don't do this
HEALTHCHECK CMD curl http://external-api.example.com/status || exit 1

If the external service is down, your container is flagged unhealthy and an orchestrator will restart it. That’s not a health issue, that’s a dependency issue.

Good: Check yourself

HEALTHCHECK --interval=10s --timeout=2s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1

The /health endpoint can check internal state (database connection, queues, etc.) without external calls.
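Here is one way the logic behind such an endpoint could look, sketched in Python. Everything named here is an assumption for illustration: `health_report` is a hypothetical helper, and the checks are plain callables you would replace with real probes of your own connection pool or queue.

```python
import json
from typing import Callable, Dict, Tuple


def health_report(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Run each named check and return (HTTP status code, JSON body)."""
    results = {}
    for name, check in checks.items():
        try:
            results[name] = bool(check())
        except Exception:
            # A check that blows up counts as a failure, not a crash.
            results[name] = False
    healthy = all(results.values())
    body = json.dumps(
        {"status": "ok" if healthy else "degraded", "checks": results}
    )
    # 503 makes `curl -f` exit nonzero, which is what HEALTHCHECK needs.
    return (200 if healthy else 503), body
```

The web framework's /health handler then only has to call this and send back the status and body. Keep each check fast, since the whole report has to finish inside --timeout.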

Tuning Parameters

Interval: How often to check?

Default 30s is usually fine.

Timeout: How long to wait?

Docker’s default is 30s, which is far longer than most HTTP endpoints need; 3s is usually plenty.

Start-period: Grace period on startup

This is critical. If checks start before the app is ready, you’ll see false failures.

# Java app with slow startup
HEALTHCHECK --interval=10s --timeout=5s --start-period=60s --retries=3 \
CMD curl -f http://localhost:8080/health || exit 1

Retries: How many failures trigger unhealthy?

Default is 3 consecutive failures; a single success resets the count.
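These three knobs multiply. As a rough sketch of the arithmetic, assuming each check starts one interval after the previous one finishes and every failing check runs all the way to its timeout, the worst-case delay before a hung app shows up as unhealthy is about retries × (interval + timeout). The function name is made up for illustration.

```python
def worst_case_detection_seconds(
    interval: float, timeout: float, retries: int
) -> float:
    """Approximate upper bound, in seconds, between an app hanging and
    the container being marked unhealthy (start period excluded)."""
    # Each failed probe can take up to `timeout`, and the next probe starts
    # `interval` after the previous one finishes.
    return retries * (interval + timeout)
```

With the Java settings above (interval 10s, timeout 5s, retries 3) that comes to 45 seconds. With Docker's defaults of 30s, 30s, and 3 it is three minutes, which is a good reason to set the flags explicitly.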

Using Health Status in Compose

depends_on with health checks:

docker-compose.yml
version: '3.8'
services:
  db:
    image: postgres:latest
    healthcheck:
      test: ["CMD", "pg_isready", "-U", "postgres"]
      interval: 10s
      timeout: 5s
      retries: 5
  api:
    image: myapi:latest
    depends_on:
      db:
        condition: service_healthy

Now the API won’t start until Postgres is actually ready, not just running.
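Conceptually, that condition is a polling loop: keep reading the dependency's health status, start the dependent once it says healthy, and give up if it turns unhealthy. The Python sketch below imitates that idea and is not Compose's implementation. `wait_until_healthy` and the injected `get_status` callable are hypothetical.

```python
import time
from typing import Callable


def wait_until_healthy(
    get_status: Callable[[], str],
    timeout: float = 60.0,
    poll: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `get_status` until it returns "healthy".

    Returns False if the status becomes "unhealthy" or `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status == "healthy":
            return True
        if status == "unhealthy" or time.monotonic() >= deadline:
            return False
        sleep(poll)
```

The same loop is handy in deploy scripts that need to block until a container is actually ready, with `get_status` reading the status from docker inspect.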

Checking Health Status

docker ps --format='table {{.Names}}\t{{.Status}}'
# NAMES   STATUS
# api     Up 2 minutes (healthy)
# db      Up 2 minutes (unhealthy)

# docker inspect prints a JSON array, so index the first element
docker inspect mycontainer | jq '.[0].State.Health'
# {
#   "Status": "healthy",
#   "FailingStreak": 0,
#   "Log": [
#     {
#       "Start": "2025-02-26T15:30:00.123Z",
#       "End": "2025-02-26T15:30:01.456Z",
#       "ExitCode": 0,
#       "Output": ""
#     }
#   ]
# }
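When a container is flapping, the useful part of that JSON is the output of the most recent failed probe. Docker keeps the recent probe history under the Log key of State.Health. The sketch below pulls out the essentials; `summarize_health` is a hypothetical helper, and returning "none" for a container without a health check is this sketch's own convention.

```python
import json
from typing import Optional, Tuple


def summarize_health(health_json: str) -> Tuple[str, int, Optional[str]]:
    """Return (status, failing streak, output of the latest failed probe)."""
    health = json.loads(health_json)
    if health is None:
        # Containers without a HEALTHCHECK have no State.Health object.
        return "none", 0, None
    log = health.get("Log") or []
    failures = [entry for entry in log if entry.get("ExitCode") != 0]
    last_failure = failures[-1].get("Output", "").strip() if failures else None
    return health["Status"], health["FailingStreak"], last_failure
```

Feed it the text produced by `docker inspect --format '{{json .State.Health}}' mycontainer`.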

A Real Example: Node.js API

FROM node:20-alpine
WORKDIR /app
COPY package*.json ./
# --omit=dev replaces the deprecated --only=production
RUN npm ci --omit=dev
COPY . .
# node:20-alpine has no curl, so probe the app's /health endpoint with BusyBox wget
HEALTHCHECK --interval=15s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --quiet --spider http://localhost:3000/health || exit 1
EXPOSE 3000
CMD ["node", "server.js"]

In your Node app, include a simple health endpoint:

app.get('/health', (req, res) => {
  res.status(200).json({ status: 'ok' });
});

In docker-compose:

docker-compose.yml
services:
  api:
    build: .
    depends_on:
      redis:
        condition: service_healthy
    environment:
      REDIS_HOST: redis
  redis:
    image: redis:alpine
    healthcheck:
      test: ["CMD", "redis-cli", "ping"]
      interval: 10s
      timeout: 2s
      retries: 3

Common Mistakes

No start-period: Container fails health checks while booting. Add --start-period.

Timeout too short: Check always times out. Increase --timeout.

Checking external services: Container gets flagged unhealthy when your ISP hiccups. Check yourself only.

Not implementing an endpoint: If your app doesn’t have /health, add one. It’s 3 lines of code.

Health checks are your first line of defense. Get them right, and you won’t wake up to a page at 2 AM because your container got restarted for no reason.

