Private Docker Registry with Harbor

Docker Hub is fine, until it isn’t

You’re deep in a CI/CD run. The pipeline reaches for its base image. Docker Hub rate-limits the pull, again. Your deploy is blocked because a free-tier policy decided your IP looked too eager.

Or maybe you’re shipping proprietary software and the idea of your container images living on someone else’s servers makes your security team’s eye twitch. Or your images are 4 GB and pulling them across the internet on every deploy is burning time and money you don’t have.

Whatever brought you here, you’re about to set up a private container registry that doesn’t apologize for existing.

Enter Harbor.

Why not just `registry:2`?

You could spin up Docker’s official registry:2 image in about four minutes. It runs, it stores images, it technically does the job. It also has no UI, no real access control, no vulnerability scanning, and about as much operational visibility as a black box nailed to your server rack.

Harbor is what happens when someone looked at registry:2 and said “this needs to be a real product.” It’s a CNCF graduated project — not a weekend experiment — running in production at companies with actual uptime requirements.

Here’s what Harbor adds:

Feature	`registry:2`	Harbor
Web UI	No	Yes
RBAC	No	Yes
Vulnerability scanning	No	Yes (Trivy)
Image replication	No	Yes
Garbage collection	Manual	Scheduled + UI
Audit logs	No	Yes
Robot accounts	No	Yes
Helm charts	No UI (OCI artifacts only)	Yes (as OCI artifacts; ChartMuseum was removed in v2.8)

The operational difference is roughly the same as between sqlite3 at a terminal and a proper database with a dashboard. One is a tool, one is a platform.

Why self-host a registry at all?

Quick case for the skeptics:

Privacy. Proprietary code, baked-in configs, anything you’d rather not have on public infrastructure — it stays yours.

Performance. Pulling from your own network is dramatically faster than pulling from a remote registry. In a Kubernetes cluster that scales frequently, this compounds.

CI/CD speed. No more rate limits, no more waiting on Docker Hub’s infrastructure during your build rush hour. Your pipeline pulls what it needs, immediately.

Cost. Cloud registries charge for storage, egress, and sometimes per-image. Fixed-cost hardware you already own is just better math.

Compliance. Some industries have regulations about where data lives. “It’s on Docker Hub somewhere” is not an answer that satisfies auditors.

Deploying Harbor with Docker Compose

Harbor ships an official installer that generates a Compose stack for you. Download the offline installer — it bundles everything and doesn’t depend on Docker Hub during setup, which is ironic but practical.

wget https://github.com/goharbor/harbor/releases/download/v2.11.0/harbor-offline-installer-v2.11.0.tgz
tar xzvf harbor-offline-installer-v2.11.0.tgz
cd harbor
cp harbor.yml.tmpl harbor.yml

Edit harbor.yml before running anything:

hostname: registry.yourdomain.com

https:
  port: 443
  certificate: /etc/harbor/certs/registry.yourdomain.com.crt
  private_key: /etc/harbor/certs/registry.yourdomain.com.key

harbor_admin_password: ChangeThisNow

database:
  password: AlsoChangeThis

data_volume: /data/harbor

log:
  level: info
  local:
    rotate_count: 50
    rotate_size: 200M
    location: /var/log/harbor

The data_volume is where all your image layers live. Make sure that path has room — images add up fast and disk space surprises are never fun at 3 AM.
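One way to avoid the 3 AM surprise is a small check you can run from cron or your monitoring agent. This is a minimal sketch, not something Harbor ships: free_gb and check_data_volume are hypothetical helper names, and the 50 GiB threshold in the example is an arbitrary assumption you should size to your own images.

```shell
#!/bin/sh
# Sketch: warn when the filesystem holding Harbor's data_volume runs low.
# free_gb and check_data_volume are hypothetical helper names.

# Print the free space, in whole GiB, of the filesystem containing $1.
free_gb() {
  # -P forces one POSIX-format line per filesystem; -k reports 1024-byte blocks.
  df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Return non-zero (and print a warning) when free space is below $2 GiB.
check_data_volume() {
  path=$1
  min_gb=$2
  have_gb=$(free_gb "$path")
  # An empty result means df could not inspect the path.
  [ -n "$have_gb" ] || { echo "cannot read free space for ${path}" >&2; return 2; }
  if [ "$have_gb" -lt "$min_gb" ]; then
    echo "WARNING: only ${have_gb} GiB free on ${path} (want >= ${min_gb})" >&2
    return 1
  fi
  echo "OK: ${have_gb} GiB free on ${path}"
}

# Example: check_data_volume /data/harbor 50
```

The function only reports; what happens on a warning (an alert, a paused pipeline) is up to you.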

Run the installer with Trivy enabled:

sudo ./install.sh --with-trivy

That --with-trivy flag enables the vulnerability scanner. It takes a few minutes to load everything. When it’s done, Harbor is running as a Compose stack. Hit https://registry.yourdomain.com in your browser, log in as admin, and you’re in.

To bring it back up after a reboot:

cd /path/to/harbor
docker compose up -d

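Harbor generates its own Compose file from harbor.yml, so you don’t edit docker-compose.yml directly. To change a setting, stop the stack, edit harbor.yml, rerun sudo ./prepare --with-trivy, and start it again.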
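Before pointing Docker at the new registry, it can help to confirm that every component actually came up. The sketch below polls Harbor’s health endpoint (/api/v2.0/health) and assumes the response is JSON with a "status" field for the whole system and for each component. harbor_is_healthy and wait_for_harbor are hypothetical helper names, and the 24 tries at 5-second intervals are arbitrary.

```shell
#!/bin/sh
# Sketch: poll Harbor's health endpoint until every component reports healthy.
# harbor_is_healthy and wait_for_harbor are hypothetical helper names.

# Read a health JSON document on stdin; succeed only if at least one
# "status" is "healthy" and none is "unhealthy".
harbor_is_healthy() {
  body=$(cat)
  printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"healthy"' || return 1
  if printf '%s' "$body" | grep -Eq '"status"[[:space:]]*:[[:space:]]*"unhealthy"'; then
    return 1
  fi
  return 0
}

# Poll https://$1/api/v2.0/health up to $2 times (default 24), 5 seconds apart.
wait_for_harbor() {
  host=$1
  tries=${2:-24}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "https://${host}/api/v2.0/health" 2>/dev/null | harbor_is_healthy; then
      echo "Harbor is healthy"
      return 0
    fi
    i=$((i + 1))
    sleep 5
  done
  echo "Harbor did not become healthy in time" >&2
  return 1
}

# Example: wait_for_harbor registry.yourdomain.com
```

The parsing is deliberately crude (grep rather than a JSON parser) so the script has no dependencies beyond curl.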

Configuring Docker to use your registry

If you have a valid TLS cert from a recognized CA, Docker trusts it out of the box:

docker login registry.yourdomain.com
docker tag myapp:latest registry.yourdomain.com/myproject/myapp:latest
docker push registry.yourdomain.com/myproject/myapp:latest

For home lab setups with self-signed certs, install your CA on each Docker host:

sudo mkdir -p /etc/docker/certs.d/registry.yourdomain.com/
sudo cp ca.crt /etc/docker/certs.d/registry.yourdomain.com/
sudo systemctl restart docker
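If you have more than a couple of hosts, the same three commands are worth wrapping in a function your provisioning can call. A minimal sketch: install_registry_ca is a hypothetical helper, and its optional third argument (a root prefix) exists only so you can try it without touching /etc. One detail the commands above don’t show: when the registry listens on a non-default port, Docker expects the directory to be named host:port.

```shell
#!/bin/sh
# Sketch: install a registry CA certificate where the Docker daemon looks for it.
# install_registry_ca is a hypothetical helper; the optional third argument
# is a root prefix so the function can be tried outside of /etc.

install_registry_ca() {
  registry=$1        # host, or host:port when the registry is not on 443
  ca_file=$2
  root=${3:-}
  dest="${root}/etc/docker/certs.d/${registry}"

  [ -r "$ca_file" ] || { echo "cannot read ${ca_file}" >&2; return 1; }
  mkdir -p "$dest" && cp "$ca_file" "${dest}/ca.crt"
}

# Example (as root on each Docker host):
#   install_registry_ca registry.yourdomain.com ca.crt
```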

Or, if you truly can’t do TLS (lab-only, never production), tell Docker to treat it as an insecure registry by adding this to /etc/docker/daemon.json:

{
  "insecure-registries": ["registry.yourdomain.com"]
}

sudo systemctl restart docker

The insecure route works but don’t let that config drift into production machines. Certificate trust is the correct path — it just takes five more minutes up front.

RBAC: projects, users, and robot accounts

Harbor organizes images into projects. A project is a namespace — registry.yourdomain.com/myproject/myapp lives in myproject. Permissions are set per project.

The roles you’ll actually use:

Role	Capabilities
Guest	Pull only
Developer	Push and pull
Maintainer	Push, pull, delete tags
Project Admin	Full project control

For human users, create accounts in the UI (or wire up LDAP/OIDC for enterprise auth). For CI/CD pipelines, use robot accounts — they’re purpose-built for automation and don’t depend on any human’s credentials.

Create a robot account: Project → Robot Accounts → New Robot Account. Name it, set an expiry, choose permissions. You get back a username and a token you’ll only see once — store it in your CI secret manager immediately.

# CI/CD login with a robot account
# Note the dollar sign: it's part of the username format
# --password-stdin keeps the token out of the process list and shell history
echo "$HARBOR_ROBOT_TOKEN" | docker login registry.yourdomain.com \
  --username 'robot$myproject+ci-runner' \
  --password-stdin

That dollar sign in the username trips people up constantly in shell scripts and YAML. Single-quote the username whenever possible.
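To see exactly what goes wrong, here is a tiny demonstration you can paste into any POSIX shell. The variable names are only for illustration; nothing here talks to Harbor.

```shell
#!/bin/sh
# Sketch: how the shell treats the dollar sign in a robot account name.
# The variable names here are only for illustration.
unset myproject   # make sure $myproject really is unset for this demo

single='robot$myproject+ci-runner'     # single quotes: no expansion
escaped="robot\$myproject+ci-runner"   # double quotes with an escaped $
broken="robot$myproject+ci-runner"     # double quotes: $myproject expands to ""

printf 'single : %s\n' "$single"
printf 'escaped: %s\n' "$escaped"
printf 'broken : %s\n' "$broken"
# single : robot$myproject+ci-runner
# escaped: robot$myproject+ci-runner
# broken : robot+ci-runner
```

The broken form is the one that produces a confusing "unauthorized" from docker login, because the username Harbor receives is no longer the one you created.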

Scope robot accounts to exactly what they need. Your build pipeline needs push access to one project — give it that. Nothing more.

Vulnerability scanning with Trivy

If you installed with --with-trivy, Harbor can scan your images and show a vulnerability report per tag in the UI: Critical, High, Medium, Low, with CVE IDs, affected packages, and whether a fix is available.

To scan every image as it arrives, enable automatic scanning: Project → Configuration → “Automatically scan images on push.” Toggle it on.

To block pulls of vulnerable images: Project → Configuration → “Prevent vulnerable images from running,” set a threshold. Set it to Critical at minimum — anything with a known critical CVE shouldn’t be deployable without a conscious decision to override it.

You can also trigger scans via the API, which is useful for gating CI:

curl -u "robot\$myproject+ci-runner:$HARBOR_TOKEN" \
  -X POST \
  "https://registry.yourdomain.com/api/v2.0/projects/myproject/repositories/myapp/artifacts/latest/scan"

Even just having the scan data visible is valuable without enforcement. You push an image, check Harbor, see “12 medium CVEs in the base OS layer” — that’s actionable next sprint. The data is there. Use it.
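If you do want enforcement in CI, a gate can be as small as the sketch below. It fetches the artifact with with_scan_overview=true and fails when the overall severity is Critical. Treat the details as assumptions to verify against your Harbor’s API docs: the "scan_status" and "severity" field names and the "Success" value reflect my understanding of the v2.0 artifact response. scan_is_acceptable, gate_on_scan, and the HARBOR_USER variable are hypothetical names.

```shell
#!/bin/sh
# Sketch: fail a CI job when Harbor reports a Critical scan severity.
# scan_is_acceptable and gate_on_scan are hypothetical helpers. The
# "scan_status" and "severity" field names are assumptions about the
# artifact response and should be checked against your Harbor's API docs.

# Read artifact JSON on stdin. Succeed only if a scan finished successfully
# and its overall severity is not Critical.
scan_is_acceptable() {
  body=$(cat)
  printf '%s' "$body" | grep -Eq '"scan_status"[[:space:]]*:[[:space:]]*"Success"' || {
    echo "scan not finished (or failed)" >&2
    return 2
  }
  if printf '%s' "$body" | grep -Eq '"severity"[[:space:]]*:[[:space:]]*"Critical"'; then
    echo "Critical vulnerabilities found" >&2
    return 1
  fi
  return 0
}

# Fetch the artifact with its scan overview and apply the check above.
# HARBOR_USER and HARBOR_TOKEN are expected in the environment.
gate_on_scan() {
  project=$1; repo=$2; ref=$3
  curl -fsS -u "${HARBOR_USER}:${HARBOR_TOKEN}" \
    "https://registry.yourdomain.com/api/v2.0/projects/${project}/repositories/${repo}/artifacts/${ref}?with_scan_overview=true" \
    | scan_is_acceptable
}

# Example:
#   HARBOR_USER='robot$myproject+ci-runner' gate_on_scan myproject myapp latest
```

Exit code 2 (scan still running) is kept separate from 1 (Critical found) so a pipeline can retry the former and fail hard on the latter.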

Image replication

Harbor can sync images between registries, which is useful for:

Caching Docker Hub images locally to eliminate rate limits
Syncing staging to prod as part of a release process
Multi-site DR between Harbor instances

Set up a replication endpoint: Administration → Registries → New Endpoint. Add Docker Hub as a source. Then create a replication rule — pull-based (Harbor fetches on schedule) or push-based (Harbor pushes on trigger).

A practical setup: replicate your base images (ubuntu, python, node, alpine) from Docker Hub into Harbor on a nightly schedule. Your builds pull from Harbor. Docker Hub’s rate limits become someone else’s problem.

Name: dockerhub-base-cache
Source: docker.io
Filter: library/{ubuntu,python,node,alpine}
Destination namespace: base-images
Trigger: Scheduled, 0 0 2 * * * (Harbor cron has six fields; the first is seconds)
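Once the images are replicated, your builds need to reference the Harbor copy instead of the Docker Hub name. A small sketch of that rewrite: mirror_ref is a hypothetical helper, and it assumes the rule above lands official images at registry.yourdomain.com/base-images/<name>:<tag>. The exact destination path depends on the rule’s flattening setting, so check one replicated image in the UI first.

```shell
#!/bin/sh
# Sketch: rewrite a Docker Hub image reference to its replicated copy in Harbor.
# mirror_ref is a hypothetical helper. It assumes the rule above lands
# official images at <registry>/base-images/<name>:<tag>.

HARBOR_HOST=${HARBOR_HOST:-registry.yourdomain.com}

mirror_ref() {
  ref=$1
  ref=${ref#docker.io/}   # drop an explicit Docker Hub host, if present
  ref=${ref#library/}     # official images live under library/ on Docker Hub
  printf '%s/base-images/%s\n' "$HARBOR_HOST" "$ref"
}

mirror_ref ubuntu:22.04
mirror_ref docker.io/library/python:3.12-slim
```

You could feed the result to a build as an argument, for example docker build --build-arg BASE_IMAGE="$(mirror_ref python:3.12-slim)" ., with a matching ARG BASE_IMAGE and FROM ${BASE_IMAGE} in the Dockerfile (BASE_IMAGE being a name you choose).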

For staging-to-prod replication, use event-based triggers — push to staging, Harbor automatically mirrors to prod registry. Your prod deploy just pulls what’s already local.

The practical workflow

Here’s the end-to-end flow in a real pipeline:

build:
  stage: build
  script:
    - echo "$HARBOR_TOKEN" | docker login registry.yourdomain.com
        -u 'robot$myapp+ci' --password-stdin
    - docker build -t registry.yourdomain.com/myapp/api:$CI_COMMIT_SHA .
    - docker push registry.yourdomain.com/myapp/api:$CI_COMMIT_SHA
    - docker tag registry.yourdomain.com/myapp/api:$CI_COMMIT_SHA
        registry.yourdomain.com/myapp/api:latest
    - docker push registry.yourdomain.com/myapp/api:latest

deploy:
  stage: deploy
  script:
    - ssh deploy@prod
        "docker pull registry.yourdomain.com/myapp/api:$CI_COMMIT_SHA"
    - ssh deploy@prod "docker compose up -d"

Your prod server pulls from your registry — no Docker Hub in the loop, no human credentials on that machine (robot account, scoped to pull-only), and every image was scanned before it landed there.

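Build → Harbor (scanned on push) → CI gate (check scan results) → prod pull → deploy. That’s the loop. The pipeline above doesn’t show the gate; in practice it’s one more job between build and deploy that asks Harbor’s API for the scan result. Once it’s running, it mostly hums along without needing attention.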

Common gotchas

Certificate trust on every Docker host. This is the one people forget. Harbor works on your laptop, works in CI, fails mysteriously on the prod server because you didn’t install the cert there. Automate cert distribution as part of your infra provisioning — Ansible, cloud-init, whatever you’re using. One less thing to debug at deployment time.

Disk space is now your responsibility. Harbor doesn’t delete old image layers automatically. Deleted tags free the manifest, not the blobs — blobs wait for garbage collection. Configure GC: Administration → Clean Up. Schedule it weekly during off-hours. Also set tag retention rules per project — keep the last 10 tags, drop the rest. Your disk will thank you.

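Garbage collection competes with your pipeline. Since Harbor v2.1, GC is non-blocking, so pushes and pulls keep working while it runs (older releases switched the registry to read-only for the duration). It still leans hard on disk I/O and the database, so plan for it. Run it during a maintenance window or at 2 AM when your pipeline isn’t active.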

Robot account tokens expire. If you set an expiry (you should), set a calendar reminder when you create the account. A pipeline that suddenly can’t push because the robot token expired is a fun incident to explain at standup.
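A calendar reminder works, but a scheduled check is harder to ignore. The sketch below only does the date arithmetic. It assumes you have already read the robot’s expiry from Harbor as a Unix timestamp in seconds, with -1 meaning "never expires". days_left and warn_if_expiring are hypothetical helpers, and the 14-day default is arbitrary.

```shell
#!/bin/sh
# Sketch: turn a robot account's expiry timestamp into "days left".
# days_left and warn_if_expiring are hypothetical helpers. They assume the
# expiry is a Unix timestamp in seconds, with -1 meaning "never expires".

# Print whole days between now ($2, default: current time) and expiry ($1).
days_left() {
  expires_at=$1
  now=${2:-$(date +%s)}
  echo $(( ($expires_at - $now) / 86400 ))
}

# Return non-zero when fewer than $2 days (default 14) remain.
# $3 optionally overrides "now", which makes the function easy to try out.
warn_if_expiring() {
  expires_at=$1
  threshold=${2:-14}
  now=${3:-$(date +%s)}
  [ "$expires_at" -eq -1 ] && return 0
  left=$(days_left "$expires_at" "$now")
  if [ "$left" -lt "$threshold" ]; then
    echo "robot token expires in ${left} day(s)" >&2
    return 1
  fi
  return 0
}

# Example: warn_if_expiring "$ROBOT_EXPIRES_AT" 14
```

Here ROBOT_EXPIRES_AT is a placeholder for wherever you keep that timestamp.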

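Replication lag for new base images. Scheduled replication only knows about tags that existed at the last sync, so a brand-new base image tag won’t be in Harbor yet on first use, and the first build that references it will fail to pull it from Harbor. Warm the cache as part of your base image update process: run the replication rule manually (or push the image to Harbor yourself), then your builds pick it up locally. If you’d rather have Harbor fetch from Docker Hub on first pull, that’s what a proxy cache project is for.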

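robot$ username handling in shells. The dollar sign is literal. Single-quote it in bash and handle it carefully in environment variable interpolation. YAML itself doesn’t care about the dollar sign, but the tools that read it often do: Docker Compose and GitLab CI variables both want $$ for a literal $. Every environment handles this slightly differently and it will catch you at least once.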

Is it worth it?

If you’re running more than one or two services in any kind of production-adjacent context, yes. The combination of actual access control, built-in scanning, and not being at Docker Hub’s mercy is worth the few hours to set it up.

Harbor runs comfortably on modest hardware — 2 vCPUs, 4 GB RAM, and whatever disk your images need. It’s not a resource hog. Once it’s running, it mostly gets out of your way.

Your future self — the one who’s not debugging a failed deploy because Docker Hub decided your pull count was suspicious — will appreciate it.
