SumGuy's Ramblings

Open Source Security: Scanning Your Dependencies Before They Scan You

The Package You Didn’t Know You Depended On Is the One That’ll Hurt You

March 2024: an engineer named Andres Freund noticed that sshd logins were taking about half a second longer than expected. That small slowdown led him to discover that xz-utils, a compression library present on almost every Linux system, had been backdoored by a contributor using the name “Jia Tan,” who spent roughly two years building trust in the project before inserting a payload that would have allowed remote code execution as root through sshd on affected systems.

Two years of patient work. One library. Potentially every Linux server on the internet.

This isn’t theoretical. This is the threat model you’re operating in when you npm install or pip install or go get without thinking about what you’re pulling in.

What’s in Your Dependency Tree

Run this on any Node.js project and prepare for an existential moment:

npm ls --all 2>/dev/null | wc -l

That number, often in the hundreds or thousands, is roughly one line per package in your dependency tree, direct and transitive. You explicitly installed maybe 20. The rest came along for the ride.
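To see the split between what you asked for and what you got, you can read the lock file directly. This is a minimal sketch, assuming a package-lock.json with lockfileVersion 2 or 3 (where the "packages" map has one entry per installed path and the "" key is the project itself). The function name and the sample data are made up for illustration, and peer and optional dependencies are ignored for brevity.

```python
def count_dependencies(lock: dict) -> tuple[int, int]:
    """Return (direct, installed) counts from a parsed package-lock.json.

    Assumes lockfileVersion 2 or 3: "packages" maps install paths to
    entries, and the "" key describes the project itself.
    """
    packages = lock.get("packages", {})
    root = packages.get("", {})
    # Direct dependencies are the ones the project itself declares.
    direct = set(root.get("dependencies", {})) | set(root.get("devDependencies", {}))
    # Every non-root entry is something that ends up in node_modules.
    installed = sum(1 for path in packages if path != "")
    return len(direct), installed


# Tiny made-up lock file: one declared dependency pulls in two more.
sample_lock = {
    "lockfileVersion": 3,
    "packages": {
        "": {"name": "demo", "dependencies": {"express": "^4.0.0"}},
        "node_modules/express": {"version": "4.0.0"},
        "node_modules/body-parser": {"version": "1.0.0"},
        "node_modules/bytes": {"version": "3.0.0"},
    },
}

direct, installed = count_dependencies(sample_lock)
print(f"{direct} direct, {installed} installed, {installed - direct} transitive")
```

On a real project, load the file with json.load and pass the result in.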

The same is true for Python, Go, Rust, and every other ecosystem. You’re not just trusting the packages you chose — you’re trusting their authors’ dependency choices, and their dependencies’ authors’ choices, all the way down.

SBOM (Software Bill of Materials) is the formal answer to “what is in my software.” It’s a machine-readable list of every component, version, and license. Like a nutritional label for software.

syft: Generating SBOMs

Syft scans an image or directory and produces a complete SBOM in multiple formats:

# Install
curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin

# Scan a Docker image
syft ubuntu:22.04

# Scan a directory
syft /path/to/your/project

# Output formats
syft ubuntu:22.04 -o json > sbom.json
syft ubuntu:22.04 -o spdx-json > sbom-spdx.json
syft ubuntu:22.04 -o cyclonedx-json > sbom-cdx.json

SPDX and CycloneDX are the two dominant SBOM standards. For most purposes, CycloneDX JSON is a practical default: Grype, Trivy, and many other downstream tools can read it.

For a Go project:

syft . -o cyclonedx-json > sbom.json

Syft detects packages from go.mod, package-lock.json, requirements.txt, Pipfile.lock, Gemfile.lock, composer.lock, Cargo.lock, and more. Where a lock file exists, it reads the resolved versions, not just what you declared.
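Once you have a CycloneDX JSON file, it is plain JSON and easy to inspect. The sketch below lists the components from a parsed SBOM. It assumes the standard top-level "components" array with "name", "version", and "purl" fields; the helper name and the sample document are illustrative, not Syft output.

```python
def list_components(sbom: dict) -> list[tuple[str, str, str]]:
    """Return sorted (name, version, purl) rows from a CycloneDX JSON SBOM."""
    rows = []
    for component in sbom.get("components", []):
        rows.append((
            component.get("name", "?"),
            component.get("version", "?"),
            component.get("purl", ""),  # package URL, when the tool provides one
        ))
    return sorted(rows)


# Hand-written sample in CycloneDX shape (hypothetical packages).
sample_sbom = {
    "bomFormat": "CycloneDX",
    "specVersion": "1.5",
    "components": [
        {"type": "library", "name": "left-pad", "version": "1.3.0",
         "purl": "pkg:npm/left-pad@1.3.0"},
        {"type": "library", "name": "example-lib", "version": "0.1.0"},
    ],
}

for name, version, purl in list_components(sample_sbom):
    print(f"{name:15} {version:10} {purl}")
```

For a real file, parse sbom.json with json.load and pass the resulting dict to list_components.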

grype: Vulnerability Scanning Against SBOMs

Grype takes an SBOM (or scans directly) and checks everything against vulnerability databases:

# Install
curl -sSfL https://raw.githubusercontent.com/anchore/grype/main/install.sh | sh -s -- -b /usr/local/bin

# Scan directly
grype ubuntu:22.04

# Scan from SBOM
grype sbom:./sbom.json

# Exit non-zero if anything is HIGH or CRITICAL (this sets the exit code; it does not filter the output)
grype ubuntu:22.04 --fail-on high

# Output as table (default), JSON, or template
grype ubuntu:22.04 -o json | jq '.matches[] | select(.vulnerability.severity == "Critical")'

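Grype checks against NVD, GitHub Security Advisories, Ubuntu Security Notices, Red Hat, and others, correlating package versions against known CVEs. For each match it reports the package, the installed version, the version containing a fix (if any), the vulnerability ID, and the severity.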

Use --fail-on critical in CI to block deployments with critical vulnerabilities.
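If you want more control than --fail-on gives you (an allow-list, custom reporting), you can apply the same gate yourself to Grype's JSON output. This is a rough sketch, assuming the "matches" array with "vulnerability.severity" and "artifact.name" fields that the jq filter above already relies on. The function name, the ranking table, and the sample report are assumptions for illustration.

```python
# Grype severity labels, lowest to highest. Anything else (e.g. "Unknown")
# is treated as below every threshold in this sketch.
SEVERITY_RANK = {"negligible": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


def blocking_matches(report: dict, threshold: str = "high") -> list[tuple[str, str, str]]:
    """Return (vuln id, package, severity) for matches at or above threshold."""
    floor = SEVERITY_RANK[threshold.lower()]
    blocked = []
    for match in report.get("matches", []):
        vuln = match.get("vulnerability", {})
        artifact = match.get("artifact", {})
        rank = SEVERITY_RANK.get(str(vuln.get("severity", "")).lower(), -1)
        if rank >= floor:
            blocked.append((vuln.get("id", "?"), artifact.get("name", "?"),
                            vuln.get("severity", "?")))
    return blocked


# Made-up report in the shape of `grype -o json` output; IDs are placeholders.
sample_report = {
    "matches": [
        {"vulnerability": {"id": "CVE-0000-0001", "severity": "Critical"},
         "artifact": {"name": "libexample", "version": "1.0"}},
        {"vulnerability": {"id": "CVE-0000-0002", "severity": "Low"},
         "artifact": {"name": "otherlib", "version": "2.1"}},
    ]
}

for vuln_id, package, severity in blocking_matches(sample_report, "high"):
    print(f"BLOCK {severity:8} {vuln_id} in {package}")
```

In CI you would exit non-zero when the returned list is non-empty.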

trivy: The Swiss Army Knife for Container Security

Trivy from Aqua Security does everything: container scanning, filesystem scanning, git repo scanning, IaC scanning (Terraform, Kubernetes manifests):

# Install (official install script; release tarballs carry the version in their file name)
curl -sfL https://raw.githubusercontent.com/aquasecurity/trivy/main/contrib/install.sh | sudo sh -s -- -b /usr/local/bin

# Scan a container image
trivy image nginx:latest

# Scan a local directory
trivy fs /path/to/project

# Scan a git repo (checks secrets too)
trivy repo https://github.com/someone/repo

# Scan Kubernetes configs
trivy config ./kubernetes/

# JSON output
trivy image --format json -o results.json nginx:latest

Trivy is what you reach for when you want one tool that handles most of your security scanning needs. It’s fast, regularly updated, and the output is readable.

For containers, it also scans for more than vulnerable packages, including secrets baked into image layers. The secrets scanning is genuinely useful: developers accidentally bake credentials into Docker images more often than you’d think.
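The JSON report written by the --format json command above can be boiled down to a quick summary. The sketch below assumes Trivy's report layout with a top-level "Results" array whose entries may carry "Vulnerabilities" and "Secrets" lists; either key can be missing or null, so the code guards for both. The function name and sample data are illustrative.

```python
from collections import Counter


def summarize(report: dict) -> tuple[Counter, int]:
    """Return (vulnerability counts by severity, number of secret findings)."""
    severities = Counter()
    secrets = 0
    for result in report.get("Results") or []:
        # Either list can be absent or null when a target has no findings.
        for vuln in result.get("Vulnerabilities") or []:
            severities[vuln.get("Severity", "UNKNOWN")] += 1
        secrets += len(result.get("Secrets") or [])
    return severities, secrets


# Made-up report in Trivy's JSON shape; IDs and paths are placeholders.
sample_report = {
    "Results": [
        {"Target": "demo (debian 12)", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "PkgName": "libexample", "Severity": "CRITICAL"},
            {"VulnerabilityID": "CVE-0000-0002", "PkgName": "libexample", "Severity": "HIGH"},
            {"VulnerabilityID": "CVE-0000-0003", "PkgName": "otherlib", "Severity": "HIGH"},
        ]},
        {"Target": "/app/.env", "Secrets": [
            {"RuleID": "example-rule", "Severity": "CRITICAL", "Title": "Example token"},
        ]},
        {"Target": "clean-target", "Vulnerabilities": None},
    ]
}

severities, secrets = summarize(sample_report)
print(dict(severities), f"secrets={secrets}")
```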

osv-scanner: Google’s Take on Dependency Scanning

osv-scanner uses the OSV (Open Source Vulnerabilities) database maintained by Google, which aggregates from multiple sources including GitHub Security Advisories, OSS-Fuzz, and ecosystem databases such as PyPA and RustSec:

# Install
go install github.com/google/osv-scanner/cmd/osv-scanner@latest

# Scan a project
osv-scanner --recursive /path/to/project

# Scan a lockfile specifically
osv-scanner --lockfile package-lock.json

# Output JSON
osv-scanner --json --recursive . > osv-results.json

OSV data is public and available at osv.dev. It tends to pick up newly disclosed vulnerabilities in open source packages sooner than NVD.
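Because the data is public, you can also ask osv.dev about a single package version without installing anything. The sketch below targets the public query endpoint (POST https://api.osv.dev/v1/query), which takes a package name, ecosystem, and version and returns a "vulns" list. The helper names are made up, the sample response is a placeholder rather than real advisory data, and query_osv needs network access, so it is defined but not called here.

```python
import json
import urllib.request

OSV_QUERY_URL = "https://api.osv.dev/v1/query"


def build_query(name: str, ecosystem: str, version: str) -> dict:
    """Build the request body for a single package/version lookup."""
    return {"package": {"name": name, "ecosystem": ecosystem}, "version": version}


def vulnerability_ids(response: dict) -> list[str]:
    """Extract advisory IDs; a response with no "vulns" key means none known."""
    return [vuln["id"] for vuln in response.get("vulns", [])]


def query_osv(name: str, ecosystem: str, version: str) -> list[str]:
    """Send the query to osv.dev (requires network access)."""
    body = json.dumps(build_query(name, ecosystem, version)).encode("utf-8")
    request = urllib.request.Request(
        OSV_QUERY_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return vulnerability_ids(json.load(response))


# Offline demonstration with a placeholder response.
print(build_query("some-package", "npm", "1.2.3"))
print(vulnerability_ids({"vulns": [{"id": "GHSA-xxxx-xxxx-xxxx"}]}))
```

Ecosystem names are case-sensitive strings such as "npm", "PyPI", and "Go".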

GitHub Dependabot: Automated PRs for Free

If your project is on GitHub, Dependabot gives you automated dependency update PRs for the cost of one config file:

.github/dependabot.yml:

version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule:
      interval: "weekly"
    labels:
      - "dependencies"
    ignore:
      - dependency-name: "some-legacy-dep"
        versions: [">= 2.0.0"]  # Skip 2.x and later for this dependency only

  - package-ecosystem: "docker"
    directory: "/"
    schedule:
      interval: "weekly"

  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: "weekly"

Dependabot opens PRs that update dependency versions. CI runs against them. You review and merge. Your dependencies stay fresh without manually tracking changelogs.

Enable security alerts separately in the repo's Settings, under the code security section (Dependabot alerts). That notifies you when an advisory is published for something you depend on.

Integrating into CI/CD

Here’s a GitHub Actions workflow that scans on every push and pull request and fails the build on high or critical vulnerabilities:

name: Security Scan

on:
  push:
    branches: [main, develop]
  pull_request:

jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      security-events: write  # needed by the SARIF upload step
    steps:
      - uses: actions/checkout@v4

      - name: Generate SBOM
        uses: anchore/sbom-action@v0
        with:
          artifact-name: sbom.spdx.json

      - name: Vulnerability Scan
        uses: anchore/scan-action@v3
        with:
          path: "."  # scan the checked-out directory; use image: for a container image
          fail-build: true
          severity-cutoff: high

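      - name: Trivy Container Scan
        # Assumes an earlier step built and tagged this image.
        # Pin the action to a release tag or commit SHA rather than master.
        uses: aquasecurity/trivy-action@master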
        with:
          image-ref: 'your-image:${{ github.sha }}'
          format: 'sarif'
          output: 'trivy-results.sarif'
          severity: 'CRITICAL,HIGH'

      - name: Upload Trivy results to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: 'trivy-results.sarif'

The SARIF upload integrates scan results directly into GitHub’s Security tab — you see vulnerabilities as code scanning alerts, right next to your source code.
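SARIF itself is just JSON, so the same file can feed a quick local summary before it ever reaches GitHub. This is a small sketch, assuming the SARIF 2.1.0 layout of "runs" containing "results", each with an optional "level". Treating a missing level as "warning" is a simplification (the spec lets the rule's default configuration decide), and the function name and sample are illustrative.

```python
from collections import Counter


def count_by_level(sarif: dict) -> Counter:
    """Count SARIF results per level ("error", "warning", "note", ...)."""
    counts = Counter()
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            # Simplification: fall back to "warning" when no level is set.
            counts[result.get("level", "warning")] += 1
    return counts


# Made-up SARIF document with placeholder rule IDs.
sample_sarif = {
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "ExampleScanner"}},
        "results": [
            {"ruleId": "CVE-0000-0001", "level": "error",
             "message": {"text": "example finding"}},
            {"ruleId": "CVE-0000-0002", "level": "warning",
             "message": {"text": "example finding"}},
            {"ruleId": "CVE-0000-0003",
             "message": {"text": "no explicit level"}},
        ],
    }],
}

print(dict(count_by_level(sample_sarif)))
```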

License Compliance Scanning

Not all open source is “use it in anything” open source. GPL code in your proprietary project is a licensing problem. Syft includes license detection:

syft /path/to/project -o json | jq '.artifacts[] | {name: .name, licenses: .licenses}'

For dedicated license scanning, look at FOSSA (which has a free tier), or licensee for simpler cases.
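As a first pass, you can flag copyleft licenses straight from the Syft JSON produced above. The sketch below is deliberately crude: it matches license names by prefix, so compound expressions like "MIT OR GPL-2.0" slip through. It assumes each artifact's "licenses" entry is either a plain string or an object with "spdxExpression" or "value" (the shape has varied across Syft versions), and the helper names and sample data are made up.

```python
COPYLEFT_PREFIXES = ("GPL", "AGPL", "LGPL")


def license_names(artifact: dict) -> list[str]:
    """Normalize an artifact's licenses to a list of non-empty strings."""
    names = []
    for entry in artifact.get("licenses") or []:
        if isinstance(entry, str):
            names.append(entry)
        elif isinstance(entry, dict):
            names.append(entry.get("spdxExpression") or entry.get("value") or "")
    return [name for name in names if name]


def flag_copyleft(report: dict) -> list[tuple[str, list[str]]]:
    """Return (package, matching licenses) for prefix-matched copyleft licenses."""
    flagged = []
    for artifact in report.get("artifacts", []):
        hits = [name for name in license_names(artifact)
                if name.upper().startswith(COPYLEFT_PREFIXES)]
        if hits:
            flagged.append((artifact.get("name", "?"), hits))
    return flagged


# Made-up report covering both license shapes.
sample_report = {
    "artifacts": [
        {"name": "permissive-lib", "licenses": ["MIT"]},
        {"name": "copyleft-lib", "licenses": [{"value": "GPL-3.0-only",
                                               "spdxExpression": "GPL-3.0-only"}]},
        {"name": "unlicensed-lib", "licenses": None},
    ]
}

print(flag_copyleft(sample_report))
```

Treat the output as a list of packages to look at more closely, not a compliance verdict.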

OpenSSF Scorecard: Project Health Metrics

Scorecard from the Open Source Security Foundation evaluates a project’s security practices:

# Needs a GitHub token in GITHUB_AUTH_TOKEN
scorecard --repo=github.com/some-project/some-repo

It checks: branch protection, code review practices, dependency pinning, vulnerability disclosure policy, CI/CD practices, and more. Useful for evaluating third-party dependencies before you add them.
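When you are comparing several candidate dependencies, it helps to pull out only the checks that scored badly. The sketch below assumes Scorecard's JSON output (--format=json) with a "checks" array of objects carrying "name" and "score", where -1 means the check was inconclusive. The function name, the cutoff of 5, and the sample scores are assumptions for illustration.

```python
def weak_checks(report: dict, minimum: int = 5) -> list[tuple[str, int]]:
    """Return (check name, score) for conclusive checks scoring below minimum."""
    weak = [
        (check.get("name", "?"), check.get("score", -1))
        for check in report.get("checks", [])
        # Skip inconclusive checks (-1) so they are not mistaken for failures.
        if 0 <= check.get("score", -1) < minimum
    ]
    return sorted(weak, key=lambda item: item[1])


# Made-up scores using real Scorecard check names.
sample_report = {
    "repo": {"name": "github.com/some-project/some-repo"},
    "score": 4.2,
    "checks": [
        {"name": "Branch-Protection", "score": 3},
        {"name": "Code-Review", "score": 8},
        {"name": "Pinned-Dependencies", "score": 0},
        {"name": "Security-Policy", "score": -1},
    ],
}

for name, score in weak_checks(sample_report):
    print(f"{score:2}/10  {name}")
```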

The Practical Workflow

You don’t need all of these tools running on every commit. Start here:

  1. Add Dependabot — takes 5 minutes, runs forever, keeps dependencies fresh
  2. Add Trivy to CI — scan your Docker image on every build, fail on Critical
  3. Generate an SBOM as an artifact in your release process
  4. Run osv-scanner locally before any significant dependency update

That covers most of your exposure with reasonable maintenance overhead. You can always add more layers as your risk tolerance demands. But running nothing at all is the choice that eventually ends in an incident post-mortem.

The XZ backdoor almost made it through. It was caught by accident, by someone who noticed about half a second of extra login latency. Build the kind of pipeline where accidents like that are less necessary.

