
GNU parallel for Embarrassingly Parallel Tasks

By SumGuy 5 min read

The Problem: Slow Loops

You have 1000 images. You want to compress them. A simple loop:

for img in *.jpg; do
  convert "$img" -quality 80 "compressed_$img"
done

Runs one image at a time. On a 4-core system, 3 cores sit idle the whole time.

With GNU parallel, all cores work: on that 4-core system, a 60-minute batch drops to roughly 15.

Installation

With a package manager:

sudo apt install parallel # Debian/Ubuntu
sudo yum install parallel # CentOS/RHEL
brew install parallel # macOS

Or compile from source:

wget https://ftp.gnu.org/gnu/parallel/parallel-latest.tar.bz2
tar xjf parallel-latest.tar.bz2
cd parallel-*
./configure && make && sudo make install

Basic Usage

Compress those images:

ls *.jpg | parallel convert {} -quality 80 "compressed_{}"

{} is a placeholder for each input item. Parallel distributes the work across all CPU cores.

By default, it uses as many jobs as you have cores. Force a specific number:

ls *.jpg | parallel -j 2 convert {} -quality 80 "compressed_{}"

-j 2 runs only 2 jobs at a time (useful if each job is memory-hungry).

Real-World Examples

Download Files in Parallel

cat urls.txt | parallel -j 4 wget -q {}

Downloads 4 files at once instead of one-by-one. Dramatically faster.

Compress Files

find . -name "*.log" | parallel gzip {}

Compress all logs in parallel. Way faster than a for loop.

Process CSV Rows

# Each row is a user, run a script on each in parallel
tail -n +2 users.csv | parallel -j 8 process_user.sh {}

tail -n +2 skips the header. Process 8 rows at a time.

Run a Function on Each Item

#!/bin/bash
process_item() {
  local id=$1
  echo "Processing $id..."
  sleep 1 # Fake work
  echo "Done $id"
}
export -f process_item

seq 1 10 | parallel process_item {}

export -f makes the function available to parallel. Now each worker runs it.

Controlling Workers

Run on All Cores

parallel -j 0 command {} # No job limit

-j 0 removes the limit entirely: parallel starts as many jobs at once as system resources (processes, file handles) allow, which can be far more than your core count.

Run on N Cores

parallel -j 4 command {}

Lock to 4 parallel jobs.

Use a Fraction of Cores

parallel -j 50% command {}

Run on 50% of available cores. Leaves room for other work.

Run on a Remote Machine

parallel -S user@remote.example.com command {}

Parallel can distribute work across multiple machines. Useful for clusters.

Quoting and Escaping

Unlike xargs, parallel quotes each input value by default, so a filename like file with spaces.txt arrives as a single argument even with a bare {}:

echo 'file with spaces.txt' | parallel echo {}

If the command itself contains shell special characters that should reach the job literally rather than being interpreted, add --quote (-q):

echo 'file with spaces.txt' | parallel --quote echo {}

Named Placeholders

Instead of a single {}, use positional placeholders. With --colsep, each input line is split into columns:

cat list.txt | parallel --colsep , echo "ID: {1}, Name: {2}"

Or feed arguments inline with ::: and group them with -N:

parallel -N2 echo "User: {2} (ID: {1})" ::: 123 alice 456 bob

::: supplies arguments on the command line. -N2 hands them to each job two at a time; {1} is the first of each pair, {2} the second.

Handling Errors

By default, parallel continues even if a job fails. See errors:

parallel --joblog /tmp/job.log command {} < input.txt

--joblog writes results to a file. Check it:

cat /tmp/job.log

Output:

Seq Host Starttime JobRuntime Send Receive Exitval Signal Command
1 : 1618590123 1.234 0 100 0 0 command input1
2 : 1618590124 1.567 0 50 1 0 command input2

Exitval is the job's exit code; anything nonzero means that job failed.

Stop on first failure:

parallel --halt now,fail=1 command {} < input.txt

--halt now,fail=1 kills the remaining jobs and exits as soon as any job fails. Use --halt soon,fail=1 to let already-running jobs finish first.

Progress and Monitoring

Show progress bar:

parallel --eta command {} < input.txt

--eta shows estimated time remaining.

Or use --progress:

parallel --progress command {} < input.txt

Shows how many jobs are done.

Real Example: Batch Image Processing

You have 10,000 photos. Resize and compress them:

#!/bin/bash
WORKERS=4
QUALITY=80
SIZE="1920x1080"

mkdir -p ./output

find ./photos -type f \( -name "*.jpg" -o -name "*.png" \) |
  parallel -j "$WORKERS" \
    convert {} \
      -resize "$SIZE^" \
      -quality "$QUALITY" \
      -gravity center \
      -extent "$SIZE" \
      "./output/{/}"

On a 4-core system, this is 4x faster than a loop.

Real Example: Database Backups

Backup multiple databases in parallel:

databases=("db1" "db2" "db3" "db4")
printf '%s\n' "${databases[@]}" |
  parallel -j 2 'mysqldump {} | gzip > "backup_{}.sql.gz"'

Backs up 2 databases at a time. Note the single quotes around the whole command: without them, the shell would attach the gzip pipe to parallel itself rather than to each job, and the {} in the output filename would never be replaced.

Alternatives

GNU parallel is powerful but has a learning curve.

xargs is simpler for basic cases:

find . -name "*.log" | xargs -P 4 gzip

-P 4 runs 4 processes in parallel. Works for simple tasks.

Rust’s rayon (if you’re writing Rust):

use rayon::prelude::*;

files.par_iter().for_each(|file| {
    process(file);
});

Python’s multiprocessing:

from multiprocessing import Pool

with Pool(4) as p:
    p.map(process, files)

For shell scripts though, GNU parallel is the go-to.

Bottom Line

GNU parallel turns a slow for-loop into a blazing-fast parallel job distributor. If you’re processing hundreds or thousands of items on a multi-core system, 10 minutes learning parallel buys you hours of compute time back.

Just remember: parallel isn’t magic. It can’t make one job faster. It can only run multiple jobs at once. Pick it when you have embarrassingly parallel work—tasks that don’t depend on each other.

