Word Splitting: The Bash Gotcha That Corrupts Filenames

By SumGuy 5 min read
Your bash script works fine in testing. Then someone names a file “my file.txt” and everything breaks. The loop doesn’t iterate properly. The file operation runs twice. You’re tearing your hair out wondering why.

Word splitting is the culprit. It’s what bash does by default when you don’t quote variables, and I’ve seen production scripts fail because of it. The good news: it’s avoidable with one habit. Always quote your variables.

What Is Word Splitting?

When you expand a variable without quotes, bash splits the result into separate words wherever it finds a character from IFS, the Internal Field Separator (whitespace by default):

word_split.sh
#!/bin/bash
file="my file.txt"
echo $file # Word splitting happens here
Terminal window
$ bash word_split.sh
my file.txt

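Wait, that looks fine. It only looks fine because echo received two arguments, my and file.txt, and printed them back joined by a single space. Here’s where it breaks: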

word_split_broken.sh
#!/bin/bash
file="my file.txt"
ls -l $file # Word splitting happens
Terminal window
$ bash word_split_broken.sh
ls: cannot access 'my': No such file or directory
ls: cannot access 'file.txt': No such file or directory

Bash split the variable on the space and passed two arguments to ls: my and file.txt. But the actual file is my file.txt.

The fix is quoting:

word_split_fixed.sh
#!/bin/bash
file="my file.txt"
ls -l "$file" # Quoted, no splitting

Now it works.
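A quick way to see the split for yourself is to count the arguments a command actually receives. The sketch below uses a hypothetical helper, count_args, which is not part of the scripts above:

```bash
#!/bin/bash
# count_args is a hypothetical helper for illustration:
# it prints how many arguments it received.
count_args() {
  echo "$#"
}

file="my file.txt"

unquoted_count=$(count_args $file)    # split on the space: two arguments
quoted_count=$(count_args "$file")    # kept whole: one argument

echo "unquoted: $unquoted_count"
echo "quoted:   $quoted_count"
```

The same trick works for any expansion you are unsure about: pass it to the helper and check the count.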

Why This Happens: IFS

The IFS variable defines word splitting characters:

Terminal window
$ printf '%s' "$IFS" | od -c
0000000      \t  \n
0000003

That’s space, tab, newline. By default, unquoted variables split on any of these.

You can change IFS:

custom_ifs.sh
#!/bin/bash
IFS=':'
result="one:two:three"
echo $result # Still splits, but only on colons; prints: one two three

Changing IFS only moves where the split happens; unquoted expansions are still split. The fix stays the same: quote.
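Setting IFS at the top of a script affects every later unquoted expansion, so when you do want a custom separator it is usually safer to scope the change. A minimal sketch, assuming a hypothetical function named split_on_colon that fills a global array called parts:

```bash
#!/bin/bash
# split_on_colon is a hypothetical helper: it splits its first argument
# on ':' and stores the pieces in the global array 'parts'.
split_on_colon() {
  local IFS=':'   # the change is visible only inside this function
  set -f          # disable globbing so a piece like '*' stays literal
  parts=($1)      # unquoted on purpose: splitting is what we want here
  set +f          # re-enable globbing
}

split_on_colon "one:two:three"
echo "count: ${#parts[@]}"
echo "last:  ${parts[2]}"

# Outside the function, IFS is back to space, tab, newline.
if [[ $IFS == $' \t\n' ]]; then
  echo "IFS is still the default"
fi
```

Because IFS is declared local, the rest of the script keeps the default splitting behavior.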

The Loop Disaster

Unquoted variables in loops are particularly dangerous:

loop_broken.sh
#!/bin/bash
files="file1.txt file2.txt file3.txt"
# WRONG
for file in $files; do
  echo "Processing: $file"
done
Terminal window
$ bash loop_broken.sh
Processing: file1.txt
Processing: file2.txt
Processing: file3.txt

Okay, that worked. Now with spaces:

loop_broken2.sh
#!/bin/bash
files="my file1.txt my file2.txt my file3.txt"
# WRONG
for file in $files; do
  echo "Processing: $file"
done
Terminal window
$ bash loop_broken2.sh
Processing: my
Processing: file1.txt
Processing: my
Processing: file2.txt
Processing: my
Processing: file3.txt

Chaos. The loop iterates 6 times instead of 3.

The solution is to use arrays:

loop_fixed.sh
#!/bin/bash
files=("my file1.txt" "my file2.txt" "my file3.txt")
# RIGHT
for file in "${files[@]}"; do
  echo "Processing: $file"
done
Terminal window
$ bash loop_fixed.sh
Processing: my file1.txt
Processing: my file2.txt
Processing: my file3.txt
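The quotes around "${files[@]}" matter just as much as the array itself. The sketch below compares three ways of expanding the same array, again counting arguments with a hypothetical count_args helper:

```bash
#!/bin/bash
files=("my file1.txt" "my file2.txt" "my file3.txt")

# count_args is a hypothetical helper: prints how many arguments it got.
count_args() {
  echo "$#"
}

at_quoted=$(count_args "${files[@]}")     # one argument per element
star_quoted=$(count_args "${files[*]}")   # all elements joined into one string
at_unquoted=$(count_args ${files[@]})     # each element split again on spaces

echo "\"\${files[@]}\": $at_quoted"
echo "\"\${files[*]}\": $star_quoted"
echo "\${files[@]}:   $at_unquoted"
```

Only the quoted [@] form hands the loop exactly one item per array element.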

Or if you’re reading from a command:

loop_mapfile.sh
#!/bin/bash
# Using mapfile to read lines safely
mapfile -t files < <(find . -name "*.txt")
for file in "${files[@]}"; do
  echo "Processing: $file"
done

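This reads the command’s output into an array, one line per element, with the trailing newlines stripped. mapfile needs bash 4 or later, and because it splits on newlines, a filename that itself contains a newline ends up as two elements.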
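Line-based reading assumes no filename contains a newline. When that can’t be guaranteed, a NUL-delimited loop is a common alternative, since NUL is the one byte that can never appear in a path. A minimal sketch, using a scratch directory from mktemp and made-up filenames purely for the demo:

```bash
#!/bin/bash
# Work in a throwaway directory so the example is self-contained.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
touch "$workdir/plain.txt" "$workdir/my file.txt" "$workdir/line"$'\n'"break.txt"

files=()
# -print0 ends each path with a NUL byte; read -d '' reads up to the next NUL.
# IFS= keeps leading/trailing whitespace, -r keeps backslashes literal.
while IFS= read -r -d '' file; do
  files+=("$file")
done < <(find "$workdir" -name "*.txt" -print0)

echo "Found ${#files[@]} files"
```

On bash 4.4 or later, mapfile -d '' -t files < <(find ... -print0) should give the same array in one line.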

The Glob Expansion Problem

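Word splitting and glob expansion are different steps, but they are related: bash applies both to unquoted expansions, splitting first and then expanding any glob characters in the resulting words.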

glob.sh
#!/bin/bash
pattern="*.txt"
# WRONG
for file in $pattern; do
  echo "File: $file"
done

If you have files a.txt and b.txt, this echoes:

File: a.txt
File: b.txt

But if no .txt files exist:

File: *.txt

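You get the literal pattern, because bash leaves a glob that matches nothing unexpanded. Quoting $pattern would not help: quotes suppress globbing altogether, so the loop would always see the literal *.txt. Keep the expansion unquoted and turn on nullglob, which makes an unmatched pattern expand to nothing:

glob_fixed.sh
#!/bin/bash
shopt -s nullglob # Unmatched globs expand to nothing
pattern="*.txt"
# RIGHT
for file in $pattern; do # Unquoted here allows glob expansion
  echo "File: $file"
done

Now the loop body simply doesn’t run when no files match. It’s subtle, but you need glob expansion for * to work, so don’t quote the glob. Just quote everything else.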
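One way to handle the no-match case without changing any shell options is to check that each loop value actually exists before using it. A minimal sketch, run in a scratch directory created with mktemp (an assumption for the demo, not part of the original scripts):

```bash
#!/bin/bash
# Start in an empty scratch directory so that no .txt files exist yet.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

none_count=0
for file in *.txt; do
  # With no match, $file is the literal string '*.txt'; skip it.
  [ -e "$file" ] || continue
  none_count=$((none_count + 1))
done
echo "Empty directory: processed $none_count files"

touch "a.txt" "my b.txt"
two_count=0
for file in *.txt; do
  [ -e "$file" ] || continue
  two_count=$((two_count + 1))
done
echo "Two files: processed $two_count files"
```

The guard costs one extra line per loop, but it keeps working in shells that have no nullglob option.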

The Rule: Always Quote Variables

Here’s the pattern:

quoting_rules.sh
#!/bin/bash
myvar="some value"
# ALWAYS quote when expanding variables:
echo "$myvar"
ls -l "$myvar"
grep "something" "$myvar"
if [ "$myvar" = "test" ]; then
  echo "Matched"
fi
# Exception: glob patterns (let them expand)
for file in *.txt; do
  echo "$file"
done
# Exception: word splitting is intentional
IFS=':' read -r -a parts <<< "$PATH"
for part in "${parts[@]}"; do
  echo "$part"
done

Rule of thumb: Quote every variable expansion unless you explicitly want word splitting.

Real-World Example: File Processing Script

process_files.sh
#!/bin/bash
set -euo pipefail
directory="$1"
# WRONG - breaks with spaces and special chars
for file in $directory/*; do
  echo "Processing: $file"
done
# RIGHT - quoted correctly
for file in "$directory"/*; do
  echo "Processing: $file"
done
# Alternative - explicit array. Note that find recurses into subdirectories.
# NUL delimiters (bash 4.4+) keep names containing newlines intact.
mapfile -d '' -t files < <(find "$directory" -type f -print0)
for file in "${files[@]}"; do
  echo "Processing: $file"
done
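To watch the WRONG and RIGHT loops diverge, you can point both at a directory whose name contains a space. The sketch below builds such a directory under mktemp (the name "my photos" and the .jpg files are made up for the demo) and counts how many real files each loop finds:

```bash
#!/bin/bash
# Build a scratch directory whose name contains a space.
base=$(mktemp -d)
trap 'rm -rf "$base"' EXIT
cd "$base" || exit 1
directory="$base/my photos"
mkdir "$directory"
touch "$directory/a.jpg" "$directory/b.jpg" "$directory/c.jpg"

unquoted_found=0
for file in $directory/*; do      # splits into ".../my" and "photos/*"
  if [ -e "$file" ]; then
    unquoted_found=$((unquoted_found + 1))
  fi
done

quoted_found=0
for file in "$directory"/*; do    # one pattern, expands to the real files
  if [ -e "$file" ]; then
    quoted_found=$((quoted_found + 1))
  fi
done

echo "unquoted loop found $unquoted_found existing files"
echo "quoted loop found $quoted_found existing files"
```

The unquoted loop still iterates, but over two words that are not files at all, which is exactly how a script ends up operating on the wrong paths.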

Check for Unquoted Variables

Find potential word-splitting issues:

Terminal window
$ grep -n '\$[A-Za-z_][A-Za-z0-9_]*[^"]' your_script.sh | head -10

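This is a rough heuristic, not a parser. It lists lines containing $name expansions, but it can’t reliably tell quoted from unquoted ones (the pattern also matches "$file"), and it skips ${name} forms. Treat the output as a review list: check each hit and quote anything that isn’t an intentional glob or split. For a more thorough check, the ShellCheck linter reports unquoted expansions as SC2086.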

The One Thing

Next time you write a loop:

Terminal window
# NEVER
for item in $variable; do
  ...
done
# ONE VALUE: quoted, so the loop runs exactly once with the whole string
for item in "$variable"; do
  ...
done
# MANY VALUES: use an array
for item in "${array[@]}"; do
  ...
done

That habit will save you from the 3 AM “why is the script running things twice” call. Quote variables. Always.

