Why Your SSH Connection Keeps Dropping

You SSH into a server. Step away for 10 minutes. Come back to a dead connection.

Connection closed by remote host.
Connection to prod.example.com closed.

Your SSH didn’t crash. Your connection did. Why? And how do you fix it?

The Problem: Idle Connection Timeouts

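Most firewalls and NAT gateways track every connection in a state table and drop entries that sit idle, commonly after 15-30 minutes. They’re being conservative: it keeps dead connections from piling up and leaking resources.

SSH is chatty when you’re typing. But when you’re idle? Silent. The firewall sees no traffic, assumes the connection is dead, and drops it. Neither end is told, which is why the session looks fine until you touch it.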

Solution 1: Client-Side (Your Machine)

Tell your SSH client to send keepalive packets by adding this to ~/.ssh/config:

Host *
    ServerAliveInterval 60
    ServerAliveCountMax 10

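ServerAliveInterval 60: If nothing has been received from the server for 60 seconds, send a keepalive request through the encrypted channel. The default is 0, which disables these keepalives entirely.

ServerAliveCountMax 10: If 10 keepalives in a row go unanswered (about 600 seconds of silence), give up and close the connection. The default is 3.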
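
The 600-second figure is simply the interval multiplied by the count. A minimal sketch for trying other combinations; keepalive_window is a hypothetical helper name, not part of OpenSSH:

```shell
# keepalive_window is a hypothetical helper, not an OpenSSH command.
# Prints how many seconds of silence ssh tolerates before giving up:
# ServerAliveInterval ($1) multiplied by ServerAliveCountMax ($2).
keepalive_window() {
    echo $(( $1 * $2 ))
}

keepalive_window 60 10   # prints 600
keepalive_window 30 20   # prints 600
```

A short interval with a high count and a long interval with a low count can give the same window; the interval is what keeps the firewall happy, the count only decides how patient ssh is with an unresponsive server.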

Now your connection stays alive:

$ ssh prod
# After 60 seconds of inactivity, SSH sends a keepalive
# Server responds. Connection stays alive.
# Repeat every 60 seconds forever.

Test it:

$ ssh prod
(sit idle for longer than the timeout that used to kill your session)
# Still connected!
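
Waiting is slow. A quicker check is to ask ssh which settings it would apply to a host: ssh -G prints the resolved configuration without connecting. This sketch assumes an OpenSSH client recent enough to support -G (6.8 or later); filter_keepalive is a hypothetical helper name:

```shell
# filter_keepalive is a hypothetical helper: it keeps only the
# keepalive-related lines from `ssh -G` output read on stdin.
# `ssh -G` prints option names in lowercase, one per line.
filter_keepalive() {
    grep -E '^(serveraliveinterval|serveralivecountmax|tcpkeepalive) '
}

# Usage (prod is the host alias used in this post):
#   ssh -G prod | filter_keepalive
```

If the output shows serveraliveinterval 0, the keepalive settings are not being picked up for that host.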

Solution 2: Server-Side (Remote)

If you can’t control the client, configure the server. In /etc/ssh/sshd_config:

ClientAliveInterval 60
ClientAliveCountMax 10

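Same idea, but the server sends the keepalive packets, and it drops a client that misses 10 in a row.

Check the config for syntax errors, then restart SSH (the service is named ssh on Debian and Ubuntu, sshd on RHEL, Fedora, and Arch):

sudo sshd -t
sudo systemctl restart ssh

New connections from any client to this server now stay alive.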

Which One to Use?

Client-side (recommended): You control your machine. One config change affects all servers you connect to.

Server-side: If you manage the server and can’t trust clients to configure themselves.

Both: Belt and suspenders. Redundant, but works everywhere.

TCPKeepAlive: The Confusing Option

There’s another option:

TCPKeepAlive yes

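This uses the OS-level TCP keepalive, not SSH’s. It is already on by default in OpenSSH, but its timing comes from the kernel, and it’s usually too slow: Linux waits 2 hours of idle time before sending the first probe.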

Leave it alone. Use ServerAliveInterval instead.
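
To see how slow the OS-level keepalive is on a particular machine, read the kernel’s timer. A Linux-only sketch; tcp_keepalive_idle is a hypothetical helper name, and the optional path argument exists only so the function can be pointed at a different file:

```shell
# tcp_keepalive_idle is a hypothetical helper (Linux only).
# Prints the idle time, in seconds, before the kernel sends its first
# TCP keepalive probe, or "unavailable" if the file cannot be read.
tcp_keepalive_idle() {
    file="${1:-/proc/sys/net/ipv4/tcp_keepalive_time}"
    if [ -r "$file" ]; then
        cat "$file"
    else
        echo "unavailable"
    fi
}

tcp_keepalive_idle
```

On a Linux box with default settings this typically prints 7200, far longer than any firewall timeout mentioned in this post.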

Real-World Example

# ssh uses the first value it finds for each option,
# so specific hosts go first and Host * goes last.

Host bastion
    # Bastion is critical; check more frequently
    ServerAliveInterval 30
    ServerAliveCountMax 20

Host prod-*
    # Prod is stable; less aggressive
    ServerAliveInterval 120
    ServerAliveCountMax 5

Host *
    # Keepalive every 60 seconds
    ServerAliveInterval 60
    # Give up after 10 minutes of no response
    ServerAliveCountMax 10

Debugging: See What’s Happening

Enable verbose mode:

$ ssh -v prod
...
debug1: Sending SSH2_MSG_GLOBAL_REQUEST "keepalive@openssh.com" message
...

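You’ll see keepalive packets being sent. The exact wording varies between OpenSSH versions, and you may need -vvv to get packet-level detail.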

Common Timeout Culprits

1. NAT Gateway (Your WiFi/Router)

Closes idle connections after 15-30 minutes. Keepalive fixes it by sending packets before timeout.

2. Firewall (Corporate/VPN)

Can be aggressive. Some drop connections after 5 minutes of inactivity.

# If 5-min timeout is your pattern, try a keepalive every 4 minutes.
# (Comments go on their own line; older OpenSSH rejects trailing ones.)
ServerAliveInterval 240

3. Bastion/Jump Host

Jump hosts sometimes close idle tunnels. Fix:

Host bastion
    ServerAliveInterval 30

Host *.internal
    ProxyJump bastion
    ServerAliveInterval 30

4. Slow/Congested Links

Keepalive replies might get delayed or lost. Tolerate more missed keepalives before giving up:

# 20 missed keepalives at 30 seconds each = 10 minutes before giving up
ServerAliveInterval 30
ServerAliveCountMax 20
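
For the NAT and firewall cases, the interval only has to be shorter than the device’s idle timeout. A rough sketch for picking one, assuming you know (or have measured) that timeout in seconds; the helper name suggest_interval and the 80% safety margin are assumptions made for this example, not an OpenSSH rule:

```shell
# suggest_interval is a hypothetical helper.
# Given an idle timeout in seconds, prints a ServerAliveInterval at
# 80% of it (an assumed safety margin), so a keepalive lands before
# the firewall or NAT gives up on the connection.
suggest_interval() {
    echo $(( $1 * 4 / 5 ))
}

suggest_interval 300    # 5-minute firewall: prints 240
suggest_interval 900    # 15-minute NAT: prints 720
```

If you don’t know the timeout, the 60-second default used throughout this post is well under every figure above.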

Server-Side Debugging

Check what’s configured:

$ grep -i clientalive /etc/ssh/sshd_config
ClientAliveInterval 60
ClientAliveCountMax 10
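
That grep only sees what is written in that one file; it misses defaults and anything set in other included files. sshd -T prints the effective configuration instead. A sketch that turns its output into the server’s disconnect window in seconds; client_alive_window is a hypothetical helper name:

```shell
# client_alive_window is a hypothetical helper.
# Reads `sshd -T` output on stdin (lowercase "keyword value" lines)
# and prints ClientAliveInterval * ClientAliveCountMax in seconds,
# "disabled" if the interval is 0, or "unknown" if either is missing.
client_alive_window() {
    awk '
        $1 == "clientaliveinterval" { interval = $2 }
        $1 == "clientalivecountmax" { count = $2 }
        END {
            if (interval == "" || count == "") { print "unknown" }
            else if (interval == 0) { print "disabled" }
            else { print interval * count }
        }'
}

# Usage (needs root to read the host keys):
#   sudo sshd -T | client_alive_window
```

A result of disabled means the server is not sending keepalives at all, which is the OpenSSH default.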

See active SSH connections:

$ w
USER TTY FROM LOGIN@ IDLE WHAT
admin pts/0 client.ip Fri10 4:30m /bin/bash

# 4:30m idle = 4 hours 30 minutes (a bare 4:30 would be minutes:seconds)
# Server is keeping them alive despite inactivity

Comparing Options

Option	Sent by	Configured in	Reliability
ServerAliveInterval	Client (ssh)	~/.ssh/config, per host	High (SSH level)
ClientAliveInterval	Server (sshd)	sshd_config, server-wide	High (SSH level)
TCPKeepAlive	OS (kernel)	Timing is system-wide	Poor (2-hour default on Linux)

Best: Client-side ServerAliveInterval. You control it, and it applies to every server you connect to.

Pro Tip: Specific Hosts

Different servers need different tactics:

# Home server (stable, keep it simple)
Host home
    ServerAliveInterval 300
    ServerAliveCountMax 20

# Cloud server (cross-continental link, more aggressive)
Host cloud
    ServerAliveInterval 30
    ServerAliveCountMax 10

# Bastion/jump host (critical path, very aggressive)
Host bastion
    ServerAliveInterval 15
    ServerAliveCountMax 60

# Everything else
Host *
    ServerAliveInterval 60
    ServerAliveCountMax 10

Gotchas

Keepalive wastes bandwidth (slightly). Every 60 seconds, a tiny packet crosses the network. On mobile or metered connections, this matters. Adjust accordingly:

# Metered/mobile: check every 5 minutes
ServerAliveInterval 300

# Office: check every 30 seconds
ServerAliveInterval 30
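
To put a number on “slightly”: an idle session sends one keepalive per interval, so the daily count is 86,400 seconds divided by the interval. A small sketch with a hypothetical helper name; it counts requests only, since the bytes per packet depend on the cipher and on TCP/IP overhead:

```shell
# keepalives_per_day is a hypothetical helper.
# Prints how many keepalive requests a fully idle session sends in
# 24 hours for a given ServerAliveInterval (in seconds).
keepalives_per_day() {
    echo $(( 86400 / $1 ))
}

keepalives_per_day 30    # prints 2880
keepalives_per_day 60    # prints 1440
keepalives_per_day 300   # prints 288
```

Keepalives are only sent while the session is otherwise silent, so an active session sends fewer than this.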

Server config doesn’t help if the client is the problem. If your client app closes on inactivity, no server-side config will save you. Use tmux/screen instead: they run server-side, so your shell and its running jobs survive the disconnect and you can reattach after reconnecting.
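
A common pattern is to start or reattach a tmux session in the same command that connects, so a dropped connection costs only a reconnect. A sketch, assuming tmux is installed on the server; ssh_tmux and the default session name main are hypothetical names:

```shell
# ssh_tmux is a hypothetical wrapper.
# Connects to host $1 and attaches to tmux session $2 (default: main),
# creating the session if it does not exist yet (-A).
# -t forces a terminal allocation, which tmux needs.
ssh_tmux() {
    ssh -t "$1" tmux new-session -A -s "${2:-main}"
}

# Usage:
#   ssh_tmux prod          # attach to or create session "main" on prod
#   ssh_tmux prod deploy   # same, for a session named "deploy"
```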

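Stale SSH processes. A hung ssh client can linger after its connection dies and get in the way. pkill -f matches against the full command line of every process, so list what matches first, then clean up:

pgrep -af "ssh.*prod"
pkill -f "ssh.*prod"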

Bottom Line

Add this to ~/.ssh/config and never deal with timeout drops again:

Host *
    ServerAliveInterval 60
    ServerAliveCountMax 10

That’s a three-line config change that saves hours of frustration.

Your 2 AM self—the one who just lost an SSH session mid-critical-task—will be grateful.
