Using AI to Find Security Bugs in Your Code

By SumGuy · 6 min read

The AI Security Audit That Changed Things

In 2026, researchers used Claude Code to systematically audit open source C codebases — not looking for anything in particular, just running repeatable prompts against suspicious-looking functions. They found a 23-year-old vulnerability in a widely-used Linux utility. Twenty-three years. Thousands of security researchers, static analysis tools, and code reviews had missed it. An AI model — trained on billions of lines of code, CVE disclosures, and security research papers — spotted the pattern in minutes.

Here’s the thing: that’s not magic. That’s pattern matching at scale. And that’s exactly what your code needs.

Why AI Code Auditing Actually Works

LLMs like Claude are trained on enormous datasets of real vulnerabilities and how they’re fixed. They’ve ingested CVE databases, security research papers, GitHub issues, Stack Overflow conversations where people ask “why is this code vulnerable?” They’ve seen SQL injection 50,000 times. They recognize command injection patterns. They know which functions are footguns and which aren’t.

Static analysis tools are rule-based. They catch what you tell them to look for. AI models catch patterns they’ve learned from examples. A linter flags eval() use in Python — reasonable. But an LLM understands why eval(user_input) is catastrophic, and can reason about whether that input is actually validated somewhere upstream.

Static analysis also generates noise — lots of false positives that make developers tune out warnings. AI auditing is conversational. You ask specific questions. You get reasoned answers with examples and explanations. It’s not a red/yellow/green dashboard; it’s a code review from someone who’s read every security paper ever written.
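To make the eval() point concrete, here's a minimal sketch (the payload string is hypothetical) of why context matters: a linter sees one risky call, but the difference between eval() and a literal-only parser is what happens when the input is attacker-controlled.

```python
import ast

# Hypothetical attacker-controlled string: a linter flags the eval() call,
# but only reasoning about where the input comes from tells you it's fatal.
user_input = "__import__('os').system('echo pwned')"

# eval(user_input) would execute the payload as Python and run the shell
# command, so it stays commented out here.

# ast.literal_eval() parses only Python literals (numbers, strings, tuples,
# lists, dicts), so the same payload is rejected instead of executed.
try:
    ast.literal_eval(user_input)
except (ValueError, SyntaxError):
    print("rejected")  # → rejected
```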

Approach 1: Audit a Suspicious Function

The simplest technique: paste a function and ask directly.

Security Audit Prompt Template
Review this function for security vulnerabilities. Look specifically for:
- Buffer overflows or memory safety issues
- SQL injection or command injection
- Unvalidated or unsanitized user input
- Missing access control checks
- Path traversal vulnerabilities
- Hardcoded secrets or credentials
- Race conditions (if applicable)
- Use of dangerous functions (eval, exec, system, etc.)
For each issue found, explain:
1. What the vulnerability is
2. An exploit scenario
3. A specific fix
Function:
[paste code here]

This works. Seriously. Most developers don’t think systematically about security. They write code. They test it. They ship it. An AI takes 10 seconds to think through 15 different attack vectors.
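If you'd rather run the template programmatically than paste it into a chat window, a stdlib-only sketch might look like this. The endpoint and headers follow the public Anthropic Messages API; the model name and the abridged checklist are illustrative, and ANTHROPIC_API_KEY must be set in the environment to actually send a request.

```python
import json
import os
import urllib.request

# Abridged version of the audit template above (illustrative, not exhaustive).
AUDIT_PROMPT = """Review this function for security vulnerabilities. Look specifically for:
- SQL injection or command injection
- Unvalidated or unsanitized user input
- Use of dangerous functions (eval, exec, system, etc.)
For each issue found, explain the vulnerability, an exploit scenario, and a specific fix.

Function:
{code}"""

def build_audit_prompt(code: str) -> str:
    """Fill the template with the function under review."""
    return AUDIT_PROMPT.format(code=code)

def audit_function(code: str, model: str = "claude-opus-4-1-20250805") -> str:
    """POST the audit prompt to the Anthropic Messages API, return the reply text."""
    req = urllib.request.Request(
        "https://api.anthropic.com/v1/messages",
        data=json.dumps({
            "model": model,
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": build_audit_prompt(code)}],
        }).encode(),
        headers={
            "x-api-key": os.environ["ANTHROPIC_API_KEY"],
            "anthropic-version": "2023-06-01",
            "content-type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["content"][0]["text"]
```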

Approach 2: Systematic Codebase Audit

For a whole repository or microservice, create an audit session. Start with a broad security review, then drill into specific functions.

Codebase Security Audit Prompt
I'm auditing a Python/Node/Go/[language] project for security vulnerabilities.
Repository overview:
- Purpose: [what does it do?]
- Takes user input? [yes/no - if yes, from where?]
- Manages sensitive data? [what kind?]
- Network-facing? [yes/no]
- Runs as root? [yes/no]
First pass: scan the codebase conceptually. What are the highest-risk attack surfaces?
1. User input handling (forms, APIs, CLI args)
2. Database queries
3. File system access
4. External process execution
5. Authentication and authorization
Identify the 5 riskiest functions or code sections.
Then let's audit them one by one.

Then, for each function the AI flags:

Review this function in detail:
[paste function]
Is the user input validated? Where?
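To feed that drill-down with candidates, a crude pre-ranking heuristic (my own sketch, not from any particular scanner) can grep for risky call sites and surface the hottest files before the audit session starts:

```python
import pathlib
import re

# Call sites worth a closer look: dangerous execution primitives.
RISKY = re.compile(r"\b(eval|exec|system|popen|subprocess|execute)\s*\(")

def rank_files(root: str = ".", limit: int = 5):
    """Return up to `limit` Python files sorted by count of risky call sites."""
    scores = {}
    for path in pathlib.Path(root).rglob("*.py"):
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue
        hits = len(RISKY.findall(text))
        if hits:
            scores[str(path)] = hits
    return sorted(scores.items(), key=lambda kv: -kv[1])[:limit]
```

The regex is deliberately noisy; the model, not the regex, does the actual reasoning once you paste the top hits into the audit prompt.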

Approach 3: Claude Code for Interactive Audits

If you’re using Claude Code (or Cursor), you can run an entire audit session on your codebase:

  1. Load the repo — point Claude Code at your project directory
  2. Ask targeted questions — “Show me everywhere we execute shell commands. Is the input sanitized?”
  3. Get file-specific audits — Claude Code reads your actual codebase instead of guessing from pasted snippets
  4. Interactive fixes — “Write a patch that fixes the SQL injection in auth.py”

This is faster than copy-pasting snippets one at a time, and Claude Code keeps the whole codebase in context.

What AI Auditing Actually Catches (Really Well)

AI models are strongest on pattern-shaped bugs: injection flaws (SQL, command, path traversal), dangerous function use (eval, exec, system), hardcoded secrets, and unvalidated input flowing into queries or shell commands. These are exactly the categories in the audit template above, and they map onto the perennial OWASP Top 10 entries.

What AI Misses (Use Tools Together)

AI is good, but it's not a silver bullet: it can't execute your code, so runtime-only issues slip past it; it reasons over the code you show it, so cross-file logic flaws and business-logic authorization gaps are easy to miss; and it occasionally hallucinates issues that aren't there.

Pair AI auditing with: static analysis (Bandit, Semgrep), dependency and secret scanners, fuzzing for anything that parses untrusted input, and a human review for high-stakes changes.

Pre-Commit Security Audits with LLMs

Want AI reviewing security-sensitive code automatically?

.git/hooks/pre-commit
#!/bin/bash
# Collect staged source files by extension
DANGEROUS=$(git diff --cached --name-only | grep -E "\.(py|js|go|rs)$")
for file in $DANGEROUS; do
  DIFF=$(git diff --cached "$file")
  # Build the request body with jq so quotes and newlines in the diff
  # are escaped into valid JSON (raw interpolation would break the payload)
  BODY=$(jq -n --arg diff "$DIFF" '{
    model: "claude-opus-4-1-20250805",
    max_tokens: 1024,
    messages: [{
      role: "user",
      content: ("Review this code change for security issues. If critical issues found, respond with REJECT. Otherwise, approve.\n\n" + $diff)
    }]
  }')
  # Send to the Anthropic Messages API; block the commit on REJECT
  curl -s -X POST https://api.anthropic.com/v1/messages \
    -H "x-api-key: $ANTHROPIC_API_KEY" \
    -H "anthropic-version: 2023-06-01" \
    -H "content-type: application/json" \
    -d "$BODY" | grep -q "REJECT" && exit 1
done
exit 0

(Obviously, don’t block on LLM latency in real life — async this or use it as a warning, not a blocker.)

Real Example: Flask App with SQL Injection

Here’s a vulnerable Flask route:

app.py
from flask import Flask
import sqlite3

app = Flask(__name__)

@app.route('/user/<username>')
def get_user(username):
    conn = sqlite3.connect('users.db')
    cursor = conn.cursor()
    # This is vulnerable!
    query = f"SELECT email, role FROM users WHERE username = '{username}'"
    cursor.execute(query)
    user = cursor.fetchone()
    return {"email": user[0], "role": user[1]}

Ask Claude Code:

Review this Flask endpoint for SQL injection. What's the attack?

It immediately flags: username parameter is unsanitized. Attacker passes admin' OR '1'='1 and dumps the whole table. Fix: use parameterized queries.

app.py (fixed)
    query = "SELECT email, role FROM users WHERE username = ?"
    cursor.execute(query, (username,))

Done. AI found it. You fixed it. Before it shipped.
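Both behaviors are easy to verify end to end. Here's a self-contained sketch using an in-memory SQLite database (the table and rows are invented for the demo):

```python
import sqlite3

# Throwaway in-memory database with two made-up users.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT, email TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    ("alice", "alice@example.com", "admin"),
    ("bob", "bob@example.com", "user"),
])

payload = "nobody' OR '1'='1"

# Vulnerable: string formatting lets the payload rewrite the WHERE clause,
# which now matches every row.
vulnerable = conn.execute(
    f"SELECT email, role FROM users WHERE username = '{payload}'").fetchall()
print(len(vulnerable))  # → 2

# Fixed: a parameterized query treats the payload as an ordinary string,
# and no user is literally named "nobody' OR '1'='1".
fixed = conn.execute(
    "SELECT email, role FROM users WHERE username = ?", (payload,)).fetchall()
print(len(fixed))  # → 0
```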

The Real Outcome

You’re not replacing security researchers or static analysis. You’re adding a high-powered code reviewer who never sleeps, doesn’t get distracted, and has read every security paper ever written. Run an audit session before you deploy. Ask “what am I missing?” Ask it again on a refactor. Ask it before you merge that third-party dependency.

The Linux maintainers weren’t careless. They just didn’t run this particular audit. You should.

