Using an AI Agent for Code Review (Without Letting It Merge Code)
Code review is where good systems stay good and where most teams run out of time.
This post walks through a real, production-safe setup where an AI agent performs a first-pass code review automatically whenever my review is requested on a pull request, inside a GitHub pipeline.
No auto-merge. No authority. Just structured feedback, grounded in the actual diff.
The Problem: Reviews Do Not Scale, Risk Does
In theory:
- Every PR gets careful review
- Standards are applied consistently
- Subtle issues are caught early
In reality:
- Reviews happen late
- Fatigue lowers quality
- “Looks good to me” sneaks in
The goal here is not to replace reviewers. It is to:
- Reduce cognitive load
- Catch obvious issues early
- Let humans focus on judgment
Design Principle: AI as Reviewer, Not Judge
Hard rules:
- AI cannot approve PRs
- AI cannot push commits
- AI cannot comment inline without context
- AI output is advisory only
Think of it as a tireless junior reviewer who reads the whole diff, applies consistent rules, and never rushes.
Trigger: When My Review Is Requested on a PR
GitHub gives us a clean signal:
A pull request review is requested from a user.
This avoids running on every PR and reviewing code you will never touch. The workflow triggers only when both conditions hold:
- the event action is review_requested
- the requested reviewer is me
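To make the two conditions concrete, here is a minimal sketch of the same filter in Python, applied to the webhook payload of a pull_request event. The function name should_review is my own for illustration. The action and requested_reviewer.login fields are the same ones the workflow condition reads.

```python
def should_review(event: dict, me: str) -> bool:
    """Return True only when a review was requested from `me`."""
    if event.get("action") != "review_requested":
        return False
    # Team review requests carry `requested_team` instead of
    # `requested_reviewer`, so the reviewer may be absent here.
    reviewer = event.get("requested_reviewer") or {}
    return reviewer.get("login") == me


if __name__ == "__main__":
    sample = {
        "action": "review_requested",
        "requested_reviewer": {"login": "your-github-username"},
    }
    print(should_review(sample, "your-github-username"))
```

Requests addressed to a team fall through to False in this sketch, which matches the intent of reviewing only what lands on my own plate.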
High-Level Architecture
GitHub PR Event
│
▼
GitHub Actions
│
▼
AI Review Agent
│
▼
Structured Review Comment
│
▼
Human Review
Key idea: The agent reviews diffs, not repositories.
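One way to honor "diffs, not repositories" is to reduce the diff to the lines each file gains before any rule runs. The sketch below assumes unified diff text as produced by git diff. The helper name added_lines_by_file is hypothetical and not part of the agent shown later.

```python
def added_lines_by_file(diff_text):
    """Map each changed file path to the lines the diff adds to it."""
    added = {}
    current = None   # path of the file whose hunks are being read
    in_hunk = False  # True once an "@@" hunk header was seen for this file
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            # A new file section starts; headers follow until the first hunk.
            current, in_hunk = None, False
        elif not in_hunk and line.startswith("+++ "):
            target = line[4:].strip()
            if target != "/dev/null":  # /dev/null means the file was deleted
                current = target[2:] if target.startswith("b/") else target
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and current is not None and line.startswith("+"):
            added.setdefault(current, []).append(line[1:])
    return added


SAMPLE = """\
diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
-print("old")
+print("debug")
+# TODO: remove
"""

if __name__ == "__main__":
    print(added_lines_by_file(SAMPLE))
```

Removed and context lines are ignored on purpose: the agent should comment on what the PR introduces, not on code that is leaving or was already there.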
GitHub Actions Workflow
Here is a minimal but real workflow.
name: AI Code Review

on:
  pull_request:
    types: [review_requested]

jobs:
  ai-review:
    # Run only when the review was requested from this user.
    if: github.event.requested_reviewer.login == 'your-github-username'
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0  # full history, so the merge base with the base branch is available

      - name: Get PR diff
        env:
          # Pass the branch name through the environment instead of
          # interpolating it into the script.
          BASE_REF: ${{ github.event.pull_request.base.ref }}
        run: |
          git fetch origin "$BASE_REF"
          git diff "origin/$BASE_REF...HEAD" > diff.txt

      - name: Run AI review agent
        run: |
          docker run --rm -v "$PWD:/workspace" ai-review-agent:latest /workspace/diff.txt
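Running the agent in a throwaway container keeps it isolated and stateless. For reproducible runs, pin the image to a version tag or digest rather than latest.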
The AI Review Agent (Real Code)
The agent reads the diff, applies rules, and produces a structured report.
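import json
import sys
from pathlib import Path

diff = Path(sys.argv[1]).read_text()

# Only inspect lines the PR adds ("+" prefix), skipping the "+++" file headers,
# so removed or unchanged code does not trigger findings.
added = [
    line[1:] for line in diff.splitlines()
    if line.startswith("+") and not line.startswith("+++")
]

issues = []

if any("print(" in line for line in added):
    issues.append({
        "severity": "low",
        "message": "Debug print statements found. Consider removing before merge.",
    })

if any("TODO" in line for line in added):
    issues.append({
        "severity": "medium",
        "message": "TODO comments present in diff. Confirm if intentional.",
    })

report = {
    "summary": f"{len(issues)} potential issues found",
    "issues": issues,
}

# Emit JSON rather than a Python dict repr so the pipeline can parse the report.
print(json.dumps(report, indent=2))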
In real usage, the agent also checks test coverage changes, migration safety, config drift, and anti-patterns specific to your stack.
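Extra checks like these are easier to manage when each one is a small, independent rule. The following is a sketch of that shape, not my actual rule set. It assumes the diff has already been reduced to a mapping from file path to added lines. The destructive_migration rule, the migrations/ path convention, and the "high" severity level are all hypothetical.

```python
from typing import Callable, Dict, List

Issue = Dict[str, str]
Rule = Callable[[str, List[str]], List[Issue]]


def destructive_migration(path: str, added: List[str]) -> List[Issue]:
    """Hypothetical rule: flag DROP statements added under migrations/."""
    if "migrations/" not in path:
        return []
    if any("DROP " in line.upper() for line in added):
        return [{
            "severity": "high",
            "message": f"{path}: destructive migration statement added.",
        }]
    return []


def leftover_todo(path: str, added: List[str]) -> List[Issue]:
    """Flag TODO comments introduced by the change."""
    if any("TODO" in line for line in added):
        return [{
            "severity": "medium",
            "message": f"{path}: TODO comment added. Confirm if intentional.",
        }]
    return []


# Adding a check means appending one function here.
RULES: List[Rule] = [destructive_migration, leftover_todo]


def run_rules(changes: Dict[str, List[str]]) -> List[Issue]:
    """Apply every rule to every changed file and collect the findings."""
    issues: List[Issue] = []
    for path, added in changes.items():
        for rule in RULES:
            issues.extend(rule(path, added))
    return issues


if __name__ == "__main__":
    changes = {
        "db/migrations/0002.sql": ["DROP TABLE users;"],
        "app.py": ["x = 1  # TODO tidy"],
    }
    for issue in run_rules(changes):
        print(issue["severity"], "-", issue["message"])
```

Because each rule sees only a path and its added lines, rules stay stateless and can be tested one at a time.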
Posting the Review Back to GitHub
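The agent does not comment directly. Instead, the pipeline renders the report to review_output.md and posts it as a single, consolidated comment (gh reads its token from the GH_TOKEN environment variable).
gh pr comment "$PR_NUMBER" --body-file review_output.md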
Why one comment?
- Less noise
- Easier to reason about
- Clear ownership
Example Output (What I Actually See)
AI Review Summary
2 potential issues found
Medium
- TODO comments present in diff. Confirm if intentional.
Low
- Debug print statements found. Consider removing before merge.
No blocking issues detected.
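A comment like the one above could be rendered from the agent's report with a few lines of Python. This is a sketch under assumptions: the markdown heading levels are my choice, and treating only a "high" severity as blocking is a convention introduced here, since the report format itself defines no blocking level.

```python
SEVERITY_ORDER = ["high", "medium", "low"]  # most severe first


def render_markdown(report: dict) -> str:
    """Turn a report ({"summary": ..., "issues": [...]}) into comment text."""
    lines = ["## AI Review Summary", "", report["summary"], ""]
    for severity in SEVERITY_ORDER:
        messages = [
            issue["message"]
            for issue in report["issues"]
            if issue["severity"] == severity
        ]
        if not messages:
            continue
        lines.append(f"### {severity.capitalize()}")
        lines.extend(f"- {message}" for message in messages)
        lines.append("")
    # Assumption: only "high" findings count as blocking.
    blocking = any(issue["severity"] == "high" for issue in report["issues"])
    lines.append(
        "Blocking issues detected." if blocking else "No blocking issues detected."
    )
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    report = {
        "summary": "2 potential issues found",
        "issues": [
            {"severity": "low",
             "message": "Debug print statements found. Consider removing before merge."},
            {"severity": "medium",
             "message": "TODO comments present in diff. Confirm if intentional."},
        ],
    }
    print(render_markdown(report), end="")
```

Writing the returned string to review_output.md gives the pipeline a single file to post.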
Why This Works in Practice
It works because the agent is consistent, the scope is limited, the output is reviewable, and humans stay accountable. Bad ideas do not slip through quietly.
Guardrails I Will Not Compromise On
- No auto-approvals
- No inline commenting spam
- No learning from private repos
- No credentials for the agent beyond read-only
If an agent can merge code, it is not a reviewer: it is a risk.
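Guardrails hold up better when something checks them. As a small sketch, assume the agent's job declares a GitHub Actions permissions setting, either as a mapping of scope to read, write, or none, or as the read-all / write-all shorthand. The function below (write_scopes is a name I made up) lists every scope that grants more than read access, so a CI check can fail if the list is not empty.

```python
def write_scopes(permissions):
    """Return the permission scopes that grant more than read access."""
    if isinstance(permissions, str):
        # Shorthand forms: "read-all" or "write-all".
        return ["*"] if permissions == "write-all" else []
    return sorted(
        scope for scope, level in permissions.items() if level == "write"
    )


if __name__ == "__main__":
    # Hypothetical permissions for the job that runs the agent.
    agent_job = {"contents": "read", "pull-requests": "read"}
    violations = write_scopes(agent_job)
    print("ok" if not violations else f"write access granted: {violations}")
```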
Final Thought
AI does not make reviews faster. It makes them calmer. And calm reviewers make better decisions.
