Using an AI Agent for Code Review (Without Letting It Merge Code)

Code review is where good systems stay good and where most teams run out of time.

This post walks through a real, production-safe setup where an AI agent performs a first-pass code review automatically whenever my review is requested on a pull request, inside a GitHub Actions pipeline.

No auto-merge. No authority. Just structured feedback, grounded in the actual diff.

The Problem: Reviews Do Not Scale, Risk Does

In theory:

  • Every PR gets careful review
  • Standards are applied consistently
  • Subtle issues are caught early

In reality:

  • Reviews happen late
  • Fatigue lowers quality
  • “Looks good to me” sneaks in

The goal here is not to replace reviewers. It is to:

  • Reduce cognitive load
  • Catch obvious issues early
  • Let humans focus on judgment

Design Principle: AI as Reviewer, Not Judge

Hard rules:

  • AI cannot approve PRs
  • AI cannot push commits
  • AI cannot comment inline without context
  • AI output is advisory only

Think of it as a tireless junior reviewer who reads the whole diff, applies consistent rules, and never rushes.

Trigger: When My Review Is Requested on a PR

GitHub gives us a clean signal:

A pull request review is requested from a user.

Keying on this event avoids running on every PR and reviewing code you will never touch. We trigger only when:

  • review_requested
  • reviewer == me
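
The same two conditions can also be checked in code, for example as a second guard inside the agent. This is a minimal sketch, assuming the webhook payload has already been loaded into a dict (GitHub Actions exposes it as a JSON file at the path in GITHUB_EVENT_PATH). The function name should_review and the username are placeholders, not part of the original setup.

```python
# Placeholder: the reviewer whose requests should trigger the agent.
MY_LOGIN = "your-github-username"


def should_review(event: dict, me: str = MY_LOGIN) -> bool:
    """Return True only for a review request addressed to `me`."""
    if event.get("action") != "review_requested":
        return False
    # Team review requests carry `requested_team` instead of
    # `requested_reviewer`, so the key may be missing.
    reviewer = event.get("requested_reviewer") or {}
    return reviewer.get("login") == me


event = {
    "action": "review_requested",
    "requested_reviewer": {"login": "your-github-username"},
}
print(should_review(event))  # True
```

Requests addressed to a team fall through to False here, which matches the intent of reviewing only what lands on my own plate.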

High-Level Architecture

GitHub PR Event
      │
      ▼
GitHub Actions
      │
      ▼
AI Review Agent
      │
      ▼
Structured Review Comment
      │
      ▼
Human Review

Key idea: The agent reviews diffs, not repositories.
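
To make "diffs, not repositories" concrete, here is a rough sketch of how an agent might reduce a unified diff to just the lines a PR adds, grouped by file. The helper name added_lines_by_file is hypothetical, and the parsing assumes standard git diff output with unquoted paths (Python 3.9+).

```python
from collections import defaultdict


def added_lines_by_file(diff_text: str) -> dict[str, list[str]]:
    """Group the lines a diff adds under the path of the file they belong to."""
    added: dict[str, list[str]] = defaultdict(list)
    current = None    # path of the file whose hunks we are reading
    in_hunk = False   # True once the first "@@" header of a file is seen
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            current, in_hunk = None, False
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:].strip()
            # "+++ /dev/null" marks a deleted file; nothing is added there.
            current = None if path == "/dev/null" else path.removeprefix("b/")
        elif line.startswith("@@"):
            in_hunk = True
        elif in_hunk and current and line.startswith("+"):
            added[current].append(line[1:])
    return dict(added)


sample = """diff --git a/app.py b/app.py
--- a/app.py
+++ b/app.py
@@ -1,2 +1,3 @@
 import os
-x = 1
+x = 2
+print(x)
"""
print(added_lines_by_file(sample))  # {'app.py': ['x = 2', 'print(x)']}
```

Everything the agent reasons about comes from this structure, so it never needs to walk the rest of the repository.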

GitHub Actions Workflow

Here is a minimal but real workflow.

name: AI Code Review

on:
  pull_request:
    types: [review_requested]

jobs:
  ai-review:
    if: github.event.requested_reviewer.login == 'your-github-username'
    runs-on: ubuntu-latest

    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get PR diff
        env:
          # Pass the branch name through env instead of interpolating it into the script
          BASE_REF: ${{ github.event.pull_request.base.ref }}
        run: |
          git fetch origin "$BASE_REF"
          git diff "origin/$BASE_REF...HEAD" > diff.txt

      - name: Run AI review agent
        run: |
          docker run --rm -v "$PWD:/workspace" ai-review-agent:latest /workspace/diff.txt

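Running the agent in a throwaway container keeps it isolated and stateless. For reproducible runs, pin the image to a specific version tag or digest rather than latest.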

The AI Review Agent (Real Code)

The agent reads the diff, applies rules, and produces a structured report.

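import json
import sys
from pathlib import Path

diff = Path(sys.argv[1]).read_text()

# Only inspect lines the PR adds ("+" prefix), skipping "+++ " file headers,
# so removed or unchanged context lines do not trigger findings.
added = [
    line[1:]
    for line in diff.splitlines()
    if line.startswith("+") and not line.startswith("+++ ")
]

issues = []

if any("print(" in line for line in added):
    issues.append({
        "severity": "low",
        "message": "Debug print statements found. Consider removing before merge."
    })

if any("TODO" in line for line in added):
    issues.append({
        "severity": "medium",
        "message": "TODO comments present in diff. Confirm if intentional."
    })

report = {
    "summary": f"{len(issues)} potential issues found",
    "issues": issues
}

# Emit JSON rather than a Python dict repr so the pipeline can parse it
print(json.dumps(report, indent=2))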

In real usage, the agent also checks test coverage changes, migration safety, config drift, and anti-patterns specific to your stack.
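
Those extra checks are not shown in this post. As a rough illustration of how they could plug in, here is a sketch of a small rule registry. Everything in it is an assumption: the rule names, the path conventions ("test" in the path, a migrations/ directory), the "high" severity level, and the input shape (a dict mapping file paths to the lines added in each).

```python
from typing import Callable, Optional

# A rule takes {file path: [added lines]} and returns an issue dict or None.
Rule = Callable[[dict[str, list[str]]], Optional[dict]]


def missing_tests(changes: dict[str, list[str]]) -> Optional[dict]:
    """Flag PRs that touch Python source files but no test files."""
    paths = list(changes)
    touches_source = any(p.endswith(".py") and "test" not in p for p in paths)
    touches_tests = any("test" in p for p in paths)
    if touches_source and not touches_tests:
        return {
            "severity": "medium",
            "message": "Source changed without accompanying test changes.",
        }
    return None


def destructive_migration(changes: dict[str, list[str]]) -> Optional[dict]:
    """Flag added migration lines that contain a DROP statement."""
    for path, lines in changes.items():
        if "migrations/" in path and any("DROP " in l.upper() for l in lines):
            return {
                "severity": "high",  # assumed level, not used elsewhere in this post
                "message": f"Destructive statement in {path}. Confirm rollback plan.",
            }
    return None


RULES: list[Rule] = [missing_tests, destructive_migration]


def run_rules(changes: dict[str, list[str]], rules: list[Rule] = RULES) -> list[dict]:
    """Apply every rule and collect the issues that fire."""
    return [issue for rule in rules if (issue := rule(changes)) is not None]
```

Keeping each check as a small function that returns one issue or nothing makes it easy to add stack-specific rules without touching the reporting code.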

Posting the Review Back to GitHub

The agent does not comment directly. Instead, the pipeline posts a single, consolidated comment.

gh pr comment "$PR_NUMBER" --body-file review_output.md

Why one comment?

  • Less noise
  • Easier to reason about
  • Clear ownership
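
How the report becomes that single comment is not shown here. One plausible sketch is a small renderer that groups issues by severity and produces the comment body. The function name render_report, the "high" level, and the rule that only "high" counts as blocking are all assumptions for illustration.

```python
# Most severe first; "high" is an assumed extra level treated as blocking.
SEVERITY_ORDER = ["high", "medium", "low"]


def render_report(report: dict) -> str:
    """Turn the agent's report dict into the text of one review comment."""
    lines = ["AI Review Summary", "", report["summary"], ""]
    for severity in SEVERITY_ORDER:
        messages = [
            issue["message"]
            for issue in report["issues"]
            if issue["severity"] == severity
        ]
        if not messages:
            continue
        lines += [severity.capitalize(), ""]
        lines += [f"- {message}" for message in messages]
        lines.append("")
    if not any(issue["severity"] == "high" for issue in report["issues"]):
        lines.append("No blocking issues detected.")
    return "\n".join(lines).strip() + "\n"
```

The pipeline could write the returned string to review_output.md and post it in one step, so there is only ever one comment from the agent per run.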

Example Output (What I Actually See)

AI Review Summary

2 potential issues found

Medium

  • TODO comments present in diff. Confirm if intentional.

Low

  • Debug print statements found. Consider removing before merge.

No blocking issues detected.

Why This Works in Practice

This works because the agent is consistent, the scope is limited, the output is reviewable, and humans stay accountable. Bad ideas do not slip through quietly.

Guardrails I Will Not Compromise On

  • No auto-approvals
  • No inline commenting spam
  • No learning from private repos
  • No credentials beyond read-only for the agent (only the pipeline step that posts the comment can write)

If an agent can merge code, it is not a reviewer: it is a risk.

Final Thought

AI does not make reviews faster. It makes them calmer. And calm reviewers make better decisions.