Open-Source Agentic PR Reviewer Built on AgentField
Output • How It Works • Comparison • Quick Start • Architecture
Other tools run a single LLM pass over the diff with a fixed checklist. PR-AF builds a custom review strategy for every PR: it examines the change, reasons about what could go wrong, spawns parallel reviewer agents with runtime-crafted prompts, challenges its own findings adversarially, and posts specific inline comments. Free, open source, one API call. A deep review of a 500-line PR costs about $0.80 in LLM calls.
```bash
curl -X POST http://localhost:8080/api/v1/execute/async/pr-af.review \
  -H "Content-Type: application/json" \
  -d '{"input": {"pr_url": "https://github.com/owner/repo/pull/123"}}'
```

Posts inline GitHub review comments with evidence-grounded findings:
Custom review strategy per PR. Evidence-grounded. Near-zero false positives. ~$0.80 for a 500-line PR.
PR-AF does not execute a static script. It restructures its own execution graph around the topology of the incoming Pull Request.

When a PR arrives, the system dynamically compiles review dimensions, evaluating the diff through semantic, mechanical, and systemic lenses. It then uses these dimensions to spawn specialized, ephemeral reviewer agents tailored to the exact context of the current PR.
Full architecture deep-dive:
docs/ARCHITECTURE.md
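Conceptually, the dimension-compilation step can be sketched as follows. This is a minimal illustration; the generator functions and dimension shapes are hypothetical, not PR-AF's actual internals:

```python
def compile_dimensions(diff_summary, generators):
    """Run each lens generator (semantic, mechanical, systemic) over a PR
    summary, then dedupe overlapping dimensions by name. Each surviving
    dimension seeds one ephemeral reviewer agent."""
    seen = {}
    for generate in generators:
        for dim in generate(diff_summary):
            # First generator to propose a dimension name wins; later
            # duplicates are merged away.
            seen.setdefault(dim["name"].lower(), dim)
    return list(seen.values())

# Hypothetical lens generators for a PR touching an API handler:
semantic = lambda s: [{"name": "State Mutation", "prompt": f"Audit state writes in: {s}"}]
mechanical = lambda s: [{"name": "API Boundaries", "prompt": f"Check input validation in: {s}"}]
systemic = lambda s: [{"name": "State Mutation", "prompt": f"Trace cross-module state in: {s}"}]

dims = compile_dimensions("users.py handler change", [semantic, mechanical, systemic])
# Two unique dimensions survive deduplication: State Mutation, API Boundaries
```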
Pipeline flow (Mermaid)
```mermaid
graph TD
    classDef intake fill:#f3f4f6,stroke:#4b5563,stroke-width:2px;
    classDef dynamic fill:#dbeafe,stroke:#3b82f6,stroke-width:2px;
    classDef verify fill:#fef3c7,stroke:#d97706,stroke-width:2px;
    classDef synthesize fill:#ede9fe,stroke:#8b5cf6,stroke-width:2px;
    classDef output fill:#ecfdf5,stroke:#059669,stroke-width:2px;

    PR[Incoming Pull Request] --> I1[Intake Triage]:::intake
    I1 --> A1[Topological Anatomy Mapping]:::intake
    A1 --> M1[Semantic Lens Generator]:::dynamic
    A1 --> M2[Mechanical Lens Generator]:::dynamic
    A1 --> M3[Systemic Lens Generator]:::dynamic
    M1 --> D[Dimension Deduplication & Compilation]:::dynamic
    M2 --> D
    M3 --> D
    D -->|Dynamically spawns N dimensions| R1(Thread 1: State Mutation)
    D --> R2(Thread 2: API Boundaries)
    D --> R3(Thread N: Dynamic Context...)
    R1 --> E[Programmatic AST Extraction Engine]:::verify
    R2 --> E
    R3 --> E
    E -->|Ground truth caller snippets| V[Evidence Verification Layer]:::verify
    V -->|Unsubstantiated claims pruned| F[Falsifiability Gate]:::verify
    F --> C1(Compound Cluster: File Topology)
    F --> C2(Compound Cluster: Shared Imports)
    F --> C3(Compound Cluster: Tag Overlap)
    C1 --> S[Compound Vulnerability Synthesis]:::synthesize
    C2 --> S
    C3 --> S
    S --> L{Coverage Depth Gate}
    L -->|Blind spots detected| I1
    L -->|Full coverage achieved| O[Synthesized GitHub Annotations]:::output
```
PR-AF uses this multi-phase cognitive pipeline to ensure rigorous, high-fidelity reviews:
Language models operate on probability, which makes them prone to assumption-based false positives. If the system flags a missing validation check, PR-AF does not immediately accept the claim. Instead, it uses programmatic AST (Abstract Syntax Tree) extraction to pull the exact caller snippets and import contexts from the broader repository. This raw data is then evaluated through an isolated verification layer: if the initial claim cannot be irrefutably grounded in the extracted code, it is silently pruned.
Standard tools analyze code linearly. PR-AF looks at the entire board to identify cross-correlated risks. It clusters isolated, seemingly minor anomalies across different files and evaluates them concurrently to detect whether they coalesce into a larger systemic exploit. For example, an unprotected API key in one module and a database merge vulnerability in another are synthesized into a single, high-severity "Coordinated Injection" finding.
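The clustering step can be illustrated with a toy sketch; the finding shape and correlation keys here are hypothetical stand-ins for the pipeline's file-topology, shared-import, and tag-overlap clusters:

```python
from collections import defaultdict

def compound_clusters(findings, keys_of):
    """Bucket findings by shared correlation keys; any bucket with two or
    more members becomes a candidate for compound-vulnerability synthesis."""
    buckets = defaultdict(list)
    for finding in findings:
        for key in keys_of(finding):
            buckets[key].append(finding)
    return {k: fs for k, fs in buckets.items() if len(fs) > 1}

# Hypothetical findings: the first two share a 'db' import.
findings = [
    {"title": "API key logged in plaintext", "imports": {"db", "logging"}},
    {"title": "Unparameterized merge query", "imports": {"db"}},
    {"title": "Loose CORS policy", "imports": {"http"}},
]
clusters = compound_clusters(findings, lambda f: f["imports"])
# The 'db' key correlates the first two findings into one compound-risk candidate.
```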
Before any finding is compiled into the final GitHub comment, it must pass through a strict falsifiability framework. The system actively attempts to invalidate its own findings—searching for reasons why the reported anomaly might be safe, intended behavior, or securely mitigated elsewhere in the codebase structure. Only findings that survive this aggressive auto-invalidation process are surfaced to the developer.
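A minimal sketch of the gate's control flow, where the counter-checks are illustrative stand-ins for the model's adversarial invalidation passes:

```python
def falsifiability_gate(finding, counter_checks):
    """Try to invalidate a finding before surfacing it. Each counter-check
    returns a reason the anomaly is safe (or None); any non-None reason
    prunes the finding, so only claims that survive every attempted
    invalidation reach the final review."""
    for attempt_to_invalidate in counter_checks:
        if attempt_to_invalidate(finding) is not None:
            return None  # pruned silently, as in the pipeline
    return finding

# Illustrative counter-check: a claim with no grounded evidence cannot survive.
def no_evidence(finding):
    return "no extracted caller snippet backs this claim" if not finding.get("evidence") else None

grounded = {"title": "SQL injection", "evidence": ["users.py:42"]}
speculative = {"title": "possible race condition", "evidence": []}
```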
There are excellent AI code review tools on the market. PR-AF is not designed to replace fast, interactive tools; it is designed for comprehensive CI/CD gating where accuracy and architectural depth matter more than execution speed.
| Feature | PR-AF (AgentField) | Claude Code CLI | Commercial SaaS (e.g. Codex, CodeRabbit) |
|---|---|---|---|
| Best For | Deep CI/CD architectural audits | Fast, iterative inner-loop development | Clean GitHub UX and chat-based reviews |
| Cost | Free / Open Source (BYOK API costs only) | Pay-per-token (BYOK) | ~$20 - $25 / user / month |
| Architecture | Massively parallel cognitive pipeline | Single-thread interactive loop | Context retrieval + LLM review |
| Execution Time | ~35-50 minutes | Seconds to minutes | ~2-5 minutes |
| False Positives | Extremely low (Evidence Grounding) | Moderate (relies on context window) | Low-to-Moderate (heuristic filtering) |
| Compound Risks | Yes (Dedicated Compound Synthesizer) | Unlikely (diff-focused) | Partial (depends on retrieval accuracy) |
We highly recommend using Claude Code for your local development and running PR-AF as your final GitHub Actions gatekeeper.
```bash
git clone https://github.com/Agent-Field/pr-af.git && cd pr-af
cp .env.example .env   # Add OPENROUTER_API_KEY, GH_TOKEN
docker compose up --build
```

Starts the AgentField control plane (http://localhost:8080) and the PR-AF agent.
```bash
curl -X POST http://localhost:8080/api/v1/execute/async/pr-af.review \
  -H "Content-Type: application/json" \
  -d '{"input": {"pr_url": "https://github.com/owner/repo/pull/123"}}'
```

Poll for results:

```bash
curl http://localhost:8080/api/v1/executions/<execution_id>
```

The easiest way to use PR-AF is to drop it into your GitHub Actions. It requires zero configuration and runs securely using GitHub's built-in GITHUB_TOKEN.
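Outside of Actions, the execute-and-poll steps can be scripted. A minimal Python sketch; the `status` values and response shape are assumptions about the AgentField API, so adjust to the real schema:

```python
import json
import time
import urllib.request

def fetch_execution(url):
    """GET the execution record and decode its JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def wait_for_review(execution_id, base="http://localhost:8080",
                    timeout=3600, poll_every=30, fetch=fetch_execution):
    """Poll the executions endpoint until the review reaches a terminal
    state; deep reviews run for tens of minutes, so poll slowly."""
    url = f"{base}/api/v1/executions/{execution_id}"
    deadline = time.time() + timeout
    while time.time() < deadline:
        payload = fetch(url)
        if payload.get("status") in ("completed", "failed"):
            return payload
        time.sleep(poll_every)
    raise TimeoutError(f"review {execution_id} still running after {timeout}s")
```

The injectable `fetch` parameter keeps the loop testable without a running server.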
Add this workflow to your repository at .github/workflows/pr-af-review.yml. It triggers automatically whenever you add the pr-af label to a Pull Request.
```yaml
name: AgentField PR Review

on:
  pull_request:
    types: [labeled]

jobs:
  pr-af-review:
    if: github.event.label.name == 'pr-af'
    runs-on: ubuntu-latest
    # Needs permissions to post comments and read code
    permissions:
      contents: read
      pull-requests: write
    steps:
      - name: Checkout PR-AF
        uses: actions/checkout@v4
        with:
          repository: Agent-Field/pr-af
          path: pr-af
      - name: Start AgentField & PR-AF
        working-directory: ./pr-af
        env:
          OPENROUTER_API_KEY: ${{ secrets.OPENROUTER_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        run: |
          docker compose up -d
          sleep 15  # Wait for services to be healthy
      - name: Execute Deep Architectural Audit
        working-directory: ./pr-af
        env:
          PR_URL: ${{ github.event.pull_request.html_url }}
        run: |
          python3 scripts/ci_runner.py
```

Note: PR-AF runs a comprehensive parallel pipeline. Reviews typically take 35-50 minutes depending on PR complexity.


Example result payload:

```json
{
  "total_findings": 5,
  "by_severity": {"critical": 1, "important": 2, "suggestion": 2},
  "findings": [
    {
      "severity": "critical",
      "title": "SQL injection in user input handling",
      "file": "src/api/users.py",
      "line": 42,
      "body": "Raw query parameter interpolated directly into SQL. Tracer confirms no parameterization between input and cursor.execute().",
      "suggestion": "cursor.execute('SELECT * FROM users WHERE id = %s', (user_id,))",
      "evidence": "AST extraction confirms f-string SQL at users.py:42, no sanitization in call chain",
      "compound_risk": "Combined with missing auth middleware (finding #2), this is exploitable by unauthenticated users"
    }
  ],
  "review_dimensions": 4,
  "cost_usd": 0.83
}
```
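A CI consumer can reduce this payload to a merge-gate decision. A hypothetical sketch against the schema above; the severity levels to block on are a policy choice, not part of PR-AF:

```python
import json

def gate(review_json, block_on=("critical",)):
    """Decide whether to block a merge from PR-AF's result payload.
    Returns (ok, summary): ok is False if any blocking-severity
    findings are present."""
    data = json.loads(review_json)
    severities = data.get("by_severity", {})
    blocking = sum(severities.get(level, 0) for level in block_on)
    summary = (f"{data['total_findings']} findings, "
               f"{blocking} blocking, ${data['cost_usd']:.2f} spent")
    return blocking == 0, summary
```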