Learn how MergeShield's AI analyzes pull requests across 6 risk dimensions and produces a calibrated risk score.
How risk scoring works
Every PR is analyzed by two AI models in sequence — Claude Sonnet for risk scoring, then Claude Haiku for reasoning extraction.
MergeShield uses a two-stage AI pipeline to analyze every pull request.
In Stage 1, Claude evaluates the PR diff, changed files, and commit history using a structured tool_use schema that guarantees well-formed JSON output. The AI scores the PR across six risk dimensions, each on a 0–100 scale, and provides a written explanation for each score.
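The exact schema is internal to MergeShield, but as an illustration, a tool_use input schema covering the six dimensions described below might look like the following sketch (the tool name and field names here are hypothetical):

```python
# Hypothetical tool_use input schema for structured risk scoring.
# MergeShield's real schema is not shown in this doc; this sketches the shape
# that guarantees well-formed JSON: one score + explanation per dimension.
DIMENSIONS = [
    "complexity", "security", "blast_radius",
    "test_coverage", "breaking_changes", "overall",
]

RISK_SCORE_TOOL = {
    "name": "report_risk_scores",  # hypothetical tool name
    "description": "Report risk scores for a pull request.",
    "input_schema": {
        "type": "object",
        "properties": {
            dim: {
                "type": "object",
                "properties": {
                    "score": {"type": "integer", "minimum": 0, "maximum": 100},
                    "explanation": {"type": "string"},
                },
                "required": ["score", "explanation"],
            }
            for dim in DIMENSIONS
        },
        "required": DIMENSIONS,
    },
}
```

Because every dimension is marked required, a tool_use response is rejected by the model API unless all six scores and explanations are present.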
In Stage 2, a separate reasoning extraction pass produces a structured log of the AI's decision-making process. This reasoning log is stored alongside the analysis and displayed in the dashboard so you can understand exactly *why* a particular score was assigned.
The overall risk score is computed using a weighted formula:
overall = (max_dimension × 0.4) + (avg_all_dimensions × 0.6)
This design ensures that a single critical finding (like a security vulnerability) cannot be diluted by low scores in other dimensions, while still reflecting the overall complexity profile of the change.
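A minimal sketch of the formula in Python (assuming, for illustration, that the average runs over the five scored dimensions other than Overall):

```python
def overall_risk(scores: dict[str, int]) -> float:
    """Combine per-dimension scores (0-100) into an overall risk score
    using the weighted formula: 0.4 * max + 0.6 * average."""
    values = list(scores.values())
    return 0.4 * max(values) + 0.6 * (sum(values) / len(values))

# One critical security finding dominates even when other scores are low:
scores = {"complexity": 10, "security": 90, "blast_radius": 20,
          "test_coverage": 15, "breaking_changes": 5}
# overall = 0.4 * 90 + 0.6 * 28 = 36 + 16.8 = 52.8
```

With a plain average the same PR would score 28; the max term pulls it up to 52.8, keeping the critical security finding visible.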
Each pull request is evaluated across six independent dimensions:
Complexity measures how difficult the change is to understand and review. Simple one-line fixes score low, while large refactors touching many interconnected components score high. The AI considers lines changed, cyclomatic complexity, number of files, and logical dependencies.
Security evaluates whether the change introduces potential vulnerabilities.
Blast Radius assesses how widely a change could affect the system if something goes wrong. A change to a utility function used across the entire codebase has a larger blast radius than a change to an isolated component. The AI considers import graphs, shared state, and infrastructure-level changes.
Test Coverage looks at whether the change includes appropriate tests. Adding a new feature without corresponding tests increases risk. The AI checks for test files in the diff, evaluates existing test coverage, and flags untested edge cases.
Breaking Changes evaluates the potential for backward-incompatible modifications — API contract changes, database schema migrations, removed exports, and configuration format changes. The AI is particularly attentive to changes in public interfaces.
Overall provides a holistic assessment that may weigh factors not fully captured by individual dimensions, such as the combination of moderate risks across multiple areas.
Risk scores are calibrated using anchored examples that map score ranges to concrete types of changes:
- 0–10 — Trivial: README updates, typo fixes, config value adjustments
- 10–25 — Low: Simple bug fixes, minor refactors
- 25–50 — Medium: New features, moderate complexity, multi-file changes
- 50–75 — High: Security-sensitive changes, significant API modifications, large-scale refactors
- 75–100 — Critical: Infrastructure changes, authentication rewrites, production database migrations

The AI also receives context about the repository and author to improve calibration accuracy. Specifically, the pipeline passes:
- The repository name (e.g., `myorg/production-api`)
- The author type (human, agent, or bot)
- The agent name, when applicable (e.g., `claude-code`)

This contextual information helps the AI distinguish between, for example, a change to a test repository versus the same change in a production service.
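The contextual fields might be passed as a simple payload alongside the diff; the field names below are illustrative, not MergeShield's actual format:

```python
# Hypothetical context payload sent to the model with the PR diff.
# Field names are illustrative only.
analysis_context = {
    "repository": "myorg/production-api",  # full repository name
    "author_type": "agent",                # one of: human, agent, bot
    "agent_name": "claude-code",           # set when author_type == "agent"
}
```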
Calibration is ongoing — the analysis feedback mechanism on the PR detail page lets you mark scores as too high or too low, helping your team track calibration quality over time.
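The anchored score ranges can be sketched as a small lookup. The published ranges share boundary values (10, 25, 50, 75), so this sketch assumes a boundary score falls into the higher band:

```python
def risk_band(score: float) -> str:
    """Map a 0-100 risk score to its calibration band.
    Boundary values are assigned to the higher band (an assumption;
    the doc's ranges overlap at 10, 25, 50, and 75)."""
    if score < 10:
        return "Trivial"
    if score < 25:
        return "Low"
    if score < 50:
        return "Medium"
    if score < 75:
        return "High"
    return "Critical"
```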
Tip
A change in a test repo will generally score lower than the same change in a production infrastructure repo. The AI uses the repository name and context to calibrate appropriately.
File-Level Risks

Beyond the overall PR score, MergeShield provides file-level risk attribution. Each file in the PR diff receives:

- A risk score (0–100)
- A risk category
- An explanation of the score

File-level risks are displayed in both the GitHub PR comment and the MergeShield dashboard. In the PR comment, files are listed with their individual scores so reviewers can prioritize without leaving GitHub. On the dashboard, the PR detail page shows a sortable list with scores, categories, and explanations.

This feature is especially valuable for large PRs where the overall score might be driven by one or two sensitive files buried among many low-risk changes. Instead of reviewing every file equally, your team can jump straight to the files flagged as high risk.
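The prioritization described above amounts to sorting file entries by score. A minimal sketch, assuming a hypothetical entry shape with path, score, and category fields:

```python
def prioritize_files(file_risks: list[dict]) -> list[dict]:
    """Return file-level risk entries sorted highest-risk first.
    The entry shape here (path/score/category) is illustrative."""
    return sorted(file_risks, key=lambda f: f["score"], reverse=True)

files = [
    {"path": "README.md", "score": 3, "category": "docs"},
    {"path": "src/auth/session.py", "score": 82, "category": "security"},
    {"path": "src/utils/format.py", "score": 21, "category": "refactor"},
]
# prioritize_files(files) puts src/auth/session.py first
```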
After analysis completes, MergeShield automatically posts a comment on the GitHub pull request summarizing the results, including the overall risk score and the per-file scores described above.
The comment is formatted for quick scanning so reviewers can assess risk without leaving GitHub.
If the PR is updated (new commits pushed), MergeShield re-analyzes and posts an updated comment reflecting the new changes. You can also view the full analysis with additional details on the MergeShield dashboard — including the AI reasoning log, auto-merge evaluation, approval status, and policy adjustments that may not be shown in the condensed GitHub comment.
Warning
Comments are posted by the MergeShield GitHub App bot, not your personal account. Ensure your team is aware so they are not surprised by automated comments on pull requests.