Now in early access — first 200 teams free

Catch AI agent quality drift
before users do.

Your AI agent looked great at launch. But quality degrades silently — hallucination rates creep up, tone shifts, accuracy drops. Agent SPC uses statistical process control to detect drift 4–8 days before it becomes a user problem.

Connect your agent. See your quality baseline today.
What is agent drift?

[Control chart: customer-support-v3 · Accuracy Score · last 18 days. UCL 96.2%, CL 91.4%, LCL 86.6%. Drift detected on a run rule violation (7 points below CL), caught before users notice.]
14,000+ quality checks run daily
4.2 days average lead between drift detection and the first user report
78% of quality issues caught before impact
6 metrics monitored per agent

Three steps to a quality baseline

SPC has been used in manufacturing for 100 years to catch process drift before defects reach customers. We bring the same rigour to AI agent outputs.

1. Connect your agent

Add our lightweight SDK (Python or Node), call the REST API, or pipe in your existing evaluation scores. Works with any LLM framework: LangChain, AutoGen, or custom chains.

2. Establish your quality baseline

Agent SPC samples your agent's outputs and computes your control limits — upper, center, and lower — across accuracy, tone, hallucination rate, and more.
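
As a rough illustration of what "computing control limits" means, here is a minimal sketch that derives a center line and 3σ limits from a baseline sample. The function name and the scores are hypothetical, and it uses the plain sample standard deviation as a simplification; textbook X-bar charts usually estimate σ from subgroup ranges, and Agent SPC's actual calculation may differ.

```python
from statistics import mean, stdev


def control_limits(baseline, sigmas=3.0):
    """Return (lcl, center, ucl) for a list of baseline scores.

    Simplification: sigma is the sample standard deviation of the
    baseline, not a subgroup-range estimate.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


# Hypothetical daily accuracy scores (percent) from a baseline period.
baseline_scores = [91.0, 92.5, 90.2, 93.1, 91.8, 89.9, 92.0, 90.7]
lcl, center, ucl = control_limits(baseline_scores)
print(f"LCL={lcl:.1f}  CL={center:.1f}  UCL={ucl:.1f}")
```

New points are then judged against these limits rather than against a fixed threshold, so the alerting adapts to each agent's own normal variation.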

3. Get alerts when rules fire

Receive Slack/email alerts the moment Western Electric or Nelson rules trigger — run violations, trends, or single points beyond 3σ. Before users notice anything.
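
A run violation can be sketched in a few lines. The example below flags the point at which a run of consecutive values on the same side of the center line reaches a chosen length. The function name and data are hypothetical. The default run length of 7 is an assumption taken from the example chart on this page; the Western Electric rules use 8 and the Nelson rules use 9.

```python
def run_violation(points, center, run_length=7):
    """Return the index at which `run_length` consecutive points have
    fallen on the same side of the center line, or None if no run occurs."""
    streak = 0
    side = 0  # +1 above center, -1 below, 0 exactly on the line
    for i, value in enumerate(points):
        current = (value > center) - (value < center)
        if current != 0 and current == side:
            streak += 1
        else:
            # The side changed (or the point sits on the line): restart the run.
            streak = 1 if current != 0 else 0
            side = current
        if streak >= run_length:
            return i
    return None


# Hypothetical accuracy scores drifting below a center line of 91.4.
scores = [92.0, 91.0, 93.0, 91.0, 90.9, 91.2, 90.5, 91.1, 90.8, 90.2]
print(run_violation(scores, center=91.4))
```

Note that every point in the run above is still well inside the 3σ limits, which is why run rules can fire earlier than a simple threshold alert.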

Six quality dimensions — all on control charts

Each metric gets its own X-bar chart, with your agent's historical baseline as the center line and 3σ control limits computed from your own production data.

Hallucination Rate

Track the proportion of responses containing unverifiable or fabricated claims. Detect creep from 0.8% to 2.1% before it becomes a support incident.

FACTUALITY
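
To see why a creep from 0.8% to 2.1% is detectable, here is a minimal sketch using the standard p-chart formula for proportions. This is an illustrative alternative to the X-bar chart described on this page, not a description of Agent SPC internals; the function name and the sample size of 1,000 responses per period are assumptions.

```python
from math import sqrt


def p_chart_limits(p_bar, sample_size, sigmas=3.0):
    """Return (lcl, center, ucl) for a proportion such as hallucination rate.

    Uses the standard p-chart formula p_bar +/- sigmas * sqrt(p_bar * (1 - p_bar) / n),
    clipped to the valid range [0, 1].
    """
    spread = sigmas * sqrt(p_bar * (1 - p_bar) / sample_size)
    return max(0.0, p_bar - spread), p_bar, min(1.0, p_bar + spread)


# Baseline rate of 0.8%, with a hypothetical 1,000 sampled responses per period.
lcl, center, ucl = p_chart_limits(p_bar=0.008, sample_size=1000)
print(f"UCL = {ucl:.2%}")
print("2.1% is beyond the upper limit:", 0.021 > ucl)
```

Under these assumptions a period at 2.1% lands above the upper control limit, while smaller sample sizes widen the limits and make the same shift harder to distinguish from noise.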

Tone Deviation

Measure drift from your target tone profile (professional, empathetic, concise). Catch when your support agent starts sounding curt or overly informal.

STYLE

Task Accuracy

Score responses against your golden test set. Control charts show when accuracy is trending downward, even before a single point leaves the limits.

CORRECTNESS

Response Length Drift

Detect when verbosity creeps up (or down). Overly terse responses often correlate with hallucination spikes; overly verbose ones with context window pressure.

VERBOSITY

Latency Trends

Response time is a quality signal. Latency creep can indicate upstream model load, token bloat, or retrieval index degradation — SPC catches the trend early.

PERFORMANCE

Citation & Grounding Rate

For research and coding agents — track what fraction of claims are grounded in retrieved context. A dropping grounding rate is an early hallucination warning.

GROUNDING

The same method that keeps planes in the air — for your AI agents

Statistical Process Control was invented by Walter Shewhart in 1924 at Western Electric, whose research arm became Bell Labs a year later. For 100 years, it has been a standard way to detect process drift in manufacturing, aviation, and pharmaceuticals.

The insight: you don't need to see a defect to know the process is drifting. Statistical signals such as runs, trends, and points near control limits appear days or weeks before actual failures. The same logic applies to LLM outputs.

Read: SPC for AI — the full explainer
Western Electric Rule 2

2 out of 3 consecutive points in Zone A (more than 2σ from center, on the same side). Signals a shift before any point breaches the 3σ limits.
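
A minimal sketch of this check, assuming the standard formulation in which the two points must lie on the same side of the center line. The function name and sample values are hypothetical; the center and σ are taken from the example chart on this page (CL 91.4%, limits at ±4.8, so σ = 1.6).

```python
def zone_a_violations(points, center, sigma):
    """Return indices i where at least 2 of the 3 points ending at i lie
    more than 2 sigma from the center line, on the same side."""
    hits = []
    for i in range(2, len(points)):
        window = points[i - 2 : i + 1]
        above = sum(1 for v in window if v > center + 2 * sigma)
        below = sum(1 for v in window if v < center - 2 * sigma)
        if above >= 2 or below >= 2:
            hits.append(i)
    return hits


# Hypothetical accuracy scores; the 2-sigma band here is 88.2 to 94.6.
scores = [91.0, 88.0, 91.5, 87.9, 92.0]
print(zone_a_violations(scores, center=91.4, sigma=1.6))
```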

Nelson Rule 3 (Trend Rule)

6 or more consecutive points steadily increasing or decreasing — detects gradual accuracy degradation weeks early.
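
The check itself is simple. Here is a minimal sketch, with a hypothetical function name and data, that tracks the length of the current strictly rising or falling sequence and reports every index at which it reaches six points.

```python
def trend_violations(points, run_length=6):
    """Return indices at which the last `run_length` points are strictly
    increasing or strictly decreasing."""
    hits = []
    rising = falling = 1  # points in the current rising / falling sequence
    for i in range(1, len(points)):
        rising = rising + 1 if points[i] > points[i - 1] else 1
        falling = falling + 1 if points[i] < points[i - 1] else 1
        if rising >= run_length or falling >= run_length:
            hits.append(i)
    return hits


# Hypothetical accuracy scores sliding slowly downward.
scores = [93.0, 92.8, 92.5, 92.1, 91.9, 91.6, 91.2]
print(trend_violations(scores))
```

A repeated value resets the count in this sketch; whether ties should break a trend is a design choice that a production implementation would need to settle.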

Shewhart Rule 1

Any single point beyond 3σ limits — the classic signal for an acute quality event requiring immediate investigation.
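
For comparison with the other rules, this one reduces to a range check. A minimal sketch follows, with a hypothetical function name and scores, using the limits from the example chart on this page (LCL 86.6%, UCL 96.2%).

```python
def beyond_limits(points, lcl, ucl):
    """Return indices of points outside the control limits."""
    return [i for i, value in enumerate(points) if value < lcl or value > ucl]


# Hypothetical accuracy scores checked against the example chart's limits.
scores = [91.0, 96.5, 90.2, 86.1]
print(beyond_limits(scores, lcl=86.6, ucl=96.2))
```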

What your quality dashboard looks like

At-a-glance status across all monitored metrics. Green = in control. Amber = run rule active. Red = control limit breached.

  • Task Accuracy: 94.1% (In control)
  • Hallucination Rate: 1.8% (Run rule active)
  • Tone Score: 0.82 (In control)
  • Avg Response Tokens: 312 (+12% trend)
  • P50 Latency: 1.4s (In control)
  • Grounding Rate: 71% (Downward run)
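
The green, amber, and red scheme described above can be sketched as a small precedence rule. The helper name and its boolean inputs are hypothetical; the one design assumption is that a breached control limit outranks an active run rule.

```python
def metric_status(limit_breached: bool, run_rule_active: bool) -> str:
    """Map rule results to a dashboard color (hypothetical helper)."""
    if limit_breached:
        return "red"    # control limit breached
    if run_rule_active:
        return "amber"  # run rule active
    return "green"      # in control


print(metric_status(limit_breached=False, run_rule_active=True))
```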

Start free. Scale when you need to.

No credit card required to start. Connect one agent and see your quality baseline immediately.

Free
$0/mo
One agent, 30-day window. Perfect for evaluating Agent SPC on a production agent.
  • 1 agent monitored
  • 30-day quality window
  • All 6 quality metrics
  • Dashboard access
  • Slack/email alerts
Get started free
Enterprise
Custom
Unlimited agents, custom retention, SSO, SLA, and white-glove onboarding.
  • Unlimited agents
  • Custom retention window
  • SSO / SAML
  • Dedicated SLA
  • On-prem deployment
Contact sales

Your quality baseline is waiting.

Connect your agent today. See your first control chart in under 10 minutes. No credit card required.
