Topic

production

Coverage, reference pages, tools, and guides connected to this topic.

News May 23, 2026

Q2 2026 report says agentic AI moved into production

A quarterly industry report says agentic AI shifted from experimentation to real deployment, with enterprise pilot-to-production conversion rising sharply and MCP emerging as the dominant tool-use plumbing.

agent-frameworks tool-use enterprise mcp production

Build Apr 12, 2026

Build a replay-based eval set in a weekend

How to capture, redact, and score real production sessions to evaluate agent candidates.

eval replay production

Research Apr 8, 2026

Six failure modes in tool-using agents, and the patterns that fix them

An empirical taxonomy of agent tool-use failures, drawn from 4,000 traces of production deployments. Schema drift and silent partial failure dominate.

tool-use failure-modes production

Research Mar 30, 2026

The case for replay-based agent evaluation

Static benchmarks miss the failure modes that matter in production. This paper argues for replay sets — captured user sessions scored against a held-out outcome.

evaluation replay production