Benchmark

WebArena

Web-navigation benchmark for browser agents.

Measures: Goal-directed web tasks across realistic sites
Current leader: Claude (Computer Use)

WebArena is the most-cited browser-agent benchmark. For production decisions, pair its scores with private replay sets rather than relying on the public number alone.
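One way to pair a public benchmark with a private replay set is to report the two success rates side by side instead of merging them into one number. The sketch below is illustrative only: `EpisodeResult`, `success_rate`, and `summarize` are hypothetical names, not part of WebArena's tooling, and it assumes each episode has already been graded as pass or fail.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class EpisodeResult:
    """One graded agent run (hypothetical record, not a WebArena type)."""
    task_id: str
    source: str    # assumed labels: "public" (benchmark) or "private" (replay set)
    success: bool  # whether the task goal was judged as reached


def success_rate(results: Iterable[EpisodeResult], source: str) -> Optional[float]:
    """Fraction of successful episodes for one source, or None if there are none."""
    selected = [r for r in results if r.source == source]
    if not selected:
        return None
    return sum(r.success for r in selected) / len(selected)


def summarize(results: Iterable[EpisodeResult]) -> dict:
    """Report public and private success rates separately."""
    results = list(results)
    return {
        "public": success_rate(results, "public"),
        "private": success_rate(results, "private"),
    }
```

Keeping the two rates separate makes a gap between public and private performance visible, which is the signal a production decision would usually need.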