Small Language Models are the Future of Agentic AI
A position paper arguing that small language models are often a better fit than large ones for agentic systems because they are cheaper, easier to deploy, and operationally better matched to repetitive tool-using workflows.
This paper makes a practical case that the default assumption in agent design should shift away from large models and toward small language models for many of the repetitive, tool-heavy steps agents actually perform. Rather than treating model scale as the main determinant of agent quality, the authors emphasize deployment realities: latency, cost, debuggability, and operational fit. Read the paper: https://arxiv.org/pdf/2506.02153
What changed. The paper reframes agent architecture around efficiency and specialization, arguing that small language models (SLMs) are sufficiently capable for many agentic invocations and better aligned with modular systems.
Why it matters. For builders shipping real products, this suggests a cleaner split between “reasoning-heavy” and “routine execution” paths, which can materially reduce inference spend and simplify operations.
Builder takeaway. Build routing layers that let small models handle the majority of tool calls, retrieval, and state updates, while escalating only the hardest decisions to larger models.
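One way to picture such a routing layer is the minimal Python sketch below. It is an illustration under stated assumptions, not the paper's method: the step kinds, the `route` function, the self-reported confidence score, and the 0.7 threshold are all hypothetical, and the two stub functions stand in for real model clients.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical step kinds treated as routine; the names mirror the
# takeaway above (tool calls, retrieval, state updates).
ROUTINE_KINDS = {"tool_call", "retrieval", "state_update"}

# Assumed model interface: prompt in, (text, confidence in [0, 1]) out.
ModelFn = Callable[[str], Tuple[str, float]]


@dataclass
class Step:
    kind: str    # e.g. "tool_call", "retrieval", "state_update", "planning"
    prompt: str


@dataclass
class Result:
    text: str
    confidence: float
    model: str   # which path answered: "small" or "large"


def route(step: Step, small: ModelFn, large: ModelFn,
          min_confidence: float = 0.7) -> Result:
    """Try the small model on routine steps; escalate everything else."""
    if step.kind in ROUTINE_KINDS:
        text, conf = small(step.prompt)
        if conf >= min_confidence:
            return Result(text, conf, "small")
    # Non-routine step, or the small model was not confident enough.
    text, conf = large(step.prompt)
    return Result(text, conf, "large")


# Stub models for illustration only; a real system would call model APIs.
def stub_small(prompt: str) -> Tuple[str, float]:
    # Pretend the small model is unsure about long prompts.
    return "small:" + prompt, (0.9 if len(prompt) < 40 else 0.4)


def stub_large(prompt: str) -> Tuple[str, float]:
    return "large:" + prompt, 0.95


if __name__ == "__main__":
    steps = [
        Step("tool_call", "get_weather(city='Oslo')"),
        Step("retrieval", "x" * 50),
        Step("planning", "Decide the next milestone"),
    ]
    for s in steps:
        print(s.kind, "->", route(s, stub_small, stub_large).model)
```

In this sketch the short tool call stays on the small model, while the long retrieval prompt and the planning step escalate to the large one. Whether a confidence score is the right escalation signal is a design choice; a production router might instead use schema validation of the tool call or a separate classifier.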