Long-horizon task
A task spanning many steps over hours or days, requiring durable state and memory.
Long-horizon is where most agent quality issues silently compound. Memory architecture and replay-based evaluation matter disproportionately here.
A task spanning many steps over hours or days, requiring durable state and memory.
Long-horizon is where most agent quality issues silently compound. Memory architecture and replay-based evaluation matter disproportionately here.