GAIA
General AI assistants benchmark.
GAIA evaluates assistants on realistic information-seeking tasks. It is a strong proxy for research-agent quality.