Test-Time Cognition for Agentic Tasks
We introduce Dynamic Compute Allocation (DCA) and evaluate it on agentic coding tasks, across pure and composite model configurations, on Terminal-Bench 2.0.
Notes from our journey to understand the computational principles of intelligence
We hold model weights and prompts fixed, and explore everything else. Our research discovers levers within LLMs that improve reasoning at test time, guided by computational principles drawn from neuroscience.
AI today encodes intelligence into weights and retrieves it at inference. Hard problems demand more: active reasoning that balances accuracy with efficiency, and knows when to think harder.
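The idea of "knowing when to think harder" can be made concrete with a toy sketch. This is an illustration of confidence-gated compute allocation in general, not the actual DCA method; the probe and budget numbers here are invented for the example.

```python
# Toy sketch: spend a larger sampling budget when a cheap
# confidence probe says the model is uncertain. Not the real DCA
# algorithm; confidence_probe is a placeholder heuristic.

def confidence_probe(question: str) -> float:
    # Stand-in for any cheap uncertainty estimate
    # (e.g., a logit-margin or self-rated confidence score).
    return 0.9 if len(question) < 20 else 0.3

def allocate_samples(question: str, base: int = 1, max_extra: int = 7) -> int:
    """Return a sample budget that grows as confidence drops."""
    conf = confidence_probe(question)
    extra = round((1.0 - conf) * max_extra)
    return base + extra

print(allocate_samples("2 + 2?"))                       # easy query, small budget
print(allocate_samples("Prove the Collatz conjecture."))  # hard query, larger budget
```

A real system would replace the length heuristic with a learned or model-internal uncertainty signal, but the control loop (probe, then allocate) is the same shape.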
How a question about brain wiring led to better neural networks, and eventually a company mission.
Principles from Neuroscience Give Robots Behaviors They Were Never Trained For
Each capability is a brick that builds upon previous ones
This lets us ensure we optimize for all design constraints & objectives
The total effect aims to be greater than the sum of the components
Evolution has produced intelligence that is efficient and works in the real world; there are lessons to be learned there
The more we understand why biology does what it does at a systems level, the more we will understand how to build better AI systems
We are also aware that biology has limitations in its design and implementation, which we strive to avoid
We believe it is hard to extend an already-optimized system to serve additional goals and constraints.
We address this by deciding on all goals and constraints for our final system up front, and incorporating as many as possible at each step.
At every stage we avoid precluding the goals and constraints we have not yet optimized for, making them easier to add later.