Google Research Releases ReasoningBank: AI Agents Learn Reasoning Strategies from Success and Failure

Gate News, April 22: Google Research has released ReasoningBank, an agent memory framework that lets agents driven by large language models keep learning after deployment. The framework distills generalizable reasoning strategies from both successful and failed task attempts and stores them in a memory bank, from which they are retrieved and applied to similar future tasks. The associated paper was published at ICLR, and the code has been open-sourced on GitHub.

ReasoningBank improves on two existing approaches: Synapse, which records complete action trajectories but transfers poorly to new tasks because its records are too fine-grained, and Agent Workflow Memory, which learns only from successful cases. ReasoningBank makes two key changes. First, it stores “reasoning patterns” instead of “action sequences,” with each memory item containing structured fields for title, description, and content. Second, it incorporates failure trajectories into learning: the framework uses a model to self-evaluate execution trajectories and turns failures into rules for avoiding known pitfalls. For example, the rule “click Load More button when seen” evolves into “verify current page identifier first, avoid infinite scrolling loops, then click load more.”
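
As a rough illustration of the memory structure described above, the sketch below models a memory item with the three fields the article names (title, description, content) and retrieves items for a new task. The class name, the `from_failure` flag, the word-overlap scoring, and the second example item are all assumptions made for illustration; the article does not say how ReasoningBank actually ranks memories, and the released code may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryItem:
    """One stored reasoning pattern (field names follow the article)."""
    title: str
    description: str
    content: str
    from_failure: bool = False  # hypothetical flag, not from the source


def _tokens(text: str) -> set[str]:
    """Lowercased words with surrounding punctuation stripped."""
    return {w.strip(".,;:!?\"'").lower() for w in text.split()} - {""}


def retrieve(bank: list[MemoryItem], task: str, top_k: int = 1) -> list[MemoryItem]:
    """Toy retrieval: rank items by word overlap with the task text.

    This is a stand-in for whatever similarity search the real
    framework uses; it only shows where retrieval fits.
    """
    query = _tokens(task)

    def score(item: MemoryItem) -> int:
        return len(query & _tokens(item.title + " " + item.description))

    ranked = sorted(bank, key=score, reverse=True)
    return [item for item in ranked[:top_k] if score(item) > 0]


bank = [
    MemoryItem(
        title="Guard pagination clicks",
        description="Loading more results on a paginated list page",
        # Rule text quoted from the article's example.
        content="verify current page identifier first, avoid infinite "
                "scrolling loops, then click load more",
        from_failure=True,
    ),
    MemoryItem(  # made-up item, included only so retrieval has a choice
        title="Confirm form submission",
        description="Submitting a checkout form",
        content="check for a confirmation message before finishing",
    ),
]

hits = retrieve(bank, "Find all reviews on a paginated product list")
for item in hits:
    print(f"{item.title}: {item.content}")
```

In this toy setup the retrieved `content` would be added to the agent's prompt before it acts, which is how a stored rule gets applied to a similar future task.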

The paper also introduces Memory-aware Test-time Scaling (MaTTS), which allocates additional compute during inference to explore multiple trajectories and store the findings in the memory bank. Parallel expansion runs multiple distinct trajectories for the same task and distills more robust strategies by comparing them against one another; sequential expansion iteratively refines a single trajectory, storing intermediate reasoning in memory.
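
The parallel variant can be pictured with a small sketch: run k rollouts of the same task, then contrast the successful and failed ones to find candidate strategies. Everything here is hypothetical, including the function names, the stubbed rollout, and the set-based contrast rule; the article says only that a model compares trajectories, so this is not the paper's procedure. Sequential expansion is not shown.

```python
from typing import Callable

# (actions taken, whether the task succeeded)
Trajectory = tuple[list[str], bool]


def parallel_expansion(
    task: str, rollout: Callable[[str, int], Trajectory], k: int
) -> list[Trajectory]:
    """Run k independent rollouts of the same task."""
    return [rollout(task, i) for i in range(k)]


def self_contrast(trajectories: list[Trajectory]) -> dict[str, list[str]]:
    """Toy contrast: actions shared by every success but absent from
    failures become 'do' hints; actions seen only in failures become
    'avoid' hints. A real system would use a model for this step."""
    wins = [set(actions) for actions, ok in trajectories if ok]
    fails = [set(actions) for actions, ok in trajectories if not ok]
    in_all_wins = set.intersection(*wins) if wins else set()
    in_any_win = set.union(*wins) if wins else set()
    in_any_fail = set.union(*fails) if fails else set()
    return {
        "do": sorted(in_all_wins - in_any_fail),
        "avoid": sorted(in_any_fail - in_any_win),
    }


def toy_rollout(task: str, seed: int) -> Trajectory:
    """Stub agent: even seeds check the page first and succeed."""
    if seed % 2 == 0:
        return (["open list", "check page id", "click load more"], True)
    return (["open list", "click load more", "scroll again"], False)


runs = parallel_expansion("collect all reviews", toy_rollout, k=5)
hints = self_contrast(runs)
print(hints)
```

The resulting hints are what would be written back to the memory bank, so the extra inference compute spent on one task can benefit later ones.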

On WebArena browser tasks and SWE-Bench-Verified coding tasks, with Gemini 2.5 Flash as a ReAct agent, ReasoningBank achieved an 8.3% higher success rate on WebArena and a 4.6% higher success rate on SWE-Bench-Verified than a baseline without memory, and reduced the average number of steps per task by approximately 3. Adding MaTTS with parallel expansion (k=5) improved the WebArena success rate by a further 3 percentage points and reduced steps by an additional 0.4.
