VNI Seminar Series - ASearcher: Large-Scale End-to-End RL Training for Search Agents

Start: 2025-10-28
End: 2025-10-28

28 October 2025

10:00am - 11:00am

Room 2042, the HKUST Jockey Club Institute for Advanced Study (IAS)

In the ASearcher project, we demonstrate that large-scale end-to-end reinforcement learning (RL) can enable strong agent capabilities on complex search tasks, even with a minimalist agent design and a single open-source model. ASearcher first generates high-quality RL training data through a synthetic agent workflow. Then, leveraging the AReaL framework, it performs large-scale asynchronous RL training, allowing up to 128 agent–environment interactions per prompt during training for sufficient exploration. After RL training with a 32B model, ASearcher scores 58.1 on GAIA, 51.1 on xBench, and 74.5 on Frames using only basic search tools, and can be further boosted at test time to outperform OpenAI DeepResearch and Kimi-Researcher, suggesting the great potential of RL scaling for agentic tasks.
The project is available at: https://github.com/inclusionAI/ASearcher/

Venue information

ias_map_0.pdf

Event format

Seminar, talk, lecture

Speaker / performer:

Professor Yi Wu

Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University

Yi Wu is an assistant professor at the Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua University. He obtained his Ph.D. from UC Berkeley and was a researcher at OpenAI from 2019 to 2020. His research focuses on reinforcement learning, multi-agent learning, and LLM agents. His representative works include the value iteration network, the MADDPG and MAPPO algorithms, OpenAI's hide-and-seek project, and the AReaL project. He received the best paper award at NIPS 2016, was a best demo award finalist at ICRA 2024, and received the 2025 MIT TR35 Asia Pacific Award.

Language

English

Organizer

Von Neumann Institute