Toward Realistic Reinforcement Learning: from Agnostic RL to Outcome-Based RL

9:30am - 10:30am
ZOOM: https://hkust.zoom.us/j/96166511010 Meeting ID: 961 6651 1010 Passcode: 291151


The empirical success of reinforcement learning raises a fundamental question: do classical RL assumptions reflect real-world learning problems? Many successful algorithms were designed under idealized settings that may not capture actual deployment scenarios. I propose a research paradigm that systematically bridges classical RL frameworks to more realistic settings.

In the first part, I examine agnostic reinforcement learning, where the optimal policy may not belong to the hypothesis class. I present provably efficient algorithms and characterize when agnostic RL becomes statistically tractable.

In the second part, I explore the relationship between process-based and outcome-based RL. I demonstrate that these paradigms can be transformed into one another with minimal additional cost, revealing fundamental connections between process supervision and outcome feedback.

I conclude by outlining my vision for developing a structural understanding of reinforcement learning that bridges the divide between theoretical guarantees and practical deployment.

Speaker:
Mr. Zeyu JIA
PhD Candidate, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology (MIT)

Zeyu Jia is a final-year PhD student in the Department of Electrical Engineering and Computer Science at MIT, where he is affiliated with the Laboratory for Information and Decision Systems (LIDS). Prior to joining MIT, he received his bachelor's degree from the School of Mathematical Sciences at Peking University. His research interests include machine learning theory, with a focus on reinforcement learning, statistics, and information theory.

Language
English
Organizer
Department of Electronic and Computer Engineering