Towards Understanding Interactive Learning
Supporting the below United Nations Sustainable Development Goals:
The success of modern machine learning is driven in large part by the interactive nature of the learning process, in which an agent learns by interacting with its environment. In this talk, I focus on two central pillars of interactive learning: sequential prediction and reinforcement learning.
In the first part, I examine the sequential prediction setting, which underlies the pre-training procedure in language model training. I introduce a new complexity measure for model classes, the sequential square-root entropy, and show that this notion completely characterizes the sample complexity of sequential prediction.
In the second part, I study reinforcement learning with outcome feedback, a setting that underpins the post-training stage of language models. I show that process supervision and outcome feedback can be transformed into one another at minimal additional cost, revealing a fundamental equivalence between the two paradigms.
I conclude by outlining a vision for a structural theory of machine learning that bridges the gap between theoretical guarantees and practical deployment.
Zeyu Jia is a final-year PhD student in the Department of Electrical Engineering and Computer Science at MIT, where he is affiliated with the Laboratory for Information and Decision Systems (LIDS). Prior to joining MIT, he received his bachelor's degree from the School of Mathematical Sciences at Peking University. His research interests lie in machine learning theory, with a focus on reinforcement learning, statistics, and information theory.