Department of Mathematics - Seminar on Statistics and Data Science - Exponential Lower Bounds and Fast Convergence for Policy Optimization
Policy gradient (PG) methods and their variants lie at the heart of modern reinforcement learning. Due to the intrinsic non-concavity of value maximization, however, the theoretical underpinnings of PG-type methods remained limited until recently. In this talk, we discuss both the ineffectiveness and the effectiveness of nonconvex policy optimization. On the one hand, we demonstrate that the popular softmax policy gradient method can take exponential time to converge. On the other hand, we show that employing natural policy gradients and enforcing entropy regularization allow for fast global convergence.
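To make the contrast concrete, the following is a minimal sketch, not the speaker's code: it runs an exact softmax policy gradient step and an entropy-regularized NPG-style multiplicative update side by side on a small, randomly generated tabular MDP. The toy MDP, the step sizes, and the regularization weight are illustrative assumptions.

import numpy as np

# Minimal sketch: exact softmax policy gradient vs. an entropy-regularized
# NPG-style update on a tiny randomly generated tabular MDP (illustrative only).
rng = np.random.default_rng(0)
S, A, gamma, tau = 4, 3, 0.9, 0.1            # states, actions, discount, entropy weight
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
r = rng.uniform(size=(S, A))                 # rewards r[s, a]
rho = np.full(S, 1.0 / S)                    # initial state distribution

def policy(theta):
    # softmax policy pi(a|s) from logits theta[s, a]
    z = np.exp(theta - theta.max(axis=1, keepdims=True))
    return z / z.sum(axis=1, keepdims=True)

def values(pi, ent_weight=0.0):
    # exact policy evaluation; with ent_weight > 0 these are the "soft" (regularized) values
    ent = -(pi * np.log(pi + 1e-12)).sum(axis=1)
    r_pi = (pi * r).sum(axis=1) + ent_weight * ent
    P_pi = np.einsum('sa,sat->st', pi, P)
    V = np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)
    Q = r + gamma * P @ V
    return V, Q, P_pi

eta_pg, eta_npg = 0.5, 0.5
theta = np.zeros((S, A))                     # logits for vanilla softmax PG
pi_npg = np.full((S, A), 1.0 / A)            # explicit policy table for NPG

for t in range(500):
    # vanilla softmax PG: exact gradient ascent on the unregularized value
    pi = policy(theta)
    _, Q, P_pi = values(pi)
    d = np.linalg.solve(np.eye(S) - gamma * P_pi.T, rho)   # discounted state occupancy
    adv = Q - (pi * Q).sum(axis=1, keepdims=True)
    theta += eta_pg * d[:, None] * pi * adv                # policy gradient theorem, softmax case

    # entropy-regularized NPG-style step: multiplicative update, normalized per state
    _, Q_soft, _ = values(pi_npg, ent_weight=tau)
    c = eta_npg / (1 - gamma)
    logits = (1 - c * tau) * np.log(pi_npg + 1e-12) + c * Q_soft
    pi_npg = np.exp(logits - logits.max(axis=1, keepdims=True))
    pi_npg /= pi_npg.sum(axis=1, keepdims=True)

print("softmax PG value:", rho @ values(policy(theta))[0])
print("entropy-reg. NPG value:", rho @ values(pi_npg)[0])

With the largest admissible step size eta = (1 - gamma) / tau, the multiplicative update reduces to soft policy iteration, which gives one intuition for why the regularized natural gradient converges quickly, whereas the vanilla softmax PG iterates can stall for an exponentially long time on adversarially constructed MDPs.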
Yuting Wei is currently an assistant professor in the Statistics and Data Science Department at the Wharton School, University of Pennsylvania. Prior to that, she spent two years at Carnegie Mellon University as an assistant professor of statistics and one year at Stanford University as a Stein Fellow. She received her Ph.D. in statistics from the University of California, Berkeley. She is the recipient of a 2022 NSF CAREER Award and the 2018 Erich L. Lehmann Citation from the Berkeley statistics department. Her research interests include high-dimensional and non-parametric statistics, statistical machine learning, and reinforcement learning.