Department of Mathematics - Seminar on Statistics and Data Science - Exponential Lower Bounds and Fast Convergence for Policy Optimization

11:00am - 12:00pm
https://hkust.zoom.us/j/94883840530 (Passcode: hkust)

Policy gradient (PG) methods and their variants lie at the heart of modern reinforcement learning. Due to the intrinsic non-concavity of value maximization, however, the theoretical underpinnings of PG-type methods have remained limited until recently. In this talk, we discuss both the ineffectiveness and effectiveness of nonconvex policy optimization. On the one hand, we demonstrate that the popular softmax policy gradient method can take exponential time to converge. On the other hand, we show that employing natural policy gradients and enforcing entropy regularization allows for fast global convergence.
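For readers unfamiliar with the method the talk analyzes, the following is a minimal, self-contained sketch (not from the talk) of softmax policy gradient ascent on a single-state bandit problem; the rewards, step size, and iteration count are illustrative choices, not parameters from the speaker's results.

```python
import numpy as np

# Minimal illustrative sketch: exact softmax policy gradient ascent on a
# 3-armed bandit. All numbers here (rewards r, step size eta, number of
# iterations) are hypothetical and chosen only for demonstration.
r = np.array([1.0, 0.8, 0.0])   # per-action rewards
theta = np.zeros(3)             # softmax logits parameterizing the policy
eta = 0.1                       # step size

for _ in range(2000):
    pi = np.exp(theta - theta.max())        # softmax policy (stabilized)
    pi /= pi.sum()
    value = pi @ r                          # expected reward under pi
    # Exact gradient of the value w.r.t. the logits:
    #   d(value)/d(theta_a) = pi_a * (r_a - value)
    theta += eta * pi * (r - value)

pi = np.exp(theta - theta.max())
pi /= pi.sum()
print(pi)  # the policy concentrates mass on the best arm
```

Even in this toy setting one can see the dynamics the talk studies: the gradient of a given action shrinks with its probability `pi_a`, which is the mechanism behind the slow-convergence phenomena for softmax PG and the motivation for natural-gradient and entropy-regularized variants.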

Speaker / Performer:
Prof. Yuting WEI
The Wharton School, University of Pennsylvania

Yuting Wei is currently an assistant professor in the Statistics and Data Science Department at the Wharton School, University of Pennsylvania. Prior to that, Yuting spent two years at Carnegie Mellon University as an assistant professor of statistics, and one year at Stanford University as a Stein Fellow. She received her Ph.D. in statistics from the University of California, Berkeley. She is the recipient of the 2022 NSF CAREER Award and the 2018 Erich L. Lehmann Citation from the Berkeley statistics department. Her research interests include high-dimensional and non-parametric statistics, statistical machine learning, and reinforcement learning.

Language
English
Intended audience
Alumni
Faculty and staff
Postgraduate students
Undergraduate students
Organizer
Department of Mathematics