Civil Engineering Departmental Seminar - On the Value Functions in Reinforcement Learning for Ridesharing
Value function learning is a focal point of reinforcement learning (RL) for ridesharing. In this talk, we will review RL methods based on offline learning and then turn to recently deployed work that shifts toward online on-policy updates. We will discuss reward representation in this two-sided marketplace and demonstrate how shared value functions can be adopted to coordinate multiple rideshare levers. Finally, we will discuss learning value functions for individual market participants (both supply and demand) while ensuring that they collectively approximate the system values well.
Dr. Tony Qin is Principal Scientist at Lyft, working on core problems in ridesharing marketplace optimization. Previously, he was Principal Research Scientist and Director of the Decision Intelligence group at DiDi AI Labs and Staff Scientist in supply chain and inventory optimization at Walmart Global E-commerce. Tony received his Ph.D. in Operations Research from Columbia University. His research interests span optimization and machine learning, with a particular focus on reinforcement learning and its applications in operational optimization, digital marketing, and smart transportation. He is Associate Editor of the ACM Journal on Autonomous Transportation Systems. He has published more than 40 papers in top-tier conferences and journals in machine learning and optimization. He has served as Area Chair/Senior PC of KDD, AAAI, and ECML-PKDD, and as a referee for top journals. He is a 2023 INFORMS Franz Edelman Award Finalist and Laureate, received the INFORMS Daniel H. Wagner Prize for Excellence in Operations Research Practice in 2019, and was selected for the NeurIPS 2018 Best Demo Award. Tony holds more than 10 US patents in intelligent transportation, supply chain, and recommendation systems.