Seminar: Learning in Linear-quadratic Framework: From Single-agent to Multi-agent, and to Mean-field
----------------------------------------------------------------------------------------------------
Department of Systems Engineering and Engineering Management
The Chinese University of Hong Kong
----------------------------------------------------------------------------------------------------
Date: Friday, February 11, 2022, 1:00 pm HKT
Title: Learning in Linear-quadratic Framework: From Single-agent to Multi-agent, and to Mean-field
Speaker: Professor Renyuan Xu, University of Southern California
Abstract:
The linear-quadratic (LQ) framework is widely studied in the stochastic control, game theory, and mean-field literature because of its simple structure, tractable solutions, and local approximation power for nonlinear control problems. In this talk, we discuss several theoretical results on the policy gradient (PG) method, a popular reinforcement learning algorithm, for LQ problems in which agents have only limited information about the stochastic system. In the single-agent setting, we explain how the PG method is guaranteed to learn the globally optimal policy. In the multi-agent setting, we show that a (modified) PG method can guide agents to a Nash equilibrium provided there is a certain level of noise in the system; the noise can come either from the underlying dynamics or from carefully designed exploration by the agents. Finally, when the number of agents goes to infinity, we propose an exploration scheme with entropy regularization that helps each individual agent explore both the unknown system and the behavior of the other agents. The proposed scheme is shown to speed up and stabilize the learning procedure.
The numerical performance of PG methods is demonstrated with two examples: an optimal execution problem in the single-agent setting and an institutional negotiation/bargaining problem in the multi-agent setting.
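To make the single-agent case concrete, below is a minimal sketch (in Python/NumPy) of a model-free policy gradient iteration for a noisy LQR problem: the controller is a linear policy u = -Kx, and the gradient of the rollout cost with respect to K is estimated from zeroth-order perturbations of the policy. The dynamics (A, B), costs (Q, R), horizon, sample sizes, and step size are illustrative assumptions, not values from the talk.

```python
import numpy as np

# A minimal, model-free policy gradient sketch for a noisy single-agent LQR
# problem.  All problem data (A, B, Q, R, horizon, step sizes) are
# illustrative assumptions chosen for this example, not values from the talk.

np.random.seed(0)
d_x, d_u = 3, 2                       # state and control dimensions
A = 0.9 * np.eye(d_x)                 # assumed stable dynamics x' = A x + B u + w
B = 0.1 * np.random.randn(d_x, d_u)
Q = np.eye(d_x)                       # state cost weight
R = np.eye(d_u)                       # control cost weight
T = 50                                # rollout horizon

def rollout_cost(K, x0):
    """Finite-horizon cost of the linear policy u = -K x started from x0."""
    x, cost = x0, 0.0
    for _ in range(T):
        u = -K @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u + 0.1 * np.random.randn(d_x)   # process noise
    return cost

def pg_estimate(K, n_samples=100, r=0.05):
    """Zeroth-order (smoothed, one-point) estimate of the policy gradient."""
    grad = np.zeros_like(K)
    for _ in range(n_samples):
        U = np.random.randn(*K.shape)
        U /= np.linalg.norm(U)                    # random direction, unit Frobenius norm
        cost = rollout_cost(K + r * U, np.random.randn(d_x))
        grad += (K.size / (n_samples * r)) * cost * U
    return grad

K = np.zeros((d_u, d_x))              # initial (stabilizing) linear policy
for _ in range(200):
    K -= 1e-4 * pg_estimate(K)        # policy gradient step on K
```

Because only noisy rollout costs are observed, the gradient is estimated from random perturbations of the policy rather than from the model; this is the type of model-free PG update whose global convergence to the optimal linear policy is discussed in the single-agent part of the talk.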
This talk is based on several projects with Xin Guo (UC Berkeley), Ben Hambly (U of Oxford), Huining Yang (U of Oxford), and Thaleia Zariphopoulou (UT Austin).
Biography:
Renyuan Xu is currently a WiSE Gabilan Assistant Professor in the Epstein Department of Industrial and Systems Engineering at the University of Southern California. Before joining USC, she spent two years as a Hooke Research Fellow in the Mathematical Institute at the University of Oxford. She received her Ph.D. in Operations Research from UC Berkeley in 2019. Her research interests lie broadly in machine learning, stochastic control, game theory, and mathematical finance.