Markov Decision Processes with Ex-Post Max-Min Fairness, and Generalizations
----------------------------------------------------------------------------------------------------
Department of Systems Engineering and Engineering Management
The Chinese University of Hong Kong
----------------------------------------------------------------------------------------------------
Date: Friday Sep 16, 2022, 16:30 HK time
Venue: ERB 513, The Chinese University of Hong Kong
Title: Markov Decision Processes with Ex-Post Max-Min Fairness, and Generalizations
Speaker: Prof. Wang Chi Cheung, Department of Industrial Systems Engineering and Management, NUS, Singapore
Abstract:
We consider Markov decision processes (MDPs) with vectorial rewards, where the agent receives a vector of K > 1 different types of rewards at each time step. The agent aims to maximize the minimum total reward among the K reward types. Unlike existing work that focuses on maximizing the minimum expected total reward, i.e., ex-ante max-min fairness, we maximize the expected minimum total reward, i.e., ex-post max-min fairness. Through an example and numerical experiments, we show that the optimal policy for the former objective generally does not converge to optimality under the latter, even as the number of time steps T grows. Our main contribution is a novel algorithm, Online-ReOpt, that achieves near-optimality under our objective when the underlying MDP is communicating. The expected objective value under Online-ReOpt is shown to converge to the asymptotic optimum as T increases. Finally, we propose offline variants to ease the burden of online computation in Online-ReOpt, and we propose generalizations from the max-min objective to concave utility maximization.
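The gap between the two objectives in the abstract can be seen in a toy calculation. The sketch below is an illustration only and is not taken from the talk: it skips the MDP entirely and describes each hypothetical policy by the distribution of its total reward vector over T steps, with K = 2 reward types. The names `ex_ante_value`, `ex_post_value`, `commit_once` and `alternate` are made up for this example.

```python
# Toy illustration (assumed setup, not the speaker's example): a policy is
# summarised as a list of (probability, total_reward_vector) outcomes.

def ex_ante_value(outcomes):
    """Minimum over reward types of the expected total reward."""
    num_types = len(outcomes[0][1])
    return min(
        sum(p * totals[k] for p, totals in outcomes)
        for k in range(num_types)
    )

def ex_post_value(outcomes):
    """Expected value of the minimum total reward across reward types."""
    return sum(p * min(totals) for p, totals in outcomes)

T = 100  # number of time steps; one unit of reward is handed out per step

# Hypothetical policy A: flip a fair coin once, then give every unit of
# reward to the chosen type for all T steps.
commit_once = [(0.5, (T, 0)), (0.5, (0, T))]

# Hypothetical policy B: alternate between the two types deterministically.
alternate = [(1.0, (T // 2, T // 2))]

for name, policy in [("commit_once", commit_once), ("alternate", alternate)]:
    print(name, ex_ante_value(policy), ex_post_value(policy))
```

Both toy policies have the same ex-ante value, T/2, yet the coin-flip policy has ex-post value 0 for every T because one reward type always ends with nothing. In general the expected minimum can never exceed the minimum of the expectations, so a policy that looks fair ex ante may still be far from fair ex post.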
Biography:
Cheung Wang Chi is an assistant professor at the Department of Industrial Systems Engineering and Management, NUS. His research interests are in online data-driven optimization, with applications to revenue management and inventory control models. He was a finalist in the George Nicholson Student Paper Competition in 2015 and in the POMS-JD.com Best Data-Driven Research Paper Competition in 2019. He received the Agency for Science, Technology and Research scholarship from 2007 to 2010 and from 2011 to 2016.
Everyone is welcome to attend the talk!
SEEM-5201 Website: http://seminar.se.cuhk.edu.hk
Email: seem5201@se.cuhk.edu.hk