Markov Decision Processes with Ex-Post Max-Min Fairness, and Generalizations

published by semadmin on Fri, 09/09/2022 - 15:00

----------------------------------------------------------------------------------------------------

Department of Systems Engineering and Engineering Management

The Chinese University of Hong Kong

----------------------------------------------------------------------------------------------------

Date: Friday Sep 16, 2022, 16:30 HK time

Venue: ERB 513, The Chinese University of Hong Kong

Title: Markov Decision Processes with Ex-Post Max-Min Fairness, and Generalizations

Speaker: Prof. Wang Chi Cheung, Department of Industrial Systems Engineering and Management, NUS, Singapore

Abstract:

We consider Markov decision processes (MDPs) with vectorial rewards, where the agent receives a vector of K > 1 different types of rewards at each time step. The agent aims to maximize the minimum total reward among the K reward types. Unlike existing works that focus on maximizing the minimum expected total reward, i.e., ex-ante max-min fairness, we maximize the expected minimum total reward, i.e., ex-post max-min fairness. Through an example and numerical experiments, we show that the optimal policy for the former objective generally does not converge to optimality under the latter, even as the number of time steps T grows. Our main contribution is a novel algorithm, Online-ReOpt, that achieves near-optimality under our objective when the underlying MDP is communicating. The expected objective value under Online-ReOpt is shown to converge to the asymptotic optimum as T increases. Finally, we propose offline variants to ease the burden of online computation in Online-ReOpt, and we propose generalizations from the max-min objective to concave utility maximization.
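
To make the distinction between the two objectives concrete, here is a minimal Python sketch. It is a toy illustration only and is not taken from the talk: the two policies, the function names (`commit_policy`, `alternate_policy`, `evaluate`), and the setting (K = 2, one unit of reward handed to one type per time step, no states) are all assumptions made for this example, and the code does not implement Online-ReOpt. It estimates the ex-ante value (minimum over types of the expected total reward) and the ex-post value (expected minimum total reward) for each policy.

```python
import random
from statistics import mean

K = 2  # number of reward types in this toy example (assumption)

def commit_policy(T, rng):
    """Hypothetical policy: flip a fair coin once, then give all T units
    of reward to the chosen type."""
    totals = [0] * K
    totals[rng.randrange(K)] = T
    return totals

def alternate_policy(T, rng):
    """Hypothetical policy: alternate between the reward types each step."""
    totals = [0] * K
    for t in range(T):
        totals[t % K] += 1
    return totals

def evaluate(policy, T, runs=2000, seed=0):
    """Monte Carlo estimates of the ex-ante and ex-post max-min values."""
    rng = random.Random(seed)
    samples = [policy(T, rng) for _ in range(runs)]
    # Ex-ante: minimum over types of the (estimated) expected total reward.
    ex_ante = min(mean(s[k] for s in samples) for k in range(K))
    # Ex-post: (estimated) expectation of the minimum total reward.
    ex_post = mean(min(s) for s in samples)
    return ex_ante, ex_post

if __name__ == "__main__":
    for policy in (commit_policy, alternate_policy):
        ex_ante, ex_post = evaluate(policy, T=100)
        print(policy.__name__, "ex-ante:", ex_ante, "ex-post:", ex_post)
```

In this toy setting both policies have an ex-ante value of about T/2, yet the committing policy always leaves one reward type with nothing, so its ex-post value is 0, while the alternating policy attains T/2 under both objectives. This is only meant to show why a policy that looks fair in expectation can be unfair on every realized trajectory.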

Biography:

Cheung Wang Chi is an assistant professor at the Department of Industrial Systems Engineering and Management, NUS. His research interests lie in online data-driven optimization, with applications to revenue management and inventory control models. He was a finalist in the George Nicholson Student Paper Competition in 2015 and a finalist in the POMS-JD.com Best Data-Driven Research Paper Competition in 2019. He received the Agency for Science, Technology and Research scholarship from 2007 to 2010 and from 2011 to 2016.

Everyone is welcome to attend the talk!

SEEM-5201 Website: http://seminar.se.cuhk.edu.hk

Email: seem5201@se.cuhk.edu.hk

