MDP Discounting

Markov decision processes (MDPs) can model sequential decision-making problems in stochastic environments. Exponential reward discounting is commonly used to model a decision-maker’s preference for earlier rewards. However, research in economics, psychology, and neuroscience has shown that this type of discounting does not always model human behavior accurately. We would like to investigate how different types of discounting affect standard results for MDPs and related models.
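
To make the contrast concrete, here is a minimal sketch comparing exponential discounting with one alternative. Hyperbolic discounting is used purely as an illustrative assumption; the description above does not name a specific alternative. The function names, the rewards (50 and 100), the gap of 5 steps, and the parameters gamma = 0.9 and k = 1.0 are likewise hypothetical choices for this example.

```python
# Illustrative sketch only: hyperbolic discounting is an assumed example of a
# non-exponential scheme, and all numeric values below are hypothetical.

def exponential_discount(t, gamma=0.9):
    """Weight of a reward received t steps in the future: gamma ** t."""
    return gamma ** t


def hyperbolic_discount(t, k=1.0):
    """Weight of a reward received t steps in the future: 1 / (1 + k * t)."""
    return 1.0 / (1.0 + k * t)


def prefers_later(discount, delay, small=50.0, large=100.0, gap=5):
    """Return True if the larger reward at delay + gap is valued above
    the smaller reward at delay, under the given discount function."""
    return large * discount(delay + gap) > small * discount(delay)


if __name__ == "__main__":
    for name, discount in [
        ("exponential", exponential_discount),
        ("hyperbolic", hyperbolic_discount),
    ]:
        # Compare the same choice when it is immediate (delay 0)
        # and when it is pushed 10 steps into the future.
        print(name, prefers_later(discount, 0), prefers_later(discount, 10))
```

Under these assumed values, the exponential decision-maker ranks the two rewards the same way at both delays, because the ratio of the two discount weights depends only on the gap between them. The hyperbolic decision-maker picks the smaller reward when it is immediate but the larger one when both lie further in the future. This kind of time inconsistency is one reason standard MDP results may not carry over directly to other discounting schemes.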

Supervisors: Eline Bovy MSc and Prof. Dr. Nils Jansen.