The seminar will review two subjects related to Markov decision processes (MDPs).
- Dynamic Decision Making with Binding Actions: Markov Decision Processes are extended to situations where actions are binding and remain in effect for a random number of future stages. We consider two slightly different models. In the first, a random variable determines the stages at which the decision maker is able to revise her action, and each chosen action is binding until the next revision opportunity. In the second, the random variable likewise determines the stages at which the decision maker may revise her action, but she is not obliged to do so: she can keep her previous action, in which case the revision opportunity persists to the next stage and remains available until she changes her action. Only then does the action become binding. For both models we present a necessary and sufficient condition for one random variable to yield a higher utility than another, regardless of the underlying MDP. (Co-authored with Yevgeny Tsodikovich.)
- The Value Functions of Markov Decision Processes: It is known that the value function of a Markov decision process, as a function of the discount factor λ, is the maximum of finitely many rational functions in λ. Moreover, each root of the denominators of these rational functions either lies outside the unit ball in the complex plane or is a unit root with multiplicity 1. We prove the converse: every function that is the maximum of finitely many rational functions in λ whose denominators have this property is the value function of some Markov decision process. We thereby provide a characterization of the set of value functions of Markov decision processes. (Co-authored with Eilon Solan.)
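To give a feel for the first model, here is a minimal simulation sketch. It is not the paper's construction: the two-state MDP, the "flip each stage" dynamics, the myopic revision rule, and the Bernoulli(p) revision process are all illustrative assumptions. It only shows, numerically, that a revision process granting more frequent opportunities can yield a higher discounted utility when actions are binding between opportunities.

```python
import numpy as np

def simulate(p, lam=0.9, horizon=100, episodes=2000, seed=0):
    """Sketch of the first model: at each stage a Bernoulli(p) draw decides
    whether the decision maker may revise; the chosen action stays binding
    until the next revision opportunity. (Toy example, not the paper's model:
    two states that flip deterministically each stage, reward 1 when the bound
    action matches the current state, and a myopic revision rule.)"""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(episodes):
        state, action, payoff = 0, 0, 0.0
        for t in range(horizon):
            if rng.random() < p:            # revision opportunity arrives
                action = state              # myopic choice: match current state
            payoff += (lam ** t) * (1.0 if action == state else 0.0)
            state = 1 - state               # toy dynamics: state flips each stage
        total += payoff
    return total / episodes
```

In this toy setting a higher revision probability dominates (e.g. `simulate(0.9) > simulate(0.2)`); the paper's result characterizes exactly when one revision process dominates another for *every* MDP, which the simulation does not establish.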
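The known direction of the second result can be checked on a tiny example. The following sketch (the two-state MDP and its rewards are assumptions chosen for illustration) computes the discounted value at state 0 by enumerating pure stationary policies; here the value equals max(1/(1−λ), 2λ/(1−λ)), a maximum of two rational functions whose only denominator root is λ = 1, a unit root with multiplicity 1, as in the characterization above.

```python
import numpy as np

# Toy MDP (an assumption for illustration): state 0 has actions "a" (reward 1,
# stay) and "b" (reward 0, move to state 1); state 1 has one action "c"
# (reward 2, stay).
R = {0: {"a": 1.0, "b": 0.0}, 1: {"c": 2.0}}
P = {0: {"a": [1.0, 0.0], "b": [0.0, 1.0]}, 1: {"c": [0.0, 1.0]}}

def policy_value(policy, lam):
    """Value vector of a pure stationary policy: v = (I - lam*P_pi)^(-1) r_pi."""
    Ppi = np.array([P[s][policy[s]] for s in (0, 1)])
    rpi = np.array([R[s][policy[s]] for s in (0, 1)])
    return np.linalg.solve(np.eye(2) - lam * Ppi, rpi)

def value(lam):
    """Discounted value at state 0: the maximum over pure stationary policies,
    hence a maximum of finitely many rational functions of lam."""
    policies = [{0: a, 1: "c"} for a in ("a", "b")]
    return max(policy_value(pi, lam)[0] for pi in policies)
```

For λ < 1/2 the "stay" policy is optimal and the value is 1/(1−λ); for λ > 1/2 moving to state 1 is optimal and the value is 2λ/(1−λ). The seminar's converse result says every function of this form arises this way.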
Lunch and refreshments will be provided during the seminar.