Operations Research Seminars Amsterdam

Speaker(s)
Herman Blok (Leiden University and Eindhoven University of Technology)
Date
Thursday, 14 April 2016
Location
Amsterdam

In this talk we discuss continuous-time Markov decision processes (MDPs). A powerful way to derive a policy that minimizes the discounted or average cost is via the value function and the optimality equation. If the process is uniformizable, the value function of the equivalent discrete-time MDP can be approximated via value iteration. The value iteration algorithm can moreover be used to establish structural properties, such as convexity of the value function, that imply a particular form of optimal policy.
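To make this concrete, here is a minimal sketch (not the speaker's code) of value iteration for the uniformized discrete-time equivalent of a continuous-time MDP. The model is a hypothetical admission-control M/M/1 queue; the names and parameter values (LAM, MU, R, ALPHA, N) are illustrative assumptions. The final lines check convexity of the value function numerically, which for this kind of model implies a threshold-type optimal policy.

```python
import numpy as np

# Hypothetical model: M/M/1 admission control with arrival rate LAM,
# service rate MU, holding cost x, rejection cost R, continuous-time
# discount rate ALPHA; states 0..N for a finite computation.
LAM, MU, R, ALPHA, N = 0.7, 1.0, 5.0, 0.1, 60
UNIF = LAM + MU                      # uniformization constant: bounds the total rate
BETA = UNIF / (ALPHA + UNIF)         # discount factor of the discrete-time MDP
x = np.arange(N + 1)
cost = x / (ALPHA + UNIF)            # normalized one-stage holding cost

V = np.zeros(N + 1)
for _ in range(5000):
    up = np.append(V[1:], V[N])      # state after an accepted arrival
    down = np.append(V[0], V[:N])    # state after a service completion
    accept = cost + BETA * (LAM * up + MU * down) / UNIF
    reject = cost + BETA * (LAM * (V + R) + MU * down) / UNIF
    V_new = np.minimum(accept, reject)
    if np.max(np.abs(V_new - V)) < 1e-9:
        break
    V = V_new

# Structural property: convexity of V implies a threshold-type policy.
print("V convex:", bool(np.all(np.diff(V, 2) >= -1e-8)))
policy = reject < accept             # True where rejecting is optimal
print("reject from state:", int(np.argmax(policy)) if policy.any() else None)
```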

Unbounded-rate MDPs do not allow uniformization, so these discrete-time tools are not directly available. One can truncate the rates to make the process uniformizable, but this raises two issues: (1) the truncated processes need not inherit the desired structural properties, due to boundary effects; (2) it is not guaranteed that the truncated processes converge to the original model as the truncation size goes to infinity. We discuss both issues; together, their resolution provides a framework for obtaining structural properties of non-uniformizable problems.
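For a concrete (hypothetical) picture of the difficulty, consider an M/M/infinity-type queue whose service rate grows linearly with the state: the total rate is unbounded, so no uniformization constant exists. A plain truncation at a level N restores boundedness, but the abrupt cut at the boundary is exactly the source of issue (1):

```python
import numpy as np

# Plain ("hard") truncation of a hypothetical unbounded-rate birth-death
# process: the service rate x * MU is unbounded, so the original process
# is not uniformizable. All parameter values are illustrative.
LAM, MU, N = 5.0, 1.0, 40
x = np.arange(N + 1)
birth = np.where(x < N, LAM, 0.0)   # arrivals cut off abruptly at the boundary N
death = x * MU                      # now bounded by N * MU on the truncated space
UNIF = birth.max() + death.max()    # a finite uniformization constant exists again
```

The discontinuity in the birth rates at N perturbs the first differences of the value function near the boundary, which is precisely where convexity and monotonicity arguments in value iteration can break down.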

As a remedy for the first issue we propose the smoothed rate truncation, which preserves structural properties and is less vulnerable to boundary effects. We present examples showing that the smoothed rate truncation can also be a decisive advantage when gaining insight through numerical computation. Even in a bounded-rate MDP with an infinite state space, a state space truncation is needed to obtain numerical results; a plain truncation can give a completely false impression of the optimal policy, while the smoothed rate truncation preserves the correct one.
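A sketch of the contrast, assuming (as one natural choice; the exact perturbation used in the talk may differ) that the smoothed rate truncation scales the outward rates down linearly towards the boundary instead of cutting them off:

```python
import numpy as np

# Hard truncation versus a linearly smoothed rate truncation for the
# hypothetical birth-death example above; parameter values are illustrative.
LAM, MU, N = 5.0, 1.0, 40
x = np.arange(N + 1)

birth_hard = np.where(x < N, LAM, 0.0)             # abrupt cut at the boundary
birth_smooth = LAM * np.maximum(1.0 - x / N, 0.0)  # decays smoothly to 0 at N
death = x * MU                                     # unchanged by either truncation
```

Because the smoothed rates change only by LAM / N between neighbouring states, the value function is perturbed gradually rather than abruptly near the boundary, which is what lets structural properties survive the truncation.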

For the second issue we present conditions on the Markov decision process that guarantee convergence. For the discounted cost criterion the conditions are rather mild and cover a large class of problems; analysis of the average cost criterion requires stronger conditions.
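For orientation: in the literature on unbounded-rate MDPs such convergence conditions are typically drift (Lyapunov) conditions. A representative form, which may well differ from the precise assumptions in the talk, reads

\[
  \sum_{y} q(y \mid x, a)\, f(y) \;\le\; c\, f(x) + d
  \qquad \text{for all states } x \text{ and actions } a,
\]

for some function $f \ge 1$ on the state space and constants $c$ and $d$, where $q(y \mid x, a)$ denotes the transition rates of the process.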

Joint work with Floske Spieksma and Sandjai Bhulai.