Robust Control of Markov Decision Processes with Uncertain Transition Matrices

Arnab Nilim and Laurent El Ghaoui
Operations Research
Vol. 53, No. 5 (Sep. - Oct., 2005), pp. 780-798
Published by: INFORMS
Stable URL: http://www.jstor.org/stable/25146914
Page Count: 19

Abstract

Optimal solutions to Markov decision problems may be very sensitive with respect to the state transition probabilities. In many practical problems, the estimation of these probabilities is far from accurate. Hence, estimation errors are limiting factors in applying Markov decision processes to real-world problems. We consider a robust control problem for a finite-state, finite-action Markov decision process, where uncertainty on the transition matrices is described in terms of possibly nonconvex sets. We show that perfect duality holds for this problem, and that as a consequence, it can be solved with a variant of the classical dynamic programming algorithm, the "robust dynamic programming" algorithm. We show that a particular choice of the uncertainty sets, involving likelihood regions or entropy bounds, leads to both a statistically accurate representation of uncertainty, and a complexity of the robust recursion that is almost the same as that of the classical recursion. Hence, robustness can be added at practically no extra computing cost. We derive similar results for other uncertainty sets, including one with a finite number of possible values for the transition matrices. We describe in a practical path planning example the benefits of using a robust strategy instead of the classical optimal strategy; even if the uncertainty level is only crudely guessed, the robust strategy yields a much better worst-case expected travel time.
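The robust recursion described in the abstract can be illustrated for the simplest case it mentions, a finite number of candidate transition matrices. The following is a minimal sketch, not the authors' implementation. It assumes a discounted infinite-horizon cost-minimization problem, assumes that the adversary may pick the worst candidate independently for each state-action pair, and uses NumPy. The function name `robust_value_iteration`, the array layout, and the two-state example data are all hypothetical.

```python
import numpy as np


def robust_value_iteration(costs, candidates, discount=0.9, tol=1e-8, max_iter=10_000):
    """Robust value iteration over a finite set of candidate transition models.

    costs:      array of shape (S, A), immediate cost of action a in state s.
    candidates: array of shape (K, S, A, S), K candidate transition models,
                where candidates[k, s, a, t] = P_k(next state t | state s, action a).
    Assumption (not from the source): the worst candidate is chosen
    independently for every (state, action) pair.
    """
    costs = np.asarray(costs, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    value = np.zeros(costs.shape[0])

    def q_values(v):
        # Expected next-state cost under each candidate: shape (K, S, A).
        expected = candidates @ v
        # Inner maximization: worst-case candidate per (state, action).
        return costs + discount * expected.max(axis=0)

    for _ in range(max_iter):
        # Outer minimization: the controller picks the cheapest action.
        new_value = q_values(value).min(axis=1)
        converged = np.max(np.abs(new_value - value)) < tol
        value = new_value
        if converged:
            break

    policy = q_values(value).argmin(axis=1)
    return value, policy


# Hypothetical example: state 0 is "on the road", state 1 is an absorbing goal.
costs = np.array([[1.0, 2.0],    # state 0: cheap action 0, expensive action 1
                  [0.0, 0.0]])   # state 1: goal, no further cost
nominal = np.array([[[0.10, 0.90], [0.05, 0.95]],
                    [[0.00, 1.00], [0.00, 1.00]]])
pessimistic = np.array([[[0.70, 0.30], [0.10, 0.90]],
                        [[0.00, 1.00], [0.00, 1.00]]])

nominal_value, nominal_policy = robust_value_iteration(costs, nominal[None])
robust_value, robust_policy = robust_value_iteration(costs, np.stack([nominal, pessimistic]))
print("nominal policy:", nominal_policy, "robust policy:", robust_policy)
```

In this made-up example the cheap action looks best under the nominal model alone, while the robust recursion prefers the more expensive action because its outcome degrades less under the pessimistic candidate. The only change relative to classical value iteration is the extra maximization over candidates inside `q_values`.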
