
Computationally Feasible Bounds for Partially Observed Markov Decision Processes

William S. Lovejoy
Operations Research
Vol. 39, No. 1 (Jan. - Feb., 1991), pp. 162-175
Published by: INFORMS
Stable URL: http://www.jstor.org/stable/171496
Page Count: 14

Abstract

A partially observed Markov decision process (POMDP) is a sequential decision problem where information concerning parameters of interest is incomplete, and possible actions include sampling, surveying, or otherwise collecting additional information. Such problems can theoretically be solved as dynamic programs, but the relevant state space is infinite, which inhibits algorithmic solution. This paper explains how to approximate the state space by a finite grid of points, and use that grid to construct upper and lower value function bounds, generate approximate nonstationary and stationary policies, and bound the value loss relative to optimal for using these policies in the decision problem. A numerical example illustrates the methodology.
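The grid-based approximation the abstract describes can be illustrated with a small numerical sketch. The Python example below is not the paper's method (Lovejoy works with a triangulation of the full belief simplex and derives explicit upper and lower value function bounds); it is a minimal fixed-grid value iteration for a hypothetical two-state, two-action, two-observation POMDP, where off-grid successor beliefs are handled by linear interpolation over the grid. All problem data (T, Z, R, the discount factor, and the grid resolution) are illustrative assumptions, not values from the paper.

```python
import numpy as np

# Hypothetical two-state, two-action, two-observation POMDP (illustrative
# only; not the paper's numerical example). The belief is summarized by
# the scalar p = P(state 0).
T = np.array([[[0.9, 0.1], [0.2, 0.8]],   # T[a, s, s']: transition probabilities
              [[0.5, 0.5], [0.5, 0.5]]])
Z = np.array([[[0.8, 0.2], [0.3, 0.7]],   # Z[a, s', o]: observation probabilities
              [[0.5, 0.5], [0.5, 0.5]]])
R = np.array([[1.0, 0.0],                 # R[a, s]: immediate rewards
              [0.2, 0.2]])
gamma = 0.95                              # discount factor (assumed)

grid = np.linspace(0.0, 1.0, 41)          # finite grid over the belief simplex
V = np.zeros_like(grid)                   # value estimates at the grid points

def belief_update(p, a, o):
    """Bayes update of p = P(state 0) after action a and observation o.
    Returns the posterior belief and the probability of observing o."""
    b = np.array([p, 1.0 - p])
    pred = b @ T[a]                       # predicted distribution over s'
    joint = pred * Z[a][:, o]             # unnormalized posterior
    prob_o = joint.sum()
    if prob_o == 0.0:
        return p, 0.0
    return joint[0] / prob_o, prob_o

for _ in range(200):                      # value iteration restricted to the grid
    V_new = np.empty_like(V)
    for i, p in enumerate(grid):
        q_values = []
        for a in range(2):
            b = np.array([p, 1.0 - p])
            val = b @ R[a]                # expected immediate reward
            for o in range(2):
                p_next, prob_o = belief_update(p, a, o)
                # Off-grid successor belief: linearly interpolate V on the grid.
                val += gamma * prob_o * np.interp(p_next, grid, V)
            q_values.append(val)
        V_new[i] = max(q_values)
    V = V_new

print(V[::10])                            # approximate values at a few beliefs
```

Because the optimal POMDP value function is convex in the belief, linear interpolation between its exact values at grid points lies on or above it; the paper's contribution is to turn such grid approximations into rigorous two-sided bounds on the value function and on the value loss of the policies the grid induces.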
