| Title:
             | 
An optimality system for finite average Markov decision chains under risk-aversion (English) | 
| Author:
             | 
Alanís-Durán, Alfredo | 
| Author:
             | 
Cavazos-Cadena, Rolando | 
| Language:
             | 
English | 
| Journal:
             | 
Kybernetika | 
| ISSN:
             | 
0023-5954 | 
| Volume:
             | 
48 | 
| Issue:
             | 
1 | 
| Year:
             | 
2012 | 
| Pages:
             | 
83-104 | 
| Summary lang:
             | 
English | 
| . | 
| Category:
             | 
math | 
| . | 
| Summary:
             | 
This work concerns controlled Markov chains with finite state space and compact action sets. The decision maker is risk-averse with constant risk-sensitivity, and the performance of a control policy is measured by the long-run average cost criterion. Under standard continuity-compactness conditions, it is shown that the (possibly non-constant) optimal value function is characterized by a system of optimality equations from which an optimal stationary policy can be obtained. It is also shown that the optimal superior and inferior limit average cost functions coincide. (English) | 
| Keyword:
             | 
partition of the state space | 
| Keyword:
             | 
nonconstant optimal average cost | 
| Keyword:
             | 
discounted approximations to the risk-sensitive average cost criterion | 
| Keyword:
             | 
equality of superior and inferior limit risk-averse average criteria | 
| MSC:
             | 
60J05 | 
| MSC:
             | 
93C55 | 
| MSC:
             | 
93E20 | 
| idZBL:
             | 
Zbl 1243.93127 | 
| idMR:
             | 
MR2932929 | 
| . | 
| Date available:
             | 
2012-03-05T08:31:53Z | 
| Last updated:
             | 
2013-09-22 | 
| Stable URL:
             | 
http://hdl.handle.net/10338.dmlcz/142064 | 
| . | 
| Reference:
             | 
[1] A. Arapostathis, V. S. Borkar, E. Fernández-Gaucherand, M. K. Ghosh, S. I. Marcus: Discrete-time controlled Markov processes with average cost criteria: a survey. SIAM J. Control Optim. 31 (1993), 282-334. MR 1205981, 10.1137/0331018 | 
| Reference:
             | 
[2] P. Billingsley: Probability and Measure. Third edition. Wiley, New York 1995. Zbl 0822.60002, MR 1324786 | 
| Reference:
             | 
[3] R. Cavazos-Cadena, E. Fernández-Gaucherand: Controlled Markov chains with risk-sensitive criteria: average cost, optimality equations and optimal solutions. Math. Methods Oper. Res. 43 (1999), 121-139. Zbl 0953.93077, MR 1687362 | 
| Reference:
             | 
[4] R. Cavazos-Cadena, E. Fernández-Gaucherand: Risk-sensitive control in communicating average Markov decision chains. In: Modelling Uncertainty: An Examination of Stochastic Theory, Methods and Applications (M. Dror, P. L'Ecuyer and F. Szidarovszky, eds.), Kluwer, Boston 2002, pp. 525-544. | 
| Reference:
             | 
[5] R. Cavazos-Cadena: Solution to the risk-sensitive average cost optimality equation in a class of Markov decision processes with finite state space. Math. Methods Oper. Res. 57 (2003), 263-285. Zbl 1023.90076, MR 1973378, 10.1007/s001860200256 | 
| Reference:
             | 
[6] R. Cavazos-Cadena, D. Hernández-Hernández: A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains. Ann. Appl. Probab. 15 (2005), 175-212. Zbl 1076.93045, MR 2115041, 10.1214/105051604000000585 | 
| Reference:
             | 
[7] R. Cavazos-Cadena, D. Hernández-Hernández: A system of Poisson equations for a non-constant Varadhan functional on a finite state space. Appl. Math. Optim. 53 (2006), 101-119. MR 2190228, 10.1007/s00245-005-0840-3 | 
| Reference:
             | 
[8] R. Cavazos-Cadena, F. Salem-Silva: The discounted method and equivalence of average criteria for risk-sensitive Markov decision processes on Borel spaces. Appl. Math. Optim. 61 (2009), 167-190. MR 2585141, 10.1007/s00245-009-9080-2 | 
| Reference:
             | 
[9] G. B. Di Masi, L. Stettner: Risk-sensitive control of discrete time Markov processes with infinite horizon. SIAM J. Control Optim. 38 (1999), 61-78. Zbl 0946.93043, MR 1740607, 10.1137/S0363012997320614 | 
| Reference:
             | 
[10] G. B. Di Masi, L. Stettner: Infinite horizon risk sensitive control of discrete time Markov processes with small risk. Syst. Control Lett. 40 (2000), 15-20. Zbl 0977.93083, MR 1829070, 10.1016/S0167-6911(99)00118-8 | 
| Reference:
             | 
[11] G. B. Di Masi, L. Stettner: Infinite horizon risk sensitive control of discrete time Markov processes under minorization property. SIAM J. Control Optim. 46 (2007), 231-252. Zbl 1141.93067, MR 2299627, 10.1137/040618631 | 
| Reference:
             | 
[12] W. H. Fleming, W. M. McEneaney: Risk-sensitive control on an infinite horizon. SIAM J. Control Optim. 33 (1995), 1881-1915. MR 1358100, 10.1137/S0363012993258720 | 
| Reference:
             | 
[13] F. R. Gantmakher: The Theory of Matrices. Chelsea, London 1959. | 
| Reference:
             | 
[14] D. Hernández-Hernández, S. I. Marcus: Risk-sensitive control of Markov processes in countable state space. Syst. Control Lett. 29 (1996), 147-155. Zbl 0866.93101, MR 1422212, 10.1016/S0167-6911(96)00051-5 | 
| Reference:
             | 
[15] D. Hernández-Hernández, S. I. Marcus: Existence of risk sensitive optimal stationary policies for controlled Markov processes. Appl. Math. Optim. 40 (1999), 273-285. Zbl 0937.90115, MR 1709324, 10.1007/s002459900126 | 
| Reference:
             | 
[16] R. A. Howard, J. E. Matheson: Risk-sensitive Markov decision processes. Management Sci. 18 (1972), 356-369. Zbl 0238.90007, MR 0292497, 10.1287/mnsc.18.7.356 | 
| Reference:
             | 
[17] D. H. Jacobson: Optimal stochastic linear systems with exponential performance criteria and their relation to stochastic differential games. IEEE Trans. Automat. Control 18 (1973), 124-131. MR 0441523, 10.1109/TAC.1973.1100265 | 
| Reference:
             | 
[18] S. C. Jaquette: Markov decision processes with a new optimality criterion: discrete time. Ann. Statist. 1 (1973), 496-505. MR 0378839, 10.1214/aos/1176342415 | 
| Reference:
             | 
[19] S. C. Jaquette: A utility criterion for Markov decision processes. Management Sci. 23 (1976), 43-49. Zbl 0337.90053, MR 0439037, 10.1287/mnsc.23.1.43 | 
| Reference:
             | 
[20] A. Jaśkiewicz: Average optimality for risk sensitive control with general state space. Ann. Appl. Probab. 17 (2007), 654-675. Zbl 1128.93056, MR 2308338, 10.1214/105051606000000790 | 
| Reference:
             | 
[21] U. G. Rothblum, P. Whittle: Growth optimality for branching Markov decision chains. Math. Oper. Res. 7 (1982), 582-601. Zbl 0498.90082, MR 0686533, 10.1287/moor.7.4.582 | 
| Reference:
             | 
[22] K. Sladký: Successive approximation methods for dynamic programming models. In: Proc. Third Formator Symposium on the Analysis of Large-Scale Systems (J. Beneš and L. Bakule, eds.), Academia, Prague 1979, pp. 171-189. Zbl 0496.90081 | 
| Reference:
             | 
[23] K. Sladký: Bounds on discrete dynamic programming recursions I. Kybernetika 16 (1980), 526-547. Zbl 0454.90085, MR 0607292 | 
| Reference:
             | 
[24] K. Sladký: Growth rates and average optimality in risk-sensitive Markov decision chains. Kybernetika 44 (2008), 205-226. Zbl 1154.90612, MR 2428220 | 
| Reference:
             | 
[25] K. Sladký, R. Montes-de-Oca: Risk-sensitive average optimality in Markov decision chains. In: Operations Research Proceedings, Vol. 2007, Part III (2008), pp. 69-74. Zbl 1209.90348, 10.1007/978-3-540-77903-2_11 | 
| Reference:
             | 
[26] P. Whittle: Optimization Over Time-Dynamic Programming and Stochastic Control. Wiley, Chichester 1983. MR 0710833 | 
| Reference:
             | 
[27] W. H. M. Zijm: Nonnegative Matrices in Dynamic Programming. Mathematical Centre Tract, Amsterdam 1983. Zbl 0526.90059, MR 0723868 | 
| . |