
Keywords: Markov decision chains; second order optimality; optimality conditions for transient, discounted and average models; policy iterations; value iterations

References:

[1] Feinberg, E. A., Fei, J.: **Inequalities for variances of total discounted costs**. J. Appl. Probab. 46 (2009), 1209-1212. DOI 10.1239/jap/1261670699 | MR 2582716

[2] Gantmakher, F. R.: **The Theory of Matrices**. Chelsea, London 1959. MR 0107649

[3] Jaquette, S. C.: **Markov decision processes with a new optimality criterion: Discrete time**. Ann. Statist. 1 (1973), 496-505. MR 0378839

[4] Mandl, P.: **On the variance in controlled Markov chains**. Kybernetika 7 (1971), 1-12. MR 0286178 | Zbl 0215.25902

[5] Markowitz, H.: **Portfolio Selection - Efficient Diversification of Investments**. Wiley, New York 1959. MR 0103768

[6] Puterman, M. L.: **Markov Decision Processes - Discrete Stochastic Dynamic Programming**. Wiley, New York 1994. MR 1270015

[7] Bäuerle, N., Rieder, U.: **Markov Decision Processes with Application to Finance**. Springer-Verlag, Berlin 2011. MR 2808878

[8] Righter, R.: **Stochastic comparison of discounted rewards**. J. Appl. Probab. 48 (2011), 293-294. MR 2809902

[9] Sladký, K.: **On mean reward variance in semi-Markov processes**. Math. Meth. Oper. Res. 62 (2005), 387-397. MR 2229697

[10] Sladký, K.: **Risk-sensitive and mean variance optimality in Markov decision processes**. Acta Oeconomica Pragensia 7 (2013), 146-161.

[11] Sladký, K.: **Second order optimality in transient and discounted Markov decision chains**. In: Proc. 33rd Internat. Conf. Math. Methods in Economics MME 2015 (D. Martinčík, ed.), University of West Bohemia, Plzeň 2015, pp. 731-736.

[12] Sobel, M.: **The variance of discounted Markov decision processes**. J. Appl. Probab. 19 (1982), 794-802. MR 0675143 | Zbl 0503.90091

[13] Van Dijk, N. M., Sladký, K.: **On the total reward variance for continuous-time Markov reward chains**. J. Appl. Probab. 43 (2006), 1044-1052. MR 2274635

[14] Veinott, A. F., Jr.: **Discrete dynamic programming with sensitive discount optimality criteria**. Ann. Math. Statist. 40 (1969), 1635-1660. MR 0256712

[15] White, D. J.: **Mean, variance and probability criteria in finite Markov decision processes: A review**. J. Optimizat. Th. Appl. 56 (1988), 1-29. MR 0922375

[16] Wu, X., Guo, X.: **First passage optimality and variance minimisation of Markov decision processes with varying discount factors**. J. Appl. Probab. 52 (2015), 441-456. MR 3372085 | Zbl 1327.90374