[1] Abounadi, J., Bertsekas, D., Borkar, V. S.: 
Learning algorithms for Markov decision processes with average cost. SIAM J. Control Optim. 40 (2001), 681-698. 
DOI 10.1137/s0363012999361974 | 
MR 1871450 
[2] Aliprantis, C. D., Border, K. C.: 
Infinite Dimensional Analysis. Third edition. Springer-Verlag, Berlin 2006. 
MR 2378491 
[3] Almudevar, A.: 
Approximate fixed point iteration with an application to infinite horizon Markov decision processes. SIAM J. Control Optim. 46 (2008), 541-561. 
DOI 10.1137/040614384 | 
MR 2448464 
[4] Arapostathis, A., Borkar, V. S., Fernández-Gaucherand, E., Ghosh, M. K., Marcus, S. I.: 
Discrete-time controlled Markov processes with average cost criterion: a survey. SIAM J. Control Optim. 31 (1993), 282-344. 
DOI 10.1137/0331018 | 
MR 1205981 
[5] Beutel, L., Gonska, H., Kacsó, D.: 
On variation-diminishing Schoenberg operators: new quantitative statements. In: Multivariate Approximation and Interpolation with Applications (M. Gasca, ed.), Monografías de la Academia de Ciencias de Zaragoza No. 20, 2002, pp. 9-58. 
MR 1966063 
[6] Bertsekas, D. P.: 
Dynamic Programming: Deterministic and Stochastic Models. Prentice-Hall, Englewood Cliffs NJ 1987. 
MR 0896902 | 
Zbl 0649.93001 
[7] Bertsekas, D. P., Tsitsiklis, J. N.: 
Neuro-Dynamic Programming. Athena Scientific, Belmont 1996. 
MR 3444832 | 
Zbl 0924.68163 
[10] Chang, H. S., Hu, J., Fu, M. C., Marcus, S. I.: 
Simulation-Based Algorithms for Markov Decision Processes. Second edition. Springer-Verlag, London 2013. 
DOI 10.1007/978-1-4471-5022-0 | 
MR 3052425 
[12] DeVore, R. A.: 
The Approximation of Continuous Functions by Positive Linear Operators. Lecture Notes in Mathematics 293. Springer-Verlag, Berlin, Heidelberg 1972. 
DOI 10.1007/bfb0059493 | 
MR 0420083 
[13] Dorea, C. C. Y., Pereira, A. G. C.: 
A note on a variation of Doeblin's condition for uniform ergodicity of Markov chains. Acta Math. Hungar. 110 (2006), 287-292. 
DOI 10.1007/s10474-006-0023-y | 
MR 2213230 
[15] Dufour, F., Prieto-Rumeau, T.: 
Stochastic approximations of constrained discounted Markov decision processes. J. Math. Anal. Appl. 413 (2014), 856-879. 
DOI 10.1016/j.jmaa.2013.12.016 | 
MR 3159809 
[16] Dufour, F., Prieto-Rumeau, T.: 
Approximation of average cost Markov decision processes using empirical distributions and concentration inequalities. Stochastics 87 (2015), 273-307. 
DOI 10.1080/17442508.2014.939979 | 
MR 3316812 
[17] Farias, D. P. de, Roy, B. van: 
On the existence of fixed points for approximate value iteration and temporal difference learning. J. Optim. Theory Appl. 105 (2000), 589-608. 
DOI 10.1023/a:1004641123405 | 
MR 1783879 
[18] Farias, D. P. de, Roy, B. van: 
Approximate linear programming for average-cost dynamic programming. In: Advances in Neural Information Processing Systems 15 (S. Becker, S. Thrun and K. Obermayer, eds.), MIT Press, Cambridge MA 2002, pp. 1587-1594.
[19] Farias, D. P. de, Roy, B. van: 
A cost-shaping linear program for average-cost approximate dynamic programming with performance guarantees. Math. Oper. Res. 31 (2006), 597-620. 
DOI 10.1287/moor.1060.0208 | 
MR 2254426 
[20] Gordon, G. J.: 
Stable function approximation in dynamic programming. In: Proc. Twelfth International Conference on Machine Learning (A. Prieditis and S. J. Russell, eds.), Tahoe City CA 1995, pp. 261-268. 
DOI 10.1016/b978-1-55860-377-6.50040-2 
[21] Gosavi, A.: 
A reinforcement learning algorithm based on policy iteration for average reward: empirical results with yield management and convergence analysis. Machine Learning 55 (2004), 5-29. 
DOI 10.1023/b:mach.0000019802.64038.6c | 
MR 2549123 
[26] Hernández-Lerma, O., Montes-de-Oca, R., Cavazos-Cadena, R.: 
Recurrence conditions for Markov decision processes with Borel spaces: a survey. Ann. Oper. Res. 29 (1991), 29-46. 
DOI 10.1007/bf02055573 | 
MR 1105165 
[28] Jaśkiewicz, A., Nowak, A. S.: 
On the optimality equation for average cost Markov control processes with Feller transition probabilities. J. Math. Anal. Appl. 316 (2006), 495-509. 
DOI 10.1016/j.jmaa.2005.04.065 | 
MR 2206685 
[29] Klein, E., Thompson, A. C.: 
Theory of Correspondences. Wiley, New York 1984. 
MR 0752692 
[31] Lee, J. M., Lee, J. H.: 
Approximate dynamic programming strategies and their applicability for process control: a review and future directions. Int. J. Control Automat. Systems 2 (2004), 263-278.
[33] Montes-de-Oca, R., Lemus-Rodríguez, E.: 
An unbounded Berge's minimum theorem with applications to discounted Markov decision processes. Kybernetika 48 (2012), 268-286. 
MR 2954325 | 
Zbl 1275.90124 
[36] Ortner, R.: 
Pseudometrics for state aggregation in average reward Markov decision processes. In: Algorithmic Learning Theory, LNAI 4754 (M. Hutter, R. A. Servedio and E. Takimoto, eds.), Springer, Berlin, Heidelberg 2007, pp. 373-387. 
DOI 10.1007/978-3-540-75225-7_30 
[41] Powell, W. B., Ma, J.: 
A review of stochastic algorithms with continuous value function approximation and some new approximate policy iteration algorithms for multidimensional continuous applications. J. Control Theory Appl. 9 (2011), 336-352. 
DOI 10.1007/s11768-011-0313-y | 
MR 2834000 
[42] Robles-Alcaraz, M. T., Vega-Amaya, O., Minjárez-Sosa, J. A.: 
Estimate and approximate policy iteration algorithm for discounted Markov decision models with bounded costs and Borel spaces. Risk Decision Anal. 6 (2017), 79-95. 
DOI 10.3233/rda-160116 
[43] Rust, J.: 
Numerical dynamic programming in economics. In: Handbook of Computational Economics, Vol. 1 (H. Amman, D. Kendrick and J. Rust, eds.), North-Holland, Amsterdam 1996, pp. 619-728. 
DOI 10.1016/s1574-0021(96)01016-7 | 
MR 1416619 
[44] Santos, M. S., Vigo-Aguiar, J.: 
Analysis of a numerical dynamic programming algorithm applied to economic models. Econometrica 66 (1998), 409-426. 
DOI 10.2307/2998564 | 
MR 1612175 
[46] Saldi, N., Yüksel, S., Linder, T.: 
Asymptotic optimality of finite approximations to Markov decision processes with Borel spaces. Math. Oper. Res. 42 (2017), 945-978. 
DOI 10.1287/moor.2016.0832 | 
MR 3722422 
[47] Stachurski, J.: 
Continuous state dynamic programming via nonexpansive approximation. Comput. Economics 31 (2008), 141-160. 
DOI 10.1007/s10614-007-9111-5 
[50] Vega-Amaya, O.: 
The average cost optimality equation: a fixed-point approach. Bol. Soc. Mat. Mexicana 9 (2003), 185-195. 
MR 1988598 
[52] Vega-Amaya, O.: 
Solutions of the average cost optimality equation for Markov decision processes with weakly continuous kernel: The fixed-point approach revisited. J. Math. Anal. Appl. 464 (2018), 152-163. 
DOI 10.1016/j.jmaa.2018.03.077 | 
MR 3794081 
[53] Vega-Amaya, O., López-Borbón, J.: 
A perturbation approach to a class of discounted approximate value iteration algorithms with Borel spaces. J. Dyn. Games 3 (2016), 261-278. 
DOI 10.3934/jdg.2016014 | 
MR 3562878 
[54] Vega-Amaya, O., Montes-de-Oca, R.: 
Application of average dynamic programming to inventory systems. Math. Methods Oper. Res. 47 (1998), 451-471. 
DOI 10.1007/bf01198405 | 
MR 1637569