| Title:
|
The risk probability optimal problem for infinite discounted semi-Markov decision processes (English) |
| Author:
|
Wen, Xian |
| Author:
|
Cui, Jinhua |
| Author:
|
Huo, Haifeng |
| Language:
|
English |
| Journal:
|
Kybernetika |
| ISSN:
|
0023-5954 (print) |
| ISSN:
|
1805-949X (online) |
| Volume:
|
61 |
| Issue:
|
4 |
| Year:
|
2025 |
| Pages:
|
447-466 |
| Summary lang:
|
English |
| . |
| Category:
|
math |
| . |
| Summary:
|
This paper investigates the risk probability minimization problem for infinite horizon semi-Markov decision processes (SMDPs) with varying discount factors. First, we establish the standard regularity condition to guarantee that the state process is non-explosive. Then, based only on the non-explosion of the state process, we use the value iteration technique to establish the optimality equation satisfied by the value function, and we prove the uniqueness of its solution and the existence of a risk probability optimal policy. Our condition is weaker than the first arrival condition commonly used in the existing literature. Finally, we develop a value iteration algorithm to compute the value function and an optimal policy, and illustrate the feasibility and effectiveness of the algorithm through a numerical example. (English) |
| Keyword:
|
risk probability criterion |
| Keyword:
|
semi-Markov decision processes |
| Keyword:
|
value function |
| Keyword:
|
optimal policy |
| Keyword:
|
value iteration algorithm |
| MSC:
|
60E20 |
| MSC:
|
90C40 |
| DOI:
|
10.14736/kyb-2025-4-0447 |
| . |
| Date available:
|
2025-09-19T12:39:14Z |
| Last updated:
|
2025-09-19 |
| Stable URL:
|
http://hdl.handle.net/10338.dmlcz/153068 |
| . |
| Reference:
|
[1] Bertsekas, D., Shreve, S. E.: Stochastic Optimal Control: The Discrete-Time Case. Academic Press Inc, New York 1996. |
| Reference:
|
[2] Feinberg, E. A.: Continuous time discounted jump Markov decision processes: a discrete-event approach. Math. Oper. Res. 29 (2004), 492-524. |
| Reference:
|
[3] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Heidelberg 2011. Zbl 1236.90004 |
| Reference:
|
[4] Guo, X. P., Piunovskiy, A.: Discounted continuous-time Markov decision processes with constraints: unbounded transition and loss rates. Math. Oper. Res. 36 (2011), 105-132. |
| Reference:
|
[5] Guo, X. P., Hernández-Lerma, O.: Continuous-time Markov decision processes: theory and applications. Springer-Verlag, Berlin 2009. |
| Reference:
|
[6] Guo, X. P., Song, X. Y., Zhang, Y.: First passage optimality for continuous-time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans. Automat. Control 59 (2013), 163-174. |
| Reference:
|
[7] Hernández-Lerma, O., Lasserre, J. B.: Discrete-time Markov control processes: basic optimality criteria. Springer-Verlag, New York 1996. |
| Reference:
|
[8] Huang, Y. H., Guo, X. P.: Optimal risk probability for first passage models in semi-Markov decision processes. J. Math. Anal. Appl. 359 (2009), 404-420. |
| Reference:
|
[9] Huang, Y. H., Guo, X. P.: Finite horizon semi-Markov decision processes with application to maintenance systems. European J. Oper. Res. 212 (2011), 131-140. |
| Reference:
|
[10] Huang, Y. H., Guo, X. P.: First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs. Acta. Math. Appl. Sinica 27 (2011), 177-190. Zbl 1235.90177 |
| Reference:
|
[11] Huang, Y. H., Guo, X. P., Li, Z. F.: Minimum risk probability for finite horizon semi-Markov decision processes. J. Math. Anal. Appl. 402 (2013), 378-391. |
| Reference:
|
[12] Huang, X. X., Zuo, X. L., Guo, X. P.: A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates. Sci. China Math. 58 (2015), 1923-1938. |
| Reference:
|
[13] Huo, H. F., Zuo, X. L., Guo, X. P.: The risk probability criterion for discounted continuous-time Markov decision processes. Discrete Event Dyn. S. 27 (2017), 675-699. |
| Reference:
|
[14] Huo, H. F., Guo, X. P.: Risk probability minimization problems for continuous-time Markov decision processes on finite horizon. IEEE Trans. Autom. Control 65 (2019), 3199-3206. |
| Reference:
|
[15] Janssen, J., Manca, R.: Semi-Markov Risk Models For Finance, Insurance, and Reliability. Springer, New York 2006. |
| Reference:
|
[16] Lin, Y. L., Tomkins, R. J., Wang, C. L.: Optimal models for the first arrival time distribution function in continuous time with a special case. Acta. Math. Appl. Sinica 10 (1994), 194-212. DOI 10.1007/BF02006119 |
| Reference:
|
[17] Mamer, J. W.: Successive approximations for finite horizon, semi-Markov decision processes with application to asset liquidation. Oper. Res. 34 (1986), 638-644. |
| Reference:
|
[18] Sakaguchi, M., Ohtsubo, Y.: Optimal threshold probability and expectation in semi-Markov decision processes. Appl. Math. Comput. 216 (2010), 2947-2958. |
| Reference:
|
[19] Sobel, M. J.: The variance of discounted Markov decision processes. J. Appl. Probab. 19 (1982), 794-802. Zbl 0503.90091 |
| Reference:
|
[20] Nollau, V.: Solution of a discounted semi-Markovian decision problem by successive overrelaxation. Optimization 39 (1997), 85-97. |
| Reference:
|
[21] Ohtsubo, Y.: Optimal threshold probability in undiscounted Markov decision processes with a target set. Appl. Math. Anal. Comp. 149 (2004), 519-532. MR 2033087 |
| Reference:
|
[22] Piunovskiy, A., Zhang, Y., Shiryaev, A. N.: Continuous-Time Markov Decision Processes: Borel Space Models and General Control Strategies. Springer, Berlin 2020. |
| Reference:
|
[23] White, D. J.: Minimizing a threshold probability in discounted Markov decision processes. J. Math. Anal. Appl. 173 (1993), 634-646. |
| Reference:
|
[24] Wen, X., Huo, H. F., Guo, X. P.: First passage risk probability minimization for piecewise deterministic Markov decision processes. Acta Math. Appl. Sin. Engl. Ser. 38 (2022), 549-567. |
| Reference:
|
[25] Wu, C., Lin, Y.: Minimizing risk models in Markov decision processes with policies depending on target values. J. Math. Anal. Appl. 231 (1999), 47-67. |
| Reference:
|
[26] Wu, X., Guo, X. P.: First passage optimality and variance minimisation of Markov decision processes with varying discount factors. J. Appl. Prob. 52 (2015), 441-456. Zbl 1327.90374 |
| . |