The risk probability optimal problem for infinite discounted semi-Markov decision processes

Wen, Xian; Cui, Jinhua; Huo, Haifeng

Article

Title:	The risk probability optimal problem for infinite discounted semi-Markov decision processes (English)
Author:	Wen, Xian
Author:	Cui, Jinhua
Author:	Huo, Haifeng
Language:	English
Journal:	Kybernetika
ISSN:	0023-5954 (print)
ISSN:	1805-949X (online)
Volume:	61
Issue:	4
Year:	2025
Pages:	447-466
Summary lang:	English
Category:	math
Summary:	This paper investigates the risk probability minimization problem for infinite-horizon semi-Markov decision processes (SMDPs) with varying discount factors. First, we establish a standard regularity condition that guarantees that the state process is non-explosive. Furthermore, based only on the non-explosion of the state process, we use the value iteration technique to establish the optimality equation satisfied by the value function, and we prove the uniqueness of its solution and the existence of a risk probability optimal policy. Our condition is weaker than the first arrival condition commonly used in the existing literature. Finally, we develop a value iteration algorithm to compute the value function and an optimal policy, and we illustrate the feasibility and effectiveness of the algorithm with a numerical example. (English)
Keyword:	risk probability criterion
Keyword:	semi-Markov decision processes
Keyword:	value function
Keyword:	optimal policy
Keyword:	value iteration algorithm
MSC:	60E20
MSC:	90C40
DOI:	10.14736/kyb-2025-4-0447
Date available:	2025-09-19T12:39:14Z
Last updated:	2025-09-19
Stable URL:	http://hdl.handle.net/10338.dmlcz/153068
Reference:	[1] Bertsekas, D., Shreve, S. E.: Stochastic Optimal Control: The Discrete-Time Case. Academic Press Inc, New York 1996.
Reference:	[2] Feinberg, E. A.: Continuous time discounted jump Markov decision processes: a discrete-event approach. Math. Oper. Res. 29 (2004), 492-524.
Reference:	[3] Bäuerle, N., Rieder, U.: Markov Decision Processes with Applications to Finance. Springer, Heidelberg 2011. Zbl 1236.90004
Reference:	[4] Guo, X. P., Piunovskiy, A.: Discounted continuous-time Markov decision processes with constraints: unbounded transition and loss rates. Math. Oper. Res. 36 (2011), 105-132.
Reference:	[5] Guo, X. P., Hernández-Lerma, O.: Continuous-time Markov decision processes: theory and applications. Springer-Verlag, Berlin 2009.
Reference:	[6] Guo, X. P., Song, X. Y., Zhang, Y.: First passage optimality for continuous-time Markov decision processes with varying discount factors and history-dependent policies. IEEE Trans. Automat. Control 59 (2013), 163-174.
Reference:	[7] Hernández-Lerma, O., Lasserre, J. B.: Discrete-time Markov control processes: basic optimality criteria. Springer-Verlag, New York 1996.
Reference:	[8] Huang, Y. H., Guo, X. P.: Optimal risk probability for first passage models in semi-Markov decision processes. J. Math. Anal. Appl. 359 (2009), 404-420.
Reference:	[9] Huang, Y. H., Guo, X. P.: Finite horizon semi-Markov decision processes with application to maintenance systems. European J. Oper. Res. 212 (2011), 131-140.
Reference:	[10] Huang, Y. H., Guo, X. P.: First passage models for denumerable semi-Markov decision processes with nonnegative discounted costs. Acta Math. Appl. Sinica 27 (2011), 177-190. Zbl 1235.90177
Reference:	[11] Huang, Y. H., Guo, X. P., Li, Z. F.: Minimum risk probability for finite horizon semi-Markov decision processes. J. Math. Anal. Appl. 402 (2013), 378-391.
Reference:	[12] Huang, X. X., Zuo, X. L., Guo, X. P.: A minimization problem of the risk probability in first passage semi-Markov decision processes with loss rates. Sci. China Math. 58 (2015), 1923-1938.
Reference:	[13] Huo, H. F., Zuo, X. L., Guo, X. P.: The risk probability criterion for discounted continuous-time Markov decision processes. Discrete Event Dyn. S. 27 (2017), 675-699.
Reference:	[14] Huo, H. F., Guo, X. P.: Risk probability minimization problems for continuous-time Markov decision processes on finite horizon. IEEE Trans. Autom. Control 65 (2019), 3199-3206.
Reference:	[15] Janssen, J., Manca, R.: Semi-Markov Risk Models for Finance, Insurance, and Reliability. Springer, New York 2006.
Reference:	[16] Lin, Y. L., Tomkins, R. J., Wang, C. L.: Optimal models for the first arrival time distribution function in continuous time with a special case. Acta Math. Appl. Sinica 10 (1994), 194-212. DOI 10.1007/BF02006119
Reference:	[17] Mamer, J. W.: Successive approximations for finite horizon, semi-Markov decision processes with application to asset liquidation. Oper. Res. 34 (1986), 638-644.
Reference:	[18] Sakaguchi, M., Ohtsubo, Y.: Optimal threshold probability and expectation in semi-Markov decision processes. Appl. Math. Comput. 216 (2010), 2947-2958.
Reference:	[19] Sobel, M. J.: The variance of discounted Markov decision processes. J. Appl. Probab. 19 (1982), 794-802. Zbl 0503.90091
Reference:	[20] Nollau, V.: Solution of a discounted semi-Markovian decision problem by successive overrelaxation. Optimization 39 (1997), 85-97.
Reference:	[21] Ohtsubo, Y.: Optimal threshold probability in undiscounted Markov decision processes with a target set. Appl. Math. Anal. Comp. 149 (2004), 519-532. MR 2033087
Reference:	[22] Piunovskiy, A., Zhang, Y., Shiryaev, A. N.: Continuous-Time Markov Decision Processes: Borel Space Models and General Control Strategies. Springer, Berlin 2020.
Reference:	[23] White, D. J.: Minimizing a threshold probability in discounted Markov decision processes. J. Math. Anal. Appl. 173 (1993), 634-646.
Reference:	[24] Wen, X., Huo, H. F., Guo, X. P.: First passage risk probability minimization for piecewise deterministic Markov decision processes. Acta Math. Appl. Sin. Engl. Ser. 38 (2022), 549-567.
Reference:	[25] Wu, C., Lin, Y.: Minimizing risk models in Markov decision processes with policies depending on target values. J. Math. Anal. Appl. 231 (1999), 47-67.
Reference:	[26] Wu, X., Guo, X. P.: First passage optimality and variance minimisation of Markov decision processes with varying discount factors. J. Appl. Prob. 52 (2015), 441-456. Zbl 1327.90374

Files

Files	Size	Format
Kybernetika_61-2025-4_1.pdf	531.8Kb	application/pdf
