
Article

Title: Ridge estimation of covariance matrix from data in two classes (English)
Author: Zhou, Yi
Author: Zhang, Bin
Language: English
Journal: Applications of Mathematics
ISSN: 0862-7940 (print)
ISSN: 1572-9109 (online)
Volume: 69
Issue: 2
Year: 2024
Pages: 169-184
Summary lang: English
Category: math
Summary: This paper deals with the problem of estimating a covariance matrix from data in two classes: (1) good data with the covariance matrix of interest and (2) contamination coming from a Gaussian distribution with a different covariance matrix. A ridge penalty is introduced to address the high-dimensional challenges of estimating the covariance matrix under this two-class data model. The resulting ridge estimator of the covariance matrix has a uniform expression and remains positive definite whether the sample size is larger or smaller than the data dimension. Furthermore, the ridge parameter is tuned through a cross-validation procedure. Lastly, the proposed ridge estimator is shown to perform better than the existing estimator based on the two-class data and the traditional ridge estimator based only on the good data. (English)
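The summary sketches a general recipe: a ridge-penalized covariance estimate that stays positive definite in high dimensions, with the ridge parameter tuned by cross-validation. As a rough illustration of that recipe only (not the authors' two-class estimator, whose exact form involves the contamination covariance), the following Python sketch implements a generic ridge-shrinkage estimate S + lam*I and selects lam by K-fold cross-validation on held-out Gaussian log-likelihood; all function names here are hypothetical.

```python
import numpy as np

def ridge_covariance(X, lam):
    """Generic ridge-shrinkage covariance estimate S + lam * I.

    Adding lam * I (lam > 0) keeps the estimate positive definite
    even when the sample size n is smaller than the dimension p.
    """
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                     # sample covariance (MLE scaling)
    return S + lam * np.eye(p)

def gaussian_neg_loglik(Sigma, X):
    """Average Gaussian negative log-likelihood of the rows of X."""
    p = Sigma.shape[0]
    Xc = X - X.mean(axis=0)
    _, logdet = np.linalg.slogdet(Sigma)
    # x_i^T Sigma^{-1} x_i for each centered row, via a linear solve
    quad = np.mean(np.sum(Xc * np.linalg.solve(Sigma, Xc.T).T, axis=1))
    return 0.5 * (p * np.log(2.0 * np.pi) + logdet + quad)

def cv_ridge_parameter(X, lams, k=5, seed=0):
    """Choose lam from the grid `lams` by k-fold cross-validation,
    scoring each candidate by held-out negative log-likelihood."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(X.shape[0]), k)
    scores = []
    for lam in lams:
        loss = 0.0
        for fold in folds:
            train = np.setdiff1d(np.arange(X.shape[0]), fold)
            loss += gaussian_neg_loglik(ridge_covariance(X[train], lam), X[fold])
        scores.append(loss / k)
    return lams[int(np.argmin(scores))]

# Example with n < p, where the plain sample covariance is singular:
rng = np.random.default_rng(1)
X = rng.standard_normal((40, 60))                  # n = 40, p = 60
lam = cv_ridge_parameter(X, np.logspace(-2, 1, 10))
Sigma_hat = ridge_covariance(X, lam)
assert np.all(np.linalg.eigvalsh(Sigma_hat) > 0)   # positive definite
```

The sketch shows only the generic mechanics (shrinkage toward the identity plus cross-validated tuning); the paper's estimator additionally exploits the second, contaminated class.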
Keyword: covariance matrix
Keyword: ridge estimation
Keyword: two-class data
Keyword: contamination
MSC: 62H12
MSC: 62J07
DOI: 10.21136/AM.2024.0157-23
Date available: 2024-04-04T12:07:18Z
Last updated: 2024-04-04
Stable URL: http://hdl.handle.net/10338.dmlcz/152311
Reference: [1] Ahsanullah, M., Nevzorov, V. B.: Generalized spacings of order statistics from extended sample. J. Stat. Plann. Inference 85 (2000), 75-83. Zbl 0968.62017, MR 1759240, 10.1016/S0378-3758(99)00067-1
Reference: [2] Besson, O.: Maximum likelihood covariance matrix estimation from two possibly mismatched data sets. Signal Process. 167 (2020), Article ID 107285, 9 pages. 10.1016/j.sigpro.2019.107285
Reference: [3] Bhatia, R.: Positive Definite Matrices. Princeton Series in Applied Mathematics. Princeton University Press, Princeton (2007). Zbl 1133.15017, MR 3443454, 10.1515/9781400827787
Reference: [4] Bien, J., Tibshirani, R. J.: Sparse estimation of a covariance matrix. Biometrika 98 (2011), 807-820. Zbl 1228.62063, MR 2860325, 10.1093/biomet/asr054
Reference: [5] Bodnar, O., Bodnar, T., Parolya, N.: Recent advances in shrinkage-based high-dimensional inference. J. Multivariate Anal. 188 (2022), Article ID 104826, 13 pages. Zbl 1493.62298, MR 4353848, 10.1016/j.jmva.2021.104826
Reference: [6] Cho, S., Katayama, S., Lim, J., Choi, Y.-G.: Positive-definite modification of a covariance matrix by minimizing the matrix $\ell_{\infty}$ norm with applications to portfolio optimization. AStA, Adv. Stat. Anal. 105 (2021), 601-627. Zbl 1478.62118, MR 4340896, 10.1007/s10182-021-00396-7
Reference: [7] Danaher, P., Wang, P., Witten, D. M.: The joint graphical lasso for inverse covariance estimation across multiple classes. J. R. Stat. Soc., Ser. B, Stat. Methodol. 76 (2014), 373-397. Zbl 07555455, MR 3164871, 10.1111/rssb.12033
Reference: [8] Fisher, T. J., Sun, X.: Improved Stein-type shrinkage estimators for the high-dimensional multivariate normal covariance matrix. Comput. Stat. Data Anal. 55 (2011), 1909-1918. Zbl 1328.62336, MR 2765053, 10.1016/j.csda.2010.12.006
Reference: [9] Götze, F., Tikhomirov, A.: Rate of convergence in probability to the Marchenko-Pastur law. Bernoulli 10 (2004), 503-548. Zbl 1049.60018, MR 2061442, 10.3150/bj/1089206408
Reference: [10] Hannart, A., Naveau, P.: Estimating high dimensional covariance matrices: A new look at the Gaussian conjugate framework. J. Multivariate Anal. 131 (2014), 149-162. Zbl 1306.62120, MR 3252641, 10.1016/j.jmva.2014.06.001
Reference: [11] Hoshino, N., Takemura, A.: On reduction of finite-sample variance by extended Latin hypercube sampling. Bernoulli 6 (2000), 1035-1050. Zbl 0979.65005, MR 1809734, 10.2307/3318470
Reference: [12] Huang, C., Farewell, D., Pan, J.: A calibration method for non-positive definite covariance matrix in multivariate data analysis. J. Multivariate Anal. 157 (2017), 45-52. Zbl 1362.62136, MR 3641735, 10.1016/j.jmva.2017.03.001
Reference: [13] Huang, J. Z., Liu, N., Pourahmadi, M., Liu, L.: Covariance matrix selection and estimation via penalised normal likelihood. Biometrika 93 (2006), 85-98. Zbl 1152.62346, MR 2277742, 10.1093/biomet/93.1.85
Reference: [14] Jia, S., Zhang, C., Lu, H.: Covariance function versus covariance matrix estimation in efficient semi-parametric regression for longitudinal data analysis. J. Multivariate Anal. 187 (2022), Article ID 104900, 14 pages. Zbl 1480.62098, MR 4339021, 10.1016/j.jmva.2021.104900
Reference: [15] Kalina, J., Tebbens, J. D.: Algorithms for regularized linear discriminant analysis. Proceedings of the International Conference on Bioinformatics Models, Methods and Algorithms. Scitepress, Setúbal (2015), 128-133. 10.5220/0005234901280133
Reference: [16] Kochan, N., Tütüncü, G. Y., Giner, G.: A new local covariance matrix estimation for the classification of gene expression profiles in high dimensional RNA-Seq data. Expert Systems Appl. 167 (2021), Article ID 114200, 5 pages. 10.1016/j.eswa.2020.114200
Reference: [17] Le, C. M., Levin, K., Bickel, P. J., Levina, E.: Comment: Ridge regression and regularization of large matrices. Technometrics 62 (2020), 443-446. MR 4165992, 10.1080/00401706.2020.1796815
Reference: [18] Ledoit, O., Wolf, M.: A well-conditioned estimator for large-dimensional covariance matrices. J. Multivariate Anal. 88 (2004), 365-411. Zbl 1032.62050, MR 2026339, 10.1016/S0047-259X(03)00096-4
Reference: [19] Li, C.-N., Ren, P.-W., Guo, Y.-R., Ye, Y.-F., Shao, Y.-H.: Regularized linear discriminant analysis based on generalized capped $\ell_{2,q}$-norm. Ann. Oper. Res. (to appear). 10.1007/s10479-022-04959-y
Reference: [20] Lim, L.-H., Sepulchre, R., Ye, K.: Geometric distance between positive definite matrices of different dimensions. IEEE Trans. Inf. Theory 65 (2019), 5401-5405. Zbl 1432.15033, MR 4009241, 10.1109/TIT.2019.2913874
Reference: [21] Massignan, J. A. D., London, J. B. A., Bessani, M., Maciel, C. D., Fannucchi, R. Z., Miranda, V.: Bayesian inference approach for information fusion in distribution system state estimation. IEEE Trans. Smart Grid 13 (2022), 526-540. 10.1109/TSG.2021.3128053
Reference: [22] Mestre, X.: On the asymptotic behavior of the sample estimates of eigenvalues and eigenvectors of covariance matrices. IEEE Trans. Signal Process. 56 (2008), 5353-5368. Zbl 1391.62092, MR 2472837, 10.1109/TSP.2008.929662
Reference: [23] Raninen, E., Ollila, E.: Coupled regularized sample covariance matrix estimator for multiple classes. IEEE Trans. Signal Process. 69 (2021), 5681-5692. MR 4332948, 10.1109/TSP.2021.3118546
Reference: [24] Raninen, E., Tyler, D. E., Ollila, E.: Linear pooling of sample covariance matrices. IEEE Trans. Signal Process. 70 (2022), 659-672. MR 4381805, 10.1109/TSP.2021.3139207
Reference: [25] Scheidegger, C., Hörrmann, J., Bühlmann, P.: The weighted generalised covariance measure. J. Mach. Learn. Res. 23 (2022), Article ID 273, 68 pages. MR 4577712
Reference: [26] Tsukuma, H., Kubokawa, T.: Unified improvements in estimation of a normal covariance matrix in high and low dimensions. J. Multivariate Anal. 143 (2016), 233-248. Zbl 1328.62348, MR 3431430, 10.1016/j.jmva.2015.09.016
Reference: [27] Wieringen, W. N. van, Peeters, C. F. W.: Ridge estimation of inverse covariance matrices from high-dimensional data. Comput. Stat. Data Anal. 103 (2016), 284-303. Zbl 1466.62204, MR 3522633, 10.1016/j.csda.2016.05.012
Reference: [28] Vershynin, R.: How close is the sample covariance matrix to the actual covariance matrix? J. Theor. Probab. 25 (2012), 655-686. Zbl 1365.62208, MR 2956207, 10.1007/s10959-010-0338-z
Reference: [29] Wang, H., Peng, B., Li, D., Leng, C.: Nonparametric estimation of large covariance matrices with conditional sparsity. J. Econom. 223 (2021), 53-72. Zbl 1471.62378, MR 4252147, 10.1016/j.jeconom.2020.09.002
Reference: [30] Warton, D. I.: Penalized normal likelihood and ridge regularization of correlation and covariance matrices. J. Am. Stat. Assoc. 103 (2008), 340-349. Zbl 1471.62362, MR 2394637, 10.1198/016214508000000021
Reference: [31] Witten, D. M., Tibshirani, R.: Covariance-regularized regression and classification for high dimensional problems. J. R. Stat. Soc., Ser. B, Stat. Methodol. 71 (2009), 615-636. Zbl 1250.62033, MR 2749910, 10.1111/j.1467-9868.2009.00699.x
Reference: [32] Xi, B., Li, J., Li, Y., Song, R., Hong, D., Chanussot, J.: Few-shot learning with class-covariance metric for hyperspectral image classification. IEEE Trans. Image Process. 31 (2022), 5079-5092. 10.1109/TIP.2022.3192712
Reference: [33] Xue, L., Ma, S., Zou, H.: Positive-definite $\ell_1$-penalized estimation of large covariance matrices. J. Am. Stat. Assoc. 107 (2012), 1480-1491. Zbl 1258.62063, MR 3036409, 10.1080/01621459.2012.725386
Reference: [34] Yang, Y., Zhou, J., Pan, J.: Estimation and optimal structure selection of high-dimensional Toeplitz covariance matrix. J. Multivariate Anal. 184 (2021), Article ID 104739, 17 pages. Zbl 1467.62095, MR 4236460, 10.1016/j.jmva.2021.104739
Reference: [35] Yin, Y.: Spectral statistics of high dimensional sample covariance matrix with unbounded population spectral norm. Bernoulli 28 (2022), 1729-1756. Zbl 07526604, MR 4411509, 10.3150/21-BEJ1391
Reference: [36] Yuasa, R., Kubokawa, T.: Ridge-type linear shrinkage estimation of the mean matrix of a high-dimensional normal distribution. J. Multivariate Anal. 178 (2020), Article ID 104608, 18 pages. Zbl 1440.62036, MR 4079038, 10.1016/j.jmva.2020.104608
Reference: [37] Zhang, H., Jia, J.: Elastic-net regularized high-dimensional negative binomial regression: Consistency and weak signals detection. Stat. Sin. 32 (2022), 181-207. Zbl 07484115, MR 4359629, 10.5705/ss.202019.0315
Reference: [38] Zhang, Y., Zhou, Y., Liu, X.: Applications on linear spectral statistics of high-dimensional sample covariance matrix with divergent spectrum. Comput. Stat. Data Anal. 178 (2023), Article ID 107617, 19 pages. Zbl 07626679, MR 4483317, 10.1016/j.csda.2022.107617

Full text not available (moving wall 24 months)
