High-dimensional sparse vine copula regression with application to genomic prediction.

Affiliations

Department of Mathematics, Technical University of Munich, Boltzmannstraße 3, 85748 Garching, Germany.

Delft Institute of Applied Mathematics, Delft University of Technology, Mekelweg 4, 2628 CD, Delft, The Netherlands.

Publication information

Biometrics. 2024 Jan 29;80(1). doi: 10.1093/biomtc/ujad042.

Abstract

High-dimensional data sets are often available in genome-enabled predictions. Such data sets include nonlinear relationships with complex dependence structures. For such situations, vine copula-based (quantile) regression is an important tool. However, current vine copula-based regression approaches do not scale up to high and ultra-high dimensions. To perform high-dimensional sparse vine copula-based regression, we propose two methods. First, we show their superiority in computational complexity over the existing methods. Second, we define relevant, irrelevant, and redundant explanatory variables for quantile regression. Then, we show our methods' power in selecting relevant variables and their prediction accuracy in high-dimensional sparse data sets via simulation studies. Next, we apply the proposed methods to high-dimensional real data, aiming at the genomic prediction of maize traits. Some data processing and feature extraction steps for the real data are further discussed. Finally, we show the advantage of our methods over linear models and quantile regression forests in simulation studies and real data applications.
