Regularized regression can improve estimates of multivariate selection in the face of multicollinearity and limited data.

Authors

Sztepanacz Jacqueline L, Houle David

Affiliations

Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, ON, Canada.

Department of Biology, Florida State University, Tallahassee, FL, United States.

Publication information

Evol Lett. 2024 Jan 23;8(3):361-373. doi: 10.1093/evlett/qrad064. eCollection 2024 Jun.

DOI: 10.1093/evlett/qrad064
PMID: 39211358
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11358252/
Abstract

The breeder's equation, $\Delta\bar{\mathbf{z}} = \mathbf{G}\boldsymbol{\beta}$, allows us to understand how genetics (the genetic covariance matrix, $\mathbf{G}$) and the vector of linear selection gradients $\boldsymbol{\beta}$ interact to generate evolutionary trajectories. Estimation of $\boldsymbol{\beta}$ using multiple regression of trait values on relative fitness revolutionized the way we study selection in laboratory and wild populations. However, multicollinearity, or correlation of predictors, can lead to very high variances of and covariances between elements of $\boldsymbol{\beta}$, posing a challenge for the interpretation of the parameter estimates. This is particularly relevant in the era of big data, where the number of predictors may approach or exceed the number of observations. A common approach to multicollinear predictors is to discard some of them, thereby losing any information that might be gained from those traits. Using simulations, we show how, on the one hand, multicollinearity can result in inaccurate estimates of selection, and, on the other, how the removal of correlated phenotypes from the analyses can provide a misguided view of the targets of selection. We show that regularized regression, which places data-validated constraints on the magnitudes of individual elements of $\boldsymbol{\beta}$, can produce more accurate estimates of the total strength and direction of multivariate selection in the presence of multicollinearity and limited data, and often has little cost when multicollinearity is low. We also compare standard and regularized regression estimates of selection in a reanalysis of three published case studies, showing that regularized regression can improve fitness predictions in independent data. Our results suggest that regularized regression is a valuable tool that can be used as an important complement to traditional least-squares estimates of selection. In some cases, its use can lead to improved predictions of individual fitness, and improved estimates of the total strength and direction of multivariate selection.
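
The contrast the abstract draws between least-squares and regularized estimates of $\boldsymbol{\beta}$ can be illustrated with a small simulation. What follows is a minimal sketch, not the authors' analysis: it assumes a ridge (L2) penalty with the penalty strength chosen by k-fold cross-validation, whereas the abstract does not say which penalty or software was used. All function names, the trait correlation `rho`, the sample size, and the true gradients are hypothetical values chosen for illustration.

```python
import numpy as np

def ridge_gradients(Z, w, lam):
    """Ridge estimate of selection gradients; lam = 0 gives ordinary least squares."""
    Zc = Z - Z.mean(axis=0)   # centre the traits
    wc = w - w.mean()         # centre fitness so the intercept is not penalised
    p = Z.shape[1]
    return np.linalg.solve(Zc.T @ Zc + lam * np.eye(p), Zc.T @ wc)

def choose_lambda(Z, w, lambdas, k=5, seed=0):
    """Return the penalty that minimises k-fold cross-validated squared error."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(w)), k)
    cv_error = []
    for lam in lambdas:
        err = 0.0
        for test in folds:
            train = np.setdiff1d(np.arange(len(w)), test)
            beta = ridge_gradients(Z[train], w[train], lam)
            # Predict held-out fitness using the training means only.
            pred = w[train].mean() + (Z[test] - Z[train].mean(axis=0)) @ beta
            err += np.sum((w[test] - pred) ** 2)
        cv_error.append(err)
    return lambdas[int(np.argmin(cv_error))]

def simulate(n=60, p=8, rho=0.9, seed=1):
    """Toy data: p equally correlated traits, selection acting on the first two."""
    rng = np.random.default_rng(seed)
    P = np.full((p, p), rho) + (1 - rho) * np.eye(p)   # unit variances, correlation rho
    Z = rng.multivariate_normal(np.zeros(p), P, size=n)
    beta_true = np.zeros(p)
    beta_true[:2] = [0.2, -0.1]
    # Fitness is simulated directly on the relative scale (mean close to 1).
    w = 1.0 + Z @ beta_true + rng.normal(0.0, 0.5, size=n)
    return Z, w, beta_true

Z, w, beta_true = simulate()
lambdas = np.array([0.0, 0.1, 1.0, 10.0, 100.0])
lam_cv = choose_lambda(Z, w, lambdas)
b_ols = ridge_gradients(Z, w, 0.0)
b_ridge = ridge_gradients(Z, w, lam_cv)

print("cross-validated lambda:", lam_cv)
print("OLS error:  ", np.linalg.norm(b_ols - beta_true))
print("ridge error:", np.linalg.norm(b_ridge - beta_true))
```

Whether the ridge estimate ends up closer to the true gradients depends on the random draw, the trait correlation, and the sample size, so a single run should not be read as evidence either way. Repeating the simulation over many seeds and values of `rho` is closer in spirit to the comparison the abstract describes.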

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24c8/11358252/7180ba3bd94b/qrad064_fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24c8/11358252/712b8157006b/qrad064_fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24c8/11358252/3c92bc339900/qrad064_fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24c8/11358252/96e11bf4debd/qrad064_fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24c8/11358252/cd16bc591dcc/qrad064_fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/24c8/11358252/2debdcd37137/qrad064_fig6.jpg
