Ridge regression and its applications in genetic studies.

Affiliations

Department of Statistics, Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad, Iran.

Department of Statistics, Faculty of Mathematics, Statistics and Computer Sciences, Semnan University, Semnan, Iran.

Publication information

PLoS One. 2021 Apr 8;16(4):e0245376. doi: 10.1371/journal.pone.0245376. eCollection 2021.

PMID: 33831027

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8031387/

Abstract

Advances in technology have made the analysis of large-scale gene expression data feasible, and such analysis has become very popular in the era of machine learning. This paper develops an improved ridge approach for genome regression modeling. When multicollinearity exists in a data set that also contains outliers, we consider a robust ridge estimator, namely the rank ridge regression estimator, for parameter estimation and prediction. The efficiency of the rank ridge regression estimator, however, depends heavily on the ridge parameter, and in general it is difficult to give a satisfactory answer about how to select that parameter. Because generalized cross validation (GCV) has good properties and is simple, we use it to choose the optimum value of the ridge parameter. The GCV function balances the precision of the estimators against the bias caused by ridge estimation. It behaves like an improved estimator of risk and can be used in high-dimensional problems, where the number of explanatory variables is larger than the sample size. Finally, some numerical illustrations are given to support our findings.
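To make the role of GCV concrete, the following is a minimal sketch of choosing a ridge parameter by minimizing a GCV score. It assumes the ordinary least-squares ridge estimator, not the rank ridge regression estimator studied in the paper, and it uses the textbook form GCV(k) = (RSS(k)/n) / (1 - tr(H(k))/n)^2 with H(k) = X(X'X + kI)^{-1}X'. The function names `gcv_curve` and `ridge_fit`, the simulated data, and the candidate grid are illustrative assumptions, not taken from the source.

```python
import numpy as np

def gcv_curve(X, y, ks):
    """Return the GCV score for each candidate ridge parameter k > 0."""
    n = X.shape[0]
    # Thin SVD: X = U diag(s) Vt; valid whether p < n or p > n.
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    Uty = U.T @ y
    scores = []
    for k in ks:
        d = s**2 / (s**2 + k)            # eigenvalues of the hat matrix H(k)
        fitted = U @ (d * Uty)           # H(k) y
        rss = np.sum((y - fitted) ** 2)  # residual sum of squares
        edf = d.sum()                    # trace of H(k)
        scores.append((rss / n) / (1.0 - edf / n) ** 2)
    return np.array(scores)

def ridge_fit(X, y, k):
    """Ridge coefficients (X'X + kI)^{-1} X'y computed through the SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ ((s / (s**2 + k)) * (U.T @ y))

# Simulated high-dimensional example (p > n); all values are hypothetical.
rng = np.random.default_rng(0)
n, p = 40, 100
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 2.0
y = X @ beta + rng.standard_normal(n)

ks = np.logspace(-2, 3, 50)              # candidate ridge parameters
gcv = gcv_curve(X, y, ks)
k_best = ks[np.argmin(gcv)]              # GCV-selected ridge parameter
beta_hat = ridge_fit(X, y, k_best)
```

Because the score depends on the data only through the singular values of X and the projections U'y, a whole grid of candidate values can be scanned after a single decomposition, which is one reason the criterion stays cheap when the number of explanatory variables exceeds the sample size.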

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ff76/8031387/6a9f66807aa1/pone.0245376.g001.jpg

Similar articles

A new robust ridge parameter estimator based on search method for linear regression model.

J Appl Stat. 2020 Aug 7;48(13-15):2457-2472. doi: 10.1080/02664763.2020.1803814. eCollection 2021.

New ridge parameter estimators for the quasi-Poisson ridge regression model.

Sci Rep. 2024 Apr 11;14(1):8489. doi: 10.1038/s41598-023-50085-5.

Two-Parameter Modified Ridge-Type M-Estimator for Linear Regression Model.

ScientificWorldJournal. 2020 May 15;2020:3192852. doi: 10.1155/2020/3192852. eCollection 2020.

Almost unbiased modified ridge-type estimator: An application to tourism sector data in Egypt.

Heliyon. 2022 Sep 22;8(9):e10684. doi: 10.1016/j.heliyon.2022.e10684. eCollection 2022 Sep.

Bootstrap-quantile ridge estimator for linear regression with applications.

PLoS One. 2024 Apr 29;19(4):e0302221. doi: 10.1371/journal.pone.0302221. eCollection 2024.

Unbiased K-L estimator for the linear regression model.

F1000Res. 2021 Aug 19;10:832. doi: 10.12688/f1000research.54990.1. eCollection 2021.

Robust regression: Testing global hypotheses about the slopes when there is multicollinearity or heteroscedasticity.

Br J Math Stat Psychol. 2019 May;72(2):355-369. doi: 10.1111/bmsp.12152. Epub 2018 Nov 23.

Some Shrinkage estimators based on median ranked set sampling.

J Appl Stat. 2021 Mar 16;48(13-15):2473-2498. doi: 10.1080/02664763.2021.1895088. eCollection 2021.

A new linearized ridge Poisson estimator in the presence of multicollinearity.

J Appl Stat. 2021 Feb 16;49(8):2016-2034. doi: 10.1080/02664763.2021.1887103. eCollection 2022.

Cited by

Integrating bioinformatics analysis, machine learning, and experimental validation to identify pyroptosis-related genes in the diagnosis of sepsis combined with acute liver failure.

Hereditas. 2025 Aug 8;162(1):153. doi: 10.1186/s41065-025-00522-4.

Artificial Intelligence Models in Diagnosis and Treatment of Kidney Diseases: Current Status and Prospects.

Kidney Dis (Basel). 2025 Jun 12;11(1):491-507. doi: 10.1159/000546397. eCollection 2025 Jan-Dec.

Machine learning-based integration develops relapse related signature for predicting prognosis and indicating immune microenvironment infiltration in breast cancer.

Sci Rep. 2025 Jun 5;15(1):19773. doi: 10.1038/s41598-025-03423-8.

Machine Learning Enabled Multidimensional Data Utilization Through Multi-Resonance Architecture: A Pathway to Enhanced Accuracy in Biosensing.

ACS Omega. 2025 May 15;10(20):20713-20722. doi: 10.1021/acsomega.5c01700. eCollection 2025 May 27.

Advancing personalized, predictive, and preventive medicine in bladder cancer: a multi-omics and machine learning approach for novel prognostic modeling, immune profiling, and therapeutic target discovery.

Front Immunol. 2025 Apr 22;16:1572034. doi: 10.3389/fimmu.2025.1572034. eCollection 2025.

Early prediction of 30-day mortality in patients with surgical wound infections following cardiothoracic surgery: Development and validation of the SWICS-30 score utilizing conventional logistic regression and artificial neural network.

Braz J Infect Dis. 2025 Mar-Apr;29(2):104510. doi: 10.1016/j.bjid.2025.104510. Epub 2025 Feb 21.

The role of epigenetic regulation in pancreatic ductal adenocarcinoma progression and drug response: an integrative genomic and pharmacological prognostic prediction model.

Front Pharmacol. 2024 Nov 21;15:1498031. doi: 10.3389/fphar.2024.1498031. eCollection 2024.

Harmonizing two measures of adaptive functioning using computational approaches: prediction of vineland adaptive behavior scales II (VABS-II) from the adaptive behavior assessment system II (ABAS-II) scores.

Mol Autism. 2024 Dec 3;15(1):51. doi: 10.1186/s13229-024-00630-4.

Identifying biological markers and sociodemographic factors that influence the gap between phenotypic and chronological ages.

Inform Health Soc Care. 2024 Oct;49(3-4):162-176. doi: 10.1080/17538157.2024.2400247. Epub 2024 Sep 24.

A novel prognostic signature related to programmed cell death in osteosarcoma.

Front Immunol. 2024 Jul 1;15:1427661. doi: 10.3389/fimmu.2024.1427661. eCollection 2024.
