Performance and robustness of penalized and unpenalized methods for genetic prediction of complex human disease.

Affiliation

Medical Systems Biology, Departments of Pathology and of Microbiology & Immunology, The University of Melbourne, Parkville, VIC, Australia.

Publication information

Genet Epidemiol. 2013 Feb;37(2):184-95. doi: 10.1002/gepi.21698. Epub 2012 Nov 30.

Abstract

A central goal of medical genetics is to accurately predict complex disease from genotypes. Here, we present a comprehensive analysis of simulated and real data using lasso and elastic-net penalized support-vector machine models, a mixed-effects linear model, a polygenic score, and unpenalized logistic regression. In simulation, the sparse penalized models achieved lower false-positive rates and higher precision than the other methods for detecting causal SNPs. The common practice of prefiltering SNP lists for subsequent penalized modeling was examined and shown to substantially reduce the ability to recover the causal SNPs. Using genome-wide SNP profiles across eight complex diseases within cross-validation, lasso and elastic-net models achieved substantially better predictive ability in celiac disease, type 1 diabetes, and Crohn's disease, and had equivalent predictive ability in the rest, with the results in celiac disease strongly replicating between independent datasets. We investigated the effect of linkage disequilibrium on the predictive models, showing that the penalized methods leverage this information to their advantage, compared with methods that assume SNP independence. Our findings show that sparse penalized approaches are robust across different disease architectures, producing as good as or better phenotype predictions and variance explained. This has fundamental ramifications for the selection and future development of methods to genetically predict human disease.
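
To make the idea of a sparse penalized classifier concrete, the following is a minimal sketch and not the authors' implementation. It fits an L1-penalized (lasso) squared-hinge linear model to simulated genotypes and reports which SNPs receive nonzero weights. Everything specific here is an assumption made for illustration: the function names, the independent SNPs (no linkage disequilibrium), the effect size, the sample sizes, the penalty value, and the use of plain proximal gradient descent as the solver.

```python
import random


def soft_threshold(value, threshold):
    """Proximal operator of the L1 penalty: shrink `value` toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def simulate_genotypes(n_samples, n_snps, causal, effect, rng):
    """Simulate centered allele dosages (-1, 0, 1) and a +1/-1 phenotype.

    Assumption: SNPs are independent and only the `causal` SNPs affect risk.
    """
    X, y = [], []
    for _ in range(n_samples):
        row = [rng.choice((0, 1, 2)) - 1 for _ in range(n_snps)]
        liability = sum(effect * row[j] for j in causal) + rng.gauss(0.0, 1.0)
        X.append(row)
        y.append(1 if liability > 0 else -1)
    return X, y


def fit_l1_squared_hinge(X, y, lam, step=0.1, n_iter=200):
    """Minimize mean squared-hinge loss + lam * ||w||_1 by proximal gradient.

    The intercept `b` is left unpenalized.
    """
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(n_iter):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            slack = 1.0 - margin
            if slack > 0:  # only margin violations contribute to the loss
                coef = -2.0 * slack * yi / n
                grad_b += coef
                for j in range(p):
                    grad_w[j] += coef * xi[j]
        b -= step * grad_b
        # Gradient step on the smooth loss, then soft-threshold for the L1 term.
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


# Hypothetical demo settings: 150 individuals, 30 SNPs, 3 of them causal.
rng = random.Random(0)
causal_snps = [0, 1, 2]
X, y = simulate_genotypes(150, 30, causal_snps, effect=1.0, rng=rng)
w, b = fit_l1_squared_hinge(X, y, lam=0.1)

selected = [j for j, wj in enumerate(w) if wj != 0.0]
hits = [j for j in selected if j in causal_snps]
print("selected SNPs:", selected)
if selected:
    print("precision for causal SNPs:", len(hits) / len(selected))
```

Raising `lam` drives more weights exactly to zero, which is the mechanism behind the sparse models' lower false-positive rates. An elastic-net variant would add a ridge term to the smooth part of the objective, and in practice `lam` would be chosen by cross-validation rather than fixed by hand.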
