CAGI4 Crohn's exome challenge: Marker SNP versus exome variant models for assigning risk of Crohn disease.

Author information

Pal Lipika R, Kundu Kunal, Yin Yizhou, Moult John

Affiliations

Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland.

Computational Biology, Bioinformatics and Genomics, Biological Sciences Graduate Program, University of Maryland, College Park, Maryland.

Publication information

Hum Mutat. 2017 Sep;38(9):1225-1234. doi: 10.1002/humu.23256. Epub 2017 Jun 28.

DOI:10.1002/humu.23256

PMID:28512778

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC5576730/

Abstract

Understanding the basis of complex trait disease is a fundamental problem in human genetics. The CAGI Crohn's Exome challenges are providing insight into the adequacy of current disease models by requiring participants to identify which of a set of individuals has been diagnosed with the disease, given exome data. For the CAGI4 round, we developed a method that used the genotypes from exome sequencing data only to impute the status of genome wide association studies marker SNPs. We then used the imputed genotypes as input to several machine learning methods that had been trained to predict disease status from marker SNP information. We achieved the best performance using Naïve Bayes and with a consensus machine learning method, obtaining an area under the curve of 0.72, larger than other methods used in CAGI4. We also developed a model that incorporated the contribution from rare missense variants in the exome data, but this performed less well. Future progress is expected to come from the use of whole genome data rather than exomes.
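The second stage described in the abstract (predicting disease status from marker SNP genotypes and scoring the result by area under the curve) can be illustrated with a minimal sketch. This is not the authors' implementation: the genotype coding (0/1/2 copies of the risk allele), the Laplace smoothing, the function names, and the toy data are all assumptions made for illustration, and the imputation of marker SNPs from exome data is not shown.

```python
import math

def train_naive_bayes(genotypes, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model on genotypes coded 0, 1 or 2.

    alpha is a Laplace smoothing constant (an assumption of this sketch).
    """
    n_snps = len(genotypes[0])
    classes = sorted(set(labels))
    prior = {c: labels.count(c) / len(labels) for c in classes}
    # counts[c][j][g]: number of class-c individuals with genotype g at SNP j
    counts = {c: [[0, 0, 0] for _ in range(n_snps)] for c in classes}
    for row, c in zip(genotypes, labels):
        for j, g in enumerate(row):
            counts[c][j][g] += 1
    likelihood = {
        c: [[(n + alpha) / (sum(snp) + 3 * alpha) for n in snp]
            for snp in counts[c]]
        for c in classes
    }
    return prior, likelihood

def predict_case_probability(model, row):
    """Posterior probability of class 1 (case) for one individual."""
    prior, likelihood = model
    log_post = {
        c: math.log(prior[c])
        + sum(math.log(likelihood[c][j][g]) for j, g in enumerate(row))
        for c in prior
    }
    top = max(log_post.values())  # subtract the max for numerical stability
    norm = sum(math.exp(v - top) for v in log_post.values())
    return math.exp(log_post[1] - top) / norm

def auc(scores, labels):
    """Area under the ROC curve via pairwise case/control comparison.

    Equals the probability that a random case outscores a random control;
    ties count as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0
               for a in cases for b in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical toy data: rows are individuals, columns are marker SNPs.
train_X = [[2, 1, 0], [2, 2, 1], [1, 2, 0], [0, 0, 1], [0, 1, 2], [1, 0, 2]]
train_y = [1, 1, 1, 0, 0, 0]  # 1 = case, 0 = control
test_X = [[2, 2, 0], [0, 0, 2], [1, 1, 1], [2, 1, 0]]
test_y = [1, 0, 0, 1]

model = train_naive_bayes(train_X, train_y)
test_scores = [predict_case_probability(model, row) for row in test_X]
print("AUC on toy data:", auc(test_scores, test_y))
```

The toy data are constructed to separate cleanly, so the resulting AUC says nothing about real performance. The pairwise AUC is quadratic in sample size, which is acceptable for a challenge-sized cohort but would normally be replaced by a rank-based computation for larger sets.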
