Mantis-ml: Disease-Agnostic Gene Prioritization from High-Throughput Genomic Screens by Stochastic Semi-supervised Learning.

Affiliation

Centre for Genomics Research, Discovery Sciences, BioPharmaceuticals R&D, AstraZeneca, 1 Francis Crick Avenue, CB2 0RE Cambridge, UK.

Publication information

Am J Hum Genet. 2020 May 7;106(5):659-678. doi: 10.1016/j.ajhg.2020.03.012.

DOI:10.1016/j.ajhg.2020.03.012
PMID:32386536
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7212270/
Abstract

Access to large-scale genomics datasets has increased the utility of hypothesis-free genome-wide analyses. However, gene signals are often insufficiently powered to reach experiment-wide significance, triggering a process of laborious triaging of genomic-association-study results. We introduce mantis-ml, a multi-dimensional, multi-step machine-learning framework that allows objective assessment of the biological relevance of genes to disease studies. Mantis-ml is an automated machine-learning framework that follows a multi-model approach of stochastic semi-supervised learning to rank disease-associated genes through iterative learning sessions on random balanced datasets across the protein-coding exome. When applied to a range of human diseases, including chronic kidney disease (CKD), epilepsy, and amyotrophic lateral sclerosis (ALS), mantis-ml achieved an average area under curve (AUC) prediction performance of 0.81-0.89. Critically, to prove its value as a tool that can be used to interpret exome-wide association studies, we overlapped mantis-ml predictions with data from published cohort-level association studies. We found a statistically significant enrichment of high mantis-ml predictions among the highest-ranked genes from hypothesis-free cohort-level statistics, indicating a substantial improvement over the performance of current state-of-the-art methods and pointing to the capture of true prioritization signals for disease-associated genes. Finally, we introduce a generic mantis-ml score (GMS) trained with over 1,200 features as a generic-disease-likelihood estimator, outperforming published gene-level scores. In addition to our tool, we provide a gene prioritization atlas that includes mantis-ml's predictions across ten disease areas and empowers researchers to interactively navigate through the gene-triaging framework. 
Mantis-ml is an intuitive tool that supports the objective triaging of large-scale genomic discovery studies and enhances our understanding of complex genotype-phenotype associations.
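
The abstract describes ranking genes through repeated learning sessions on random balanced datasets. The sketch below illustrates that general idea only and is not mantis-ml's implementation. It assumes a positive-unlabeled setup: each iteration pairs all known disease genes with an equal-sized random draw of unlabeled genes, collects out-of-fold probabilities, and averages them per gene. The function names, the two-feature toy data, and the plain logistic-regression classifier are all hypothetical choices made so that the example runs with the standard library alone.

```python
import math
import random
from collections import defaultdict
from statistics import mean


def predict_proba(weights, bias, x):
    """Logistic probability for one feature vector (z is clamped for stability)."""
    z = bias + sum(w * xj for w, xj in zip(weights, x))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, epochs=200, lr=0.1):
    """Fit a plain logistic regression by batch gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(weights, bias, x) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def stochastic_pu_scores(features, known_genes, iterations=30, folds=5, seed=0):
    """Average out-of-fold probabilities over many random balanced datasets."""
    rng = random.Random(seed)
    known = sorted(known_genes)
    known_set = set(known)
    unlabeled = sorted(g for g in features if g not in known_set)
    collected = defaultdict(list)
    for _ in range(iterations):
        # Balanced dataset: every known gene plus an equal-sized unlabeled draw.
        pos = known[:]
        rng.shuffle(pos)
        neg = rng.sample(unlabeled, len(known))
        labels = {g: 1 for g in pos}
        labels.update({g: 0 for g in neg})
        for k in range(folds):
            # Stratified fold: hold out a slice of each class, train on the rest.
            test = pos[k::folds] + neg[k::folds]
            held_out = set(test)
            train = [g for g in pos + neg if g not in held_out]
            w, b = train_logistic(
                [features[g] for g in train], [labels[g] for g in train]
            )
            for g in test:
                collected[g].append(predict_proba(w, b, features[g]))
    # Genes never drawn in any iteration receive no score.
    return {g: mean(v) for g, v in collected.items()}


def make_toy_genes(seed=1):
    """Hypothetical data: 13 disease-like genes (3 left unlabeled), 17 background."""
    rng = random.Random(seed)

    def point(center):
        return [center + rng.uniform(-0.5, 0.5) for _ in range(2)]

    features = {f"D{i}": point(2.0) for i in range(13)}
    features.update({f"B{i}": point(0.0) for i in range(17)})
    known = [f"D{i}" for i in range(10)]
    hidden = [f"D{i}" for i in range(10, 13)]
    return features, known, hidden


if __name__ == "__main__":
    features, known, hidden = make_toy_genes()
    scores = stochastic_pu_scores(features, known)
    ranked = sorted(
        (g for g in scores if g not in set(known)),
        key=scores.get,
        reverse=True,
    )
    print("Top unlabeled genes:", ranked[:5])
```

Because every score comes from a model that did not see that gene during training, unlabeled genes that resemble the known disease genes can rise in the ranking even though they were treated as negatives whenever they were sampled. In a real setting the classifier, feature set, and number of iterations would all differ from this toy configuration.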

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e633/7212270/20a9adee3233/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e633/7212270/71a8a4d2d36a/gr2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e633/7212270/af3544be776b/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e633/7212270/6cd54f89acbe/gr4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e633/7212270/573a8b5b4049/gr5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e633/7212270/345a8850b115/gr6.jpg


