
NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer.

Affiliation

OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway.

Publication information

BMC Med Genomics. 2019 May 16;12(1):63. doi: 10.1186/s12920-019-0508-5.

DOI:10.1186/s12920-019-0508-5
PMID:31096972
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6524241/
Abstract

BACKGROUND

The accurate screening of tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially intra-tumor genetic heterogeneity, coupled with sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting across different algorithms, have helped to increase specificity, although at the cost of sensitivity.
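The trade-off of consensus voting can be illustrated with a minimal sketch. The three callers, the variant identifiers, and the truth set below are invented toy data, not results from the paper; the sketch reports sensitivity and precision (used here as a stand-in for specificity, since no set of true negatives is defined) as the required number of agreeing callers rises.

```python
from collections import Counter

def consensus_calls(call_sets, min_votes):
    """Keep variants reported by at least `min_votes` callers."""
    votes = Counter(v for calls in call_sets for v in set(calls))
    return {v for v, n in votes.items() if n >= min_votes}

def sensitivity_and_precision(called, truth):
    """Return (sensitivity, precision) of a call set against a truth set."""
    true_positives = len(called & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(called) if called else 0.0
    return sensitivity, precision

# Hypothetical truth set and caller outputs (toy data only).
truth = {"chr1:100A>T", "chr1:250G>C", "chr2:75C>T", "chr3:410T>G"}
caller_a = {"chr1:100A>T", "chr1:250G>C", "chr2:75C>T", "chr5:9G>A"}
caller_b = {"chr1:100A>T", "chr1:250G>C", "chr3:410T>G", "chr7:33C>A"}
caller_c = {"chr1:100A>T", "chr2:75C>T", "chr5:9G>A", "chr9:8A>G"}
callers = [caller_a, caller_b, caller_c]

# Evaluate 1-of-3 (union), 2-of-3 (majority), and 3-of-3 (intersection).
results = {
    k: sensitivity_and_precision(consensus_calls(callers, k), truth)
    for k in (1, 2, 3)
}
for k, (sens, prec) in results.items():
    print(f"{k}-of-3: sensitivity={sens:.2f} precision={prec:.2f}")
```

On this toy data, requiring all three callers to agree removes every false positive but recovers only one of the four true variants, which is the pattern the background describes.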

METHODS

In light of this, we have developed the NeoMutate framework which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations into three well-characterized Genome in a Bottle (GIAB) reference samples.
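The idea of a learned layer on top of several callers can be sketched as follows. This is not NeoMutate's implementation: the abstract does not name its seven algorithms or its features, so the sketch assumes a single linear perceptron, three hypothetical callers, and one illustrative sequence feature (variant allele fraction), all trained on made-up candidates.

```python
# Each candidate: (called_by_a, called_by_b, called_by_c, variant_allele_fraction)
# Labels: +1 = true somatic mutation, -1 = artifact. All values are made up.
TRAINING = [
    ((1, 1, 1, 0.40), +1),
    ((1, 1, 0, 0.35), +1),
    ((0, 1, 1, 0.30), +1),
    ((1, 0, 1, 0.25), +1),
    ((0, 0, 1, 0.30), +1),  # only one caller, but well supported
    ((1, 0, 0, 0.05), -1),
    ((0, 1, 0, 0.04), -1),
    ((0, 0, 1, 0.06), -1),
    ((1, 1, 0, 0.03), -1),  # two callers agree on a low-fraction artifact
]

def score(weights, bias, features):
    """Linear score; a positive value means 'predicted somatic'."""
    return sum(w * x for w, x in zip(weights, features)) + bias

def train_perceptron(examples, max_epochs=10_000):
    """Classic perceptron: update on each mistake, stop after a clean epoch."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for features, label in examples:
            if label * score(weights, bias, features) <= 0:
                weights = [w + label * x for w, x in zip(weights, features)]
                bias += label
                mistakes += 1
        if mistakes == 0:
            break
    return weights, bias

def majority_vote(features, min_votes=2):
    """Baseline: call a variant when at least `min_votes` callers report it."""
    return 1 if sum(features[:3]) >= min_votes else -1

weights, bias = train_perceptron(TRAINING)
ensemble_correct = sum(
    (1 if score(weights, bias, f) > 0 else -1) == y for f, y in TRAINING
)
voting_correct = sum(majority_vote(f) == y for f, y in TRAINING)
print(f"ensemble: {ensemble_correct}/9, 2-of-3 voting: {voting_correct}/9")
```

On this toy set, 2-of-3 voting misclassifies two candidates (a well-supported variant seen by one caller and a low-fraction artifact seen by two), whereas the learned layer can separate all nine because it also weighs the extra feature. These are training-set counts on invented data and say nothing about real-world accuracy.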

RESULTS

A robust and exhaustive evaluation of NeoMutate's performance, based on 5-fold cross-validation experiments in addition to 3 independent tests, demonstrated substantially improved variant detection accuracy compared with any of its individual constituent variant callers and with consensus calling of multiple tools.
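For readers unfamiliar with 5-fold cross-validation, the partitioning step can be sketched as below. The function name `k_fold_indices`, the seed, and the sample count of 23 are illustrative assumptions; the abstract does not describe how the authors built their folds.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Samples are shuffled once, then dealt round-robin into k folds so that
    fold sizes differ by at most one. Each fold serves as the test set once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx

# Example: 23 hypothetical labelled variants split into 5 folds.
splits = list(k_fold_indices(23, k=5))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: train={len(train_idx)} test={len(test_idx)}")
```

A model would be trained on each `train_idx` and scored on the matching `test_idx`, with the five scores averaged; every sample is held out exactly once.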

CONCLUSIONS

We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/590087316676/12920_2019_508_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/394ad3fe9a50/12920_2019_508_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/aa84a2457548/12920_2019_508_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/89f9f1445ae2/12920_2019_508_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/fc76058d0c1d/12920_2019_508_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/89a62fa8b79a/12920_2019_508_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/908aaddb77c1/12920_2019_508_Fig7_HTML.jpg
