NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer.

Affiliation

OncoImmunity AS, Oslo Cancer Cluster, Ullernchausseen 64/66, 0379, Oslo, Norway.

Publication information

BMC Med Genomics. 2019 May 16;12(1):63. doi: 10.1186/s12920-019-0508-5.

Abstract

BACKGROUND

Accurately screening tumor genomic landscapes for somatic mutations using high-throughput sequencing is a crucial step in precise clinical diagnosis and targeted therapy. However, the complex inherent features of cancer tissue, especially intra-tumor genetic heterogeneity, coupled with sequencing and alignment artifacts, make somatic variant calling a challenging task. Current variant filtering strategies, such as rule-based filtering and consensus voting across different algorithms, have helped to increase specificity, although this comes at the cost of sensitivity.
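The sensitivity cost of consensus voting can be illustrated with a toy example. The sketch below is not taken from the paper: the caller names, variant keys, and truth set are invented for illustration, and precision stands in for specificity because true negatives are not enumerated in a call set. It only shows how requiring more callers to agree removes false positives while also discarding true variants.

```python
from collections import Counter

# Hypothetical call sets from three callers; each variant is (chrom, pos, ref, alt).
calls = {
    "caller_a": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 50, "C", "T")},
    "caller_b": {("chr1", 100, "A", "T"), ("chr2", 50, "C", "T"), ("chr3", 10, "T", "G")},
    "caller_c": {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr4", 5, "G", "A")},
}

# Hypothetical truth set, e.g. mutations that were simulated into the sample.
truth = {("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 50, "C", "T")}


def consensus(calls, min_votes):
    """Keep variants reported by at least `min_votes` callers."""
    votes = Counter(variant for call_set in calls.values() for variant in call_set)
    return {variant for variant, n in votes.items() if n >= min_votes}


def sensitivity_precision(predicted, truth):
    """Return (sensitivity, precision) of a predicted call set against the truth set."""
    true_positives = len(predicted & truth)
    sensitivity = true_positives / len(truth) if truth else 0.0
    precision = true_positives / len(predicted) if predicted else 0.0
    return sensitivity, precision


if __name__ == "__main__":
    for min_votes in (1, 2, 3):
        sens, prec = sensitivity_precision(consensus(calls, min_votes), truth)
        print(f"min_votes={min_votes}: sensitivity={sens:.2f}, precision={prec:.2f}")
```

In this toy data, taking the union of all callers keeps every true variant but admits two false positives, while requiring unanimous agreement removes the false positives and also drops two of the three true variants.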

METHODS

In light of this, we have developed the NeoMutate framework, which incorporates 7 supervised machine learning (ML) algorithms to exploit the strengths of multiple variant callers, using a non-redundant set of biological and sequence features. We benchmarked NeoMutate by simulating more than 10,000 bona fide cancer-related mutations in three well-characterized Genome in a Bottle (GIAB) reference samples.
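The abstract does not list the variant callers or the features used, so the following is only a sketch of one plausible preprocessing step for an ensemble layer of this kind: merging the outputs of several callers into one feature vector per candidate variant, which a supervised classifier could then be trained on. The caller names, scores, and the function `build_feature_rows` are hypothetical.

```python
# Hypothetical per-caller outputs: variant key -> caller-specific confidence score.
caller_outputs = {
    "caller_a": {("chr1", 100, "A", "T"): 0.98, ("chr1", 200, "G", "C"): 0.40},
    "caller_b": {("chr1", 100, "A", "T"): 0.91},
    "caller_c": {("chr1", 200, "G", "C"): 0.75, ("chr2", 50, "C", "T"): 0.12},
}


def build_feature_rows(caller_outputs):
    """Merge caller outputs into one feature row per candidate variant.

    Each row holds, for every caller in sorted name order, a pair
    (called_flag, score); a caller that did not report the variant
    contributes (0.0, 0.0).
    """
    callers = sorted(caller_outputs)
    # Candidate set is the union of everything any caller reported.
    candidates = sorted(set().union(*(out.keys() for out in caller_outputs.values())))
    rows = {}
    for variant in candidates:
        row = []
        for name in callers:
            score = caller_outputs[name].get(variant)
            row.append(1.0 if score is not None else 0.0)
            row.append(score if score is not None else 0.0)
        rows[variant] = row
    return callers, rows


if __name__ == "__main__":
    callers, rows = build_feature_rows(caller_outputs)
    print("feature order:", [f"{c}_{kind}" for c in callers for kind in ("called", "score")])
    for variant, row in rows.items():
        print(variant, row)
```

Starting from the union of candidates, rather than the intersection, keeps variants that only one caller found, so that a downstream model rather than a fixed voting rule decides whether to retain them.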

RESULTS

A robust and exhaustive evaluation of NeoMutate's performance, based on 5-fold cross-validation experiments in addition to 3 independent tests, demonstrated substantially improved variant detection accuracy compared with any of its individual constituent variant callers and with consensus calling across multiple tools.
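For readers unfamiliar with the evaluation scheme, 5-fold cross-validation can be sketched as follows. The abstract does not say how the folds were constructed (for example, whether they were stratified by class), so this is a plain shuffled split under that assumption; `k_fold_indices` is an illustrative helper, not part of NeoMutate.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) for each of k folds.

    Every sample lands in exactly one test fold; the remaining
    k - 1 folds form the training set for that round.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    folds = [indices[i::k] for i in range(k)]  # near-equal fold sizes
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


if __name__ == "__main__":
    for round_no, (train_idx, test_idx) in enumerate(k_fold_indices(23, k=5), start=1):
        print(f"fold {round_no}: train={len(train_idx)} test={len(test_idx)}")
```

A model would be fitted on `train_idx` and scored on `test_idx` in each round, with the five scores averaged to estimate performance on unseen variants.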

CONCLUSIONS

We show here that integrating multiple tools in an ensemble ML layer optimizes somatic variant detection rates, leading to a potentially improved variant selection framework for the diagnosis and treatment of cancer.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc72/6524241/590087316676/12920_2019_508_Fig1_HTML.jpg
