A novel machine learning approach (svmSomatic) to distinguish somatic and germline mutations using next-generation sequencing data.

Affiliations

School of Computer Science and Technology, Xidian University, Xi'an, Shaanxi 710071, China.

Publication information

Zool Res. 2021 Mar 18;42(2):246-249. doi: 10.24272/j.issn.2095-8137.2021.014.

Abstract

Somatic mutations are a large class of genetic variation and play an essential role in tumorigenesis. Detecting somatic single nucleotide variants (SNVs) can facilitate downstream analysis of tumorigenesis. Many computational methods have been developed to detect SNVs, but most require a matched normal sample to distinguish somatic SNVs from the normal state, and such samples can be difficult to obtain. Developing approaches that detect somatic SNVs without matched samples is therefore crucial. In this work, we detected somatic mutations from individual tumor samples using a novel machine learning approach, svmSomatic, applied to next-generation sequencing (NGS) data. Because somatic SNV detection can be affected by other mutations, with germline mutations and co-occurring copy number variations (CNVs) common in organisms, we used the approach to distinguish somatic from germline mutations based on NGS data from individual tumor samples. In summary, svmSomatic: (1) considers the influence of CNV co-occurrence when detecting somatic mutations; and (2) trains a support vector machine to distinguish somatic from germline mutations without requiring matched normal samples. We further tested svmSomatic and compared it with other common methods. On both simulated and real NGS data, svmSomatic performed significantly better than the other methods, as measured by F1-score.
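Two ideas in the abstract can be made concrete with a small sketch. The first function illustrates why a co-occurring CNV matters when separating somatic from germline variants in a tumor-only sample: it computes an expected variant allele frequency (VAF) under a simple purity and copy-number mixture model. That model, and the names `expected_vaf`, `purity`, `tumor_cn`, and `mutant_copies`, are assumptions introduced here for illustration; the abstract does not state which features svmSomatic uses. The second function computes the F1-score, the metric the abstract reports, treating "somatic" as the positive class.

```python
def expected_vaf(purity, tumor_cn, mutant_copies, germline):
    """Expected variant allele frequency at one site in a tumor-only sample.

    Assumed mixture model (illustrative, not taken from svmSomatic):
    tumor cells carry `tumor_cn` total copies of the locus, of which
    `mutant_copies` carry the variant; normal cells are diploid and carry
    one variant copy only when the variant is heterozygous germline.
    """
    normal_fraction = 1.0 - purity
    variant_in_normal = 1.0 if germline else 0.0
    variant_dna = purity * mutant_copies + normal_fraction * variant_in_normal
    total_dna = purity * tumor_cn + normal_fraction * 2.0
    return variant_dna / total_dna


def f1_score(true_labels, predicted_labels, positive="somatic"):
    """Harmonic mean of precision and recall for the `positive` class."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined).
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Copy-neutral locus at 60% tumor purity: the two classes separate well.
    print(round(expected_vaf(0.6, 2, 1, germline=False), 3))
    print(round(expected_vaf(0.6, 2, 1, germline=True), 3))
    # Single-copy gain where the germline variant sits on the allele that
    # was NOT duplicated: its expected VAF drops toward the somatic range.
    print(round(expected_vaf(0.6, 3, 1, germline=True), 3))

    truth = ["somatic", "somatic", "germline", "germline"]
    calls = ["somatic", "germline", "germline", "somatic"]
    print(f1_score(truth, calls))
```

Under these assumptions, a copy-neutral somatic variant is expected near a VAF of 0.30 and a heterozygous germline variant near 0.50, while a copy-number gain can pull the germline value down to roughly 0.38. A classifier that looks at VAF alone would therefore find the two classes harder to separate in CNV regions.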

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d68b/7995270/3db8870f0112/zr-42-2-246-1.jpg
