

Achieving robust somatic mutation detection with deep learning models derived from reference data sets of a cancer sample.

Affiliations

Roche Sequencing Solutions, Santa Clara, CA, 95050, USA.

The Center for Biologics Evaluation and Research, U.S. Food and Drug Administration, 10903 New Hampshire Avenue, Silver Spring, MD, 20993, USA.

Publication information

Genome Biol. 2022 Jan 7;23(1):12. doi: 10.1186/s13059-021-02592-9.

DOI:10.1186/s13059-021-02592-9
PMID:34996510
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8740374/
Abstract

BACKGROUND

Accurate detection of somatic mutations is challenging but critical to understanding cancer formation, progression, and treatment. We recently proposed NeuSomatic, the first deep convolutional neural network-based somatic mutation detection approach, and demonstrated its performance advantages on in silico data.

RESULTS

In this study, we use the first comprehensive and well-characterized somatic reference data sets from the SEQC2 consortium to investigate best practices for using a deep learning framework in cancer mutation detection. Using the high-confidence somatic mutations established for a cancer cell line by the consortium, we identify the best strategy for building robust models on multiple data sets derived from samples representing real scenarios. For example, a model trained on a combination of real and spike-in mutations had the highest average performance.
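The comparison described above, ranking training strategies by their average performance across several data sets, can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which metric was averaged, so F1 against a high-confidence truth set is assumed here, and the function names, strategy names, and variant tuples are all hypothetical rather than taken from the study.

```python
from statistics import mean
from typing import Dict, Set, Tuple

# A variant is identified by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def f1_score(calls: Set[Variant], truth: Set[Variant]) -> float:
    """F1 of a call set against a truth set (assumed metric, not from the paper)."""
    tp = len(calls & truth)   # called and in the truth set
    fp = len(calls - truth)   # called but not in the truth set
    fn = len(truth - calls)   # in the truth set but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def mean_f1(calls_by_dataset: Dict[str, Set[Variant]], truth: Set[Variant]) -> float:
    """Average F1 of one training strategy over all evaluation data sets."""
    return mean(f1_score(calls, truth) for calls in calls_by_dataset.values())


def best_strategy(results: Dict[str, Dict[str, Set[Variant]]], truth: Set[Variant]) -> str:
    """Name of the strategy with the highest average F1."""
    return max(results, key=lambda name: mean_f1(results[name], truth))


# Hypothetical toy data: four truth variants, two strategies, two data sets each.
a, b, c, d = ("chr1", 100, "A", "T"), ("chr1", 200, "G", "C"), ("chr2", 50, "C", "A"), ("chr2", 75, "T", "G")
x = ("chr3", 10, "A", "G")  # a false-positive call
truth = {a, b, c, d}
results = {
    "real_only": {"dataset_1": {a, b, x}, "dataset_2": {a, b}},
    "real_plus_spike_in": {"dataset_1": {a, b, c}, "dataset_2": {a, b, c, d}},
}
print(best_strategy(results, truth))
```

Averaging per-data-set scores, rather than pooling all calls, gives each data set equal weight regardless of how many variants it contributes; whether the study weighted data sets this way is not stated in the abstract.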

CONCLUSIONS

The strategy identified in our study achieved high robustness across multiple sequencing technologies for fresh and FFPE DNA input, varying tumor/normal purities, and different coverages, with significant superiority over conventional detection approaches in general, as well as in challenging situations such as low coverage, low variant allele frequency, DNA damage, and difficult genomic regions.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/9b82ba0e5d15/13059_2021_2592_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/9a1aaadf0604/13059_2021_2592_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/cb7ac991faa8/13059_2021_2592_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/afe5034d4723/13059_2021_2592_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/e42119452eab/13059_2021_2592_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/403098269ebc/13059_2021_2592_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/09cb/8740374/b1f590f832de/13059_2021_2592_Fig7_HTML.jpg
