SMuRF: portable and accurate ensemble prediction of somatic mutations.

Authors

Huang Weitai, Guo Yu Amanda, Muthukumar Karthik, Baruah Probhonjon, Chang Mei Mei, Jacobsen Skanderup Anders

Affiliations

Department of Computational and Systems Biology, Agency for Science Technology and Research, Genome Institute of Singapore, Singapore, Singapore.

Graduate School of Integrative Sciences and Engineering, National University of Singapore, Singapore, Singapore.

Publication information

Bioinformatics. 2019 Sep 1;35(17):3157-3159. doi: 10.1093/bioinformatics/btz018.

DOI: 10.1093/bioinformatics/btz018
PMID: 30649191
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6735703/

Abstract

SUMMARY

Somatic Mutation calling method using a Random Forest (SMuRF) integrates predictions and auxiliary features from multiple somatic mutation callers using a supervised machine learning approach. SMuRF is trained on community-curated matched tumor and normal whole genome sequencing data. SMuRF predicts both SNVs and indels with high accuracy in genome- or exome-level sequencing data. Furthermore, the method is robust across multiple tested cancer types and predicts low allele frequency variants with high accuracy. In contrast to existing ensemble-based somatic mutation calling approaches, SMuRF works out-of-the-box and is orders of magnitude faster.
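
To make the idea of integrating caller predictions and auxiliary features concrete, the following is a minimal Python sketch of the data layout an ensemble classifier could consume. It is not SMuRF's implementation (which is in R): the caller names, variants, and allele frequencies are hypothetical, and a simple majority vote stands in where SMuRF would apply its trained random forest.

```python
# Hypothetical calls from three callers, keyed by (chrom, pos, ref, alt).
# The value is a made-up variant allele frequency used as an auxiliary feature.
CALLS = {
    "caller_a": {("chr1", 100, "A", "T"): 0.31, ("chr1", 250, "G", "C"): 0.05},
    "caller_b": {("chr1", 100, "A", "T"): 0.29, ("chr2", 75, "C", "CT"): 0.12},
    "caller_c": {("chr1", 100, "A", "T"): 0.33, ("chr2", 75, "C", "CT"): 0.10},
}


def build_feature_table(calls):
    """Merge per-caller calls into one feature row per candidate variant."""
    callers = sorted(calls)
    variants = sorted({v for per_caller in calls.values() for v in per_caller})
    table = {}
    for variant in variants:
        row = {}
        for caller in callers:
            vaf = calls[caller].get(variant)
            # Binary prediction from this caller plus its auxiliary feature.
            row[f"{caller}_called"] = int(vaf is not None)
            row[f"{caller}_vaf"] = vaf if vaf is not None else 0.0
        table[variant] = row
    return table


def majority_vote(row, min_callers=2):
    """Stand-in for a trained classifier: accept if enough callers agree."""
    votes = sum(value for name, value in row.items() if name.endswith("_called"))
    return votes >= min_callers


table = build_feature_table(CALLS)
predictions = {variant: majority_vote(row) for variant, row in table.items()}

for variant, accepted in predictions.items():
    print(variant, "somatic" if accepted else "rejected")
```

In a supervised setting, the rows of `table` would be paired with curated truth labels and used to train a classifier, which can then weigh callers and auxiliary features unequally instead of counting votes.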

AVAILABILITY AND IMPLEMENTATION

The method is implemented in R and available at https://github.com/skandlab/SMuRF. SMuRF operates as an add-on to the community-developed bcbio-nextgen somatic variant calling pipeline.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1. SMuRF: portable and accurate ensemble prediction of somatic mutations.
   Bioinformatics. 2019 Sep 1;35(17):3157-3159. doi: 10.1093/bioinformatics/btz018.
2. Ensemble-Based Somatic Mutation Calling in Cancer Genomes.
   Methods Mol Biol. 2020;2120:37-46. doi: 10.1007/978-1-0716-0327-7_3.
3. Accurate Ensemble Prediction of Somatic Mutations with SMuRF2.
   Methods Mol Biol. 2022;2493:53-66. doi: 10.1007/978-1-0716-2293-3_4.
4. SNooPer: a machine learning-based method for somatic variant identification from low-pass next-generation sequencing.
   BMC Genomics. 2016 Nov 14;17(1):912. doi: 10.1186/s12864-016-3281-2.
5. ISOWN: accurate somatic mutation identification in the absence of normal tissue controls.
   Genome Med. 2017 Jun 29;9(1):59. doi: 10.1186/s13073-017-0446-9.
6. SNVSniffer: an integrated caller for germline and somatic single-nucleotide and indel mutations.
   BMC Syst Biol. 2016 Aug 1;10 Suppl 2(Suppl 2):47. doi: 10.1186/s12918-016-0300-5.
7. NeoMutate: an ensemble machine learning framework for the prediction of somatic mutations in cancer.
   BMC Med Genomics. 2019 May 16;12(1):63. doi: 10.1186/s12920-019-0508-5.
8. SiNVICT: ultra-sensitive detection of single nucleotide variants and indels in circulating tumour DNA.
   Bioinformatics. 2017 Jan 1;33(1):26-34. doi: 10.1093/bioinformatics/btw536. Epub 2016 Aug 16.
9. neoepiscope improves neoepitope prediction with multivariant phasing.
   Bioinformatics. 2020 Feb 1;36(3):713-720. doi: 10.1093/bioinformatics/btz653.
10. Construction of a combinatorial pipeline using two somatic variant calling methods for whole exome sequence data of gastric cancer.
    J Med Invest. 2017;64(3.4):233-240. doi: 10.2152/jmi.64.233.

Cited by

1. CAN-Scan: A multi-omic phenotype-driven precision oncology platform identifies prognostic biomarkers of therapy response for colorectal cancer.
   Cell Rep Med. 2025 Apr 15;6(4):102053. doi: 10.1016/j.xcrm.2025.102053. Epub 2025 Apr 4.
2. Performance comparisons between clustering models for reconstructing NGS results from technical replicates.
   Front Genet. 2023 Mar 16;14:1148147. doi: 10.3389/fgene.2023.1148147. eCollection 2023.
3. Fast, accurate, and racially unbiased pan-cancer tumor-only variant calling with tabular machine learning.
   NPJ Precis Oncol. 2023 Jan 7;7(1):4. doi: 10.1038/s41698-022-00340-1.
4. Accurate somatic variant detection using weakly supervised deep learning.
   Nat Commun. 2022 Jul 22;13(1):4248. doi: 10.1038/s41467-022-31765-8.
5. Single-cell and bulk transcriptome sequencing identifies two epithelial tumor cell states and refines the consensus molecular classification of colorectal cancer.
   Nat Genet. 2022 Jul;54(7):963-975. doi: 10.1038/s41588-022-01100-4. Epub 2022 Jun 30.
6. Somatic and Germline Variant Calling from Next-Generation Sequencing Data.
   Adv Exp Med Biol. 2022;1361:37-54. doi: 10.1007/978-3-030-91836-1_3.
7. Computational analysis of cancer genome sequencing data.
   Nat Rev Genet. 2022 May;23(5):298-314. doi: 10.1038/s41576-021-00431-y. Epub 2021 Dec 8.
8. Tissue-specific cell-free DNA degradation quantifies circulating tumor DNA burden.
   Nat Commun. 2021 Apr 13;12(1):2229. doi: 10.1038/s41467-021-22463-y.
9. Semi-supervised learning for somatic variant calling and peptide identification in personalized cancer immunotherapy.
   BMC Bioinformatics. 2020 Dec 30;21(Suppl 18):498. doi: 10.1186/s12859-020-03813-x.
10. iWhale: a computational pipeline based on Docker and SCons for detection and annotation of somatic variants in cancer WES data.
    Brief Bioinform. 2021 May 20;22(3). doi: 10.1093/bib/bbaa065.

References

1. Reliability of Whole-Exome Sequencing for Assessing Intratumor Genetic Heterogeneity.
   Cell Rep. 2018 Nov 6;25(6):1446-1457. doi: 10.1016/j.celrep.2018.10.046.
2. A machine learning approach for somatic mutation discovery.
   Sci Transl Med. 2018 Sep 5;10(457). doi: 10.1126/scitranslmed.aar7939.
3. Scalable Open Science Approach for Mutation Calling of Tumor Exomes Using Multiple Genomic Pipelines.
   Cell Syst. 2018 Mar 28;6(3):271-281.e7. doi: 10.1016/j.cels.2018.03.002.
4. Intersect-then-combine approach: improving the performance of somatic variant calling in whole exome sequencing data using multiple aligners and callers.
   Genome Med. 2017 Apr 18;9(1):35. doi: 10.1186/s13073-017-0425-1.
5. VarDict: a novel and versatile variant caller for next-generation sequencing in cancer research.
   Nucleic Acids Res. 2016 Jun 20;44(11):e108. doi: 10.1093/nar/gkw227. Epub 2016 Apr 7.
6. Evaluation of Nine Somatic Variant Callers for Detection of Somatic Mutations in Exome and Targeted Deep Sequencing Data.
   PLoS One. 2016 Mar 22;11(3):e0151664. doi: 10.1371/journal.pone.0151664. eCollection 2016.
7. A comprehensive assessment of somatic mutation detection in cancer using whole-genome sequencing.
   Nat Commun. 2015 Dec 9;6:10001. doi: 10.1038/ncomms10001.
8. Systematic comparison of variant calling pipelines using gold standard personal exome variants.
   Sci Rep. 2015 Dec 7;5:17875. doi: 10.1038/srep17875.
9. An ensemble approach to accurately detect somatic mutations using SomaticSeq.
   Genome Biol. 2015 Sep 17;16(1):197. doi: 10.1186/s13059-015-0758-2.
10. Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection.
    Nat Methods. 2015 Jul;12(7):623-30. doi: 10.1038/nmeth.3407. Epub 2015 May 18.