
Denoising of Aligned Genomic Data.

Affiliations

Stanford University, Department of Electrical Engineering, Stanford, 94305, USA.

University of Illinois Urbana-Champaign, Department of Electrical and Computer Engineering, Urbana, 61801, USA.

Publication information

Sci Rep. 2019 Oct 21;9(1):15067. doi: 10.1038/s41598-019-51418-z.

DOI: 10.1038/s41598-019-51418-z
PMID: 31636330
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6803637/
Abstract

Noise in genomic sequencing data is known to affect various stages of genomic data analysis pipelines. Variant identification is an important step in many of these pipelines, and is increasingly being used in clinical settings to aid medical practice. We propose a denoising method, dubbed SAMDUDE, which operates on aligned genomic data in order to improve variant calling performance. Denoising human data with SAMDUDE resulted in improved variant identification in both individual chromosome and whole genome sequencing (WGS) data sets. In the WGS data set, denoising led to the identification of almost 2,000 additional true variants and the elimination of over 1,500 erroneously identified variants. In contrast, we found that denoising with other state-of-the-art denoisers significantly worsens variant calling performance. SAMDUDE is written in Python and is freely available at https://github.com/ihwang/SAMDUDE.
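
The abstract reports its results as additional true variants found and erroneous variants removed. As a rough illustration of how such counts relate to the usual precision and recall figures, the sketch below scores a call set against a truth set. It is not taken from SAMDUDE or from the paper's evaluation: the function `score_calls`, the `Score` type, and all variant coordinates are hypothetical toy values invented for this example.

```python
from typing import NamedTuple, Set, Tuple

# A variant is keyed by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


class Score(NamedTuple):
    tp: int           # called variants that are in the truth set
    fp: int           # called variants that are not in the truth set
    fn: int           # truth variants that were not called
    precision: float
    recall: float


def score_calls(called: Set[Variant], truth: Set[Variant]) -> Score:
    """Compare a call set with a truth set by exact variant match."""
    tp = len(called & truth)
    fp = len(called - truth)
    fn = len(truth - called)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return Score(tp, fp, fn, precision, recall)


# Hypothetical toy data, not real results.
truth = {
    ("chr20", 100, "A", "G"),
    ("chr20", 250, "C", "T"),
    ("chr20", 400, "G", "A"),
    ("chr20", 900, "T", "C"),
}
calls_raw = {
    ("chr20", 100, "A", "G"),
    ("chr20", 250, "C", "T"),
    ("chr20", 555, "A", "C"),  # false positive
    ("chr20", 777, "G", "T"),  # false positive
}
calls_denoised = {
    ("chr20", 100, "A", "G"),
    ("chr20", 250, "C", "T"),
    ("chr20", 400, "G", "A"),  # true variant recovered after denoising
    ("chr20", 777, "G", "T"),  # one false positive remains
}

raw = score_calls(calls_raw, truth)
denoised = score_calls(calls_denoised, truth)
print("raw:     ", raw)
print("denoised:", denoised)
```

In this toy case, gaining one true variant and dropping one false call raises both precision and recall, which is the direction of change the abstract describes for the WGS data set.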

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3915/6803637/dee9a68beab3/41598_2019_51418_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3915/6803637/24d50393d8c7/41598_2019_51418_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3915/6803637/f2238d2afa57/41598_2019_51418_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3915/6803637/73e19a9c254d/41598_2019_51418_Fig4_HTML.jpg
