
From trash to treasure: detecting unexpected contamination in unmapped NGS data.

Affiliations

Stazione Zoologica Anton Dohrn, Villa Comunale, Napoli, 80121, Italy.

High Performance Computing and Networking Institute, National Research Council of Italy, Via P. Castellino, 111, Napoli, 80131, Italy.

Publication information

BMC Bioinformatics. 2019 Apr 18;20(Suppl 4):168. doi: 10.1186/s12859-019-2684-x.

DOI:10.1186/s12859-019-2684-x
PMID:30999839
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6472186/
Abstract

BACKGROUND

Next Generation Sequencing (NGS) experiments produce millions of short sequences that, once mapped to a reference genome, provide biological insights at the genomic, transcriptomic and epigenomic levels. Typically, between 70% and 90% of reads map correctly to the reference genome, which in some cases leaves a substantial fraction of unmapped sequences. This 'misalignment' can be ascribed to low-quality bases or to sequence differences between the sample reads and the reference genome. Investigating the source of the unmapped reads is important both to better assess the quality of the whole experiment and to check for possible downstream or upstream 'contamination' from exogenous nucleic acids.
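As an illustration of what "unmapped" means in practice, the sketch below separates mapped from unmapped records in SAM-formatted text using the FLAG field, where bit 0x4 marks an unmapped segment. It is a minimal, standard-library-only example and is not taken from DecontaMiner; the function names and the toy records are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Bits of the SAM FLAG field (defined by the SAM specification).
UNMAPPED = 0x4          # segment unmapped
SECONDARY = 0x100       # secondary alignment
SUPPLEMENTARY = 0x800   # supplementary alignment


def split_sam_records(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (mapped_ids, unmapped_ids) for the primary records in SAM text.

    Header lines (starting with '@') are skipped. Secondary and supplementary
    records are ignored so that each sequenced segment is counted once.
    """
    mapped: List[str] = []
    unmapped: List[str] = []
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if flag & UNMAPPED:
            unmapped.append(name)
        else:
            mapped.append(name)
    return mapped, unmapped


def mapping_rate(mapped: List[str], unmapped: List[str]) -> float:
    """Fraction of primary records that are mapped (0.0 for empty input)."""
    total = len(mapped) + len(unmapped)
    return len(mapped) / total if total else 0.0


# Toy records, invented for this example only.
EXAMPLE_SAM = [
    "@HD\tVN:1.6\tSO:unsorted",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t200\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
    "r1\t256\tchr2\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",  # secondary, ignored
    "r4\t0\tchr1\t400\t60\t4M\t*\t0\t0\tTTAA\tIIII",
]

mapped_ids, unmapped_ids = split_sam_records(EXAMPLE_SAM)
rate = mapping_rate(mapped_ids, unmapped_ids)
print(f"mapped={len(mapped_ids)} unmapped={len(unmapped_ids)} rate={rate:.2f}")
```

For paired-end data each mate is a separate record, so the rate computed here is per segment rather than per read pair.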

RESULTS

Here we propose DecontaMiner, a tool to unravel the presence of contaminating sequences among the unmapped reads. It uses a subtraction approach to identify contamination from bacterial, fungal and viral genomes. DecontaMiner generates several output files to track all the processed reads and to provide a complete report of their characteristics. Good-quality matches to microorganism genomes are counted and compared among samples. DecontaMiner builds an offline HTML page containing summary statistics and plots; the plots are produced with the D3 JavaScript library. DecontaMiner has mainly been used to detect contamination in human RNA-Seq data. The software is freely available at http://www-labgtp.na.icar.cnr.it/decontaminer.
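The step of counting good-quality matches and comparing them among samples can be sketched as follows. The abstract does not say which aligner, output format, or thresholds DecontaMiner uses, so this example assumes BLAST-style 12-column tabular hits and hypothetical cut-offs (`MIN_IDENTITY`, `MIN_LENGTH`); the organism names and hit lines are invented. It should be read as one possible way to do such counting, not as the tool's implementation.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

# Hypothetical quality thresholds, chosen only for illustration.
MIN_IDENTITY = 90.0   # minimum percent identity
MIN_LENGTH = 50       # minimum alignment length in bases


def count_good_hits(
    hit_lines: Iterable[str],
    min_identity: float = MIN_IDENTITY,
    min_length: int = MIN_LENGTH,
) -> Counter:
    """Count reads per subject from BLAST-style tabular hits.

    Assumed columns: qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    Each read is assigned to its best-scoring hit that passes the thresholds.
    """
    best: Dict[str, Tuple[float, str]] = {}  # read id -> (bitscore, subject)
    for line in hit_lines:
        if not line.strip() or line.startswith("#"):
            continue
        f = line.rstrip("\n").split("\t")
        read_id, subject = f[0], f[1]
        identity, length, bitscore = float(f[2]), int(f[3]), float(f[11])
        if identity < min_identity or length < min_length:
            continue
        if read_id not in best or bitscore > best[read_id][0]:
            best[read_id] = (bitscore, subject)
    return Counter(subject for _, subject in best.values())


def compare_samples(per_sample: Dict[str, Counter]) -> Dict[str, Dict[str, int]]:
    """Build an organism -> {sample: read count} table, with 0 where absent."""
    organisms = sorted({org for counts in per_sample.values() for org in counts})
    samples = sorted(per_sample)
    return {org: {s: per_sample[s][org] for s in samples} for org in organisms}


# Invented hits for two samples.
SAMPLE_A = [
    "r1\tE_coli\t98.0\t75\t1\t0\t1\t75\t10\t84\t1e-30\t130",
    "r1\tS_enterica\t92.0\t75\t6\t0\t1\t75\t10\t84\t1e-20\t100",
    "r2\tE_coli\t85.0\t75\t11\t0\t1\t75\t10\t84\t1e-10\t80",    # identity too low
    "r3\tC_albicans\t99.0\t60\t1\t0\t1\t60\t5\t64\t1e-25\t110",
]
SAMPLE_B = [
    "r7\tE_coli\t97.0\t40\t1\t0\t1\t40\t10\t49\t1e-12\t70",     # too short
    "r8\tC_albicans\t95.0\t70\t3\t0\t1\t70\t5\t74\t1e-22\t105",
]

counts = {
    "sample_A": count_good_hits(SAMPLE_A),
    "sample_B": count_good_hits(SAMPLE_B),
}
table = compare_samples(counts)
for organism, row in table.items():
    print(organism, row)
```

Keeping only the best hit per read avoids counting one read toward several organisms; a real pipeline would also have to decide how to treat ties and paired-end mates.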

CONCLUSIONS

DecontaMiner is a tool designed and developed to investigate the presence of contaminating sequences in unmapped NGS data. It can suggest the presence of contaminating organisms in sequenced samples, which might derive either from laboratory contamination or from the biological source of the samples; in both cases they are worthy of further investigation and experimental validation. The novelty of DecontaMiner lies mainly in its easy integration with the standard procedures of NGS data analysis, while providing a complete, reliable, and automatic pipeline.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e4e/6472186/c870e3f4f9e6/12859_2019_2684_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e4e/6472186/75e7c71494f1/12859_2019_2684_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e4e/6472186/ecae51e54c88/12859_2019_2684_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e4e/6472186/27d9855e1adc/12859_2019_2684_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e4e/6472186/3598366dedf4/12859_2019_2684_Fig5_HTML.jpg

