SAG-QC: quality control of single amplified genome information by subtracting non-target sequences based on sequence compositions.

Authors

Maruyama Toru, Mori Tetsushi, Yamagishi Keisuke, Takeyama Haruko

Affiliations

Department of Life Science & Medical Bioscience, Graduate School of Advanced Science & Engineering, Waseda University, 3-4-1 Okubo, Shinjuku, Tokyo, 169-8555, Japan.

Computational Bio-Big Data Open Innovation Lab., National Institute of Advanced Science and Technology, 3-4-1 Okubo, Shinjuku, Tokyo, 169-0072, Japan.

Publication information

BMC Bioinformatics. 2017 Mar 4;18(1):152. doi: 10.1186/s12859-017-1572-5.

DOI:10.1186/s12859-017-1572-5
PMID:28259144
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5336615/
Abstract

BACKGROUND

Whole genome amplification techniques have enabled the analysis of unexplored genomic information through the sequencing of single-amplified genomes (SAGs). Whole genome amplification of single bacteria remains challenging because contamination often occurs during experimental processes. To increase confidence in the analysis of sequenced SAGs, bioinformatics approaches that identify and exclude non-target sequences from SAGs are therefore required. Because currently reported approaches rely on sequence information in public databases, they are limited when new strains are the targets of interest. Here, we developed SAG-QC, a software tool that identifies and excludes non-target sequences independently of databases.

RESULTS

In our method, we used "no template control" sequences acquired during whole genome amplification (WGA). We calculated the probability that a sequence was derived from contaminants by comparing its k-mer composition with that of the no template control sequences. In tests using simulated SAG datasets, our method predicted non-target sequences more accurately than currently reported techniques. Subsequently, we applied our tool to actual SAG datasets and evaluated the accuracy of the predictions.

CONCLUSIONS

Our method works independently of public sequence information for distinguishing SAGs from non-target sequences. This method will be effective when employed against SAG sequences of unexplored strains and we anticipate that it will contribute to the correct interpretation of SAGs.
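
The abstract describes scoring each sequence by comparing its k-mer composition with that of no template control (NTC) sequences. The sketch below illustrates that general idea only and is not the SAG-QC implementation. It assumes a simple two-class multinomial model over 4-mers, a comparison profile built from some other pool of sequences, and an equal prior. The function names, the choice of k = 4, the pseudocount, and the toy sequences are all hypothetical.

```python
from collections import Counter
from itertools import product
import math

ALPHABET = "ACGT"
K = 4  # illustrative k-mer length; an assumption, not taken from the paper


def kmer_counts(seq, k=K):
    """Count overlapping k-mers, skipping any that contain non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if all(c in ALPHABET for c in kmer):
            counts[kmer] += 1
    return counts


def kmer_profile(seqs, k=K, pseudocount=1.0):
    """Pooled k-mer frequencies with additive smoothing; values sum to 1."""
    total = Counter()
    for s in seqs:
        total.update(kmer_counts(s, k))
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    denom = sum(total.values()) + pseudocount * len(kmers)
    return {km: (total[km] + pseudocount) / denom for km in kmers}


def log_likelihood(seq, profile, k=K):
    """Multinomial log-likelihood of the sequence's k-mers under a profile."""
    return sum(n * math.log(profile[km]) for km, n in kmer_counts(seq, k).items())


def contaminant_probability(seq, ntc_profile, other_profile, prior=0.5, k=K):
    """Posterior probability that seq is NTC-like under the two-class model."""
    log_c = math.log(prior) + log_likelihood(seq, ntc_profile, k)
    log_t = math.log(1.0 - prior) + log_likelihood(seq, other_profile, k)
    m = max(log_c, log_t)  # subtract the max for numerical stability
    e_c, e_t = math.exp(log_c - m), math.exp(log_t - m)
    return e_c / (e_c + e_t)


# Toy data (hypothetical): AT-rich "NTC" reads versus GC-rich "other" reads.
ntc_profile = kmer_profile(["ATATATATATATATATATAT" * 3])
other_profile = kmer_profile(["GCGCGGCCGCGCGGCC" * 3])

p_at = contaminant_probability("ATATATATATAT", ntc_profile, other_profile)
p_gc = contaminant_probability("GCGCGGCC", ntc_profile, other_profile)
print(f"AT-rich query: {p_at:.3f}, GC-rich query: {p_gc:.3f}")
```

In this toy setup the AT-rich query scores close to 1 and the GC-rich query close to 0. Real contigs would need a longer k-mer context, a realistic comparison profile, and a calibrated threshold, none of which the abstract specifies.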


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/1aebdf825e2f/12859_2017_1572_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/ef53383d015c/12859_2017_1572_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/fa5439311d21/12859_2017_1572_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/f1c8d163aac4/12859_2017_1572_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/8509c7105b41/12859_2017_1572_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/c4c86c199042/12859_2017_1572_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0aa3/5336615/749e609f2e70/12859_2017_1572_Fig7_HTML.jpg
