PANDAseq: paired-end assembler for illumina sequences.

Affiliation

Department of Biology, University of Waterloo, Waterloo, Ontario, Canada.

Publication information

BMC Bioinformatics. 2012 Feb 14;13:31. doi: 10.1186/1471-2105-13-31.

DOI: 10.1186/1471-2105-13-31
PMID: 22333067
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3471323/

Abstract

BACKGROUND

Illumina paired-end reads are used to analyse microbial communities by targeting amplicons of the 16S rRNA gene. Publicly available tools are needed to assemble overlapping paired-end reads while correcting mismatches and uncalled bases; many errors could be corrected to obtain higher sequence yields using quality information.

RESULTS

PANDAseq assembles paired-end reads rapidly and with the correction of most errors. Uncertain error corrections come from reads with many low-quality bases identified by upstream processing. Benchmarks were done using real error masks on simulated data, a pure source template, and a pooled template of genomic DNA from known organisms. PANDAseq assembled reads more rapidly and with reduced error incorporation compared to alternative methods.

CONCLUSIONS

PANDAseq rapidly assembles sequences and scales to billions of paired-end reads. Assembly of control libraries showed a 4-50% increase in the number of assembled sequences over naïve assembly with negligible loss of "good" sequence.
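
The task the abstract describes, joining an overlapping read pair while using quality scores to resolve mismatches and uncalled bases, can be illustrated with a minimal Python sketch. This is not PANDAseq's algorithm: the function names (`reverse_complement`, `merge_pair`), the `min_overlap` and `max_mismatch_fraction` parameters, and the simple lowest-mismatch-fraction overlap search are all assumptions made for illustration. Qualities are assumed to be lists of Phred scores, one per base.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (N stays N)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def merge_pair(fwd, fwd_qual, rev, rev_qual,
               min_overlap=8, max_mismatch_fraction=0.2):
    """Merge a forward read with its reverse mate (hypothetical helper).

    Returns the merged sequence, or None when no overlap of at least
    min_overlap bases has a mismatch fraction within the threshold.
    """
    # Put the mate in the same orientation as the forward read.
    mate = reverse_complement(rev)
    mate_qual = rev_qual[::-1]

    # Try every overlap length, longest first, and keep the one with the
    # lowest mismatch fraction (uncalled bases are not counted as mismatches).
    best_overlap, best_fraction = None, None
    for overlap in range(min(len(fwd), len(mate)), min_overlap - 1, -1):
        mismatches = sum(
            1
            for a, b in zip(fwd[-overlap:], mate[:overlap])
            if a != b and "N" not in (a, b)
        )
        fraction = mismatches / overlap
        if best_fraction is None or fraction < best_fraction:
            best_overlap, best_fraction = overlap, fraction

    if best_overlap is None or best_fraction > max_mismatch_fraction:
        return None

    # Build the consensus over the overlap: fill uncalled bases from the
    # other read, and resolve disagreements in favour of the higher quality.
    start = len(fwd) - best_overlap
    consensus = []
    for i in range(best_overlap):
        a, qa = fwd[start + i], fwd_qual[start + i]
        b, qb = mate[i], mate_qual[i]
        if a == "N":
            consensus.append(b)
        elif b == "N" or a == b:
            consensus.append(a)
        else:
            consensus.append(a if qa >= qb else b)

    return fwd[:start] + "".join(consensus) + mate[best_overlap:]
```

Calling `merge_pair` on two 18-base reads that share a 10-base overlap returns a single 26-base sequence. A low-quality miscall in one read is replaced by the mate's higher-quality base, and an `N` in one read is filled in from the other. A pair with no acceptable overlap returns `None` rather than a forced join.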

Similar articles

1
PANDAseq: paired-end assembler for illumina sequences.
BMC Bioinformatics. 2012 Feb 14;13:31. doi: 10.1186/1471-2105-13-31.
2
MeFiT: merging and filtering tool for illumina paired-end reads for 16S rRNA amplicon sequencing.
BMC Bioinformatics. 2016 Dec 1;17(1):491. doi: 10.1186/s12859-016-1358-1.
3
Generation of multimillion-sequence 16S rRNA gene libraries from complex microbial communities by assembling paired-end illumina reads.
Appl Environ Microbiol. 2011 Jun;77(11):3846-52. doi: 10.1128/AEM.02772-10. Epub 2011 Apr 1.
4
IPED: a highly efficient denoising tool for Illumina MiSeq Paired-end 16S rRNA gene amplicon sequencing data.
BMC Bioinformatics. 2016 Apr 29;17(1):192. doi: 10.1186/s12859-016-1061-2.
5
A Comparison between Transcriptome Sequencing and 16S Metagenomics for Detection of Bacterial Pathogens in Wildlife.
PLoS Negl Trop Dis. 2015 Aug 18;9(8):e0003929. doi: 10.1371/journal.pntd.0003929. eCollection 2015.
6
CDSnake: Snakemake pipeline for retrieval of annotated OTUs from paired-end reads using CD-HIT utilities.
BMC Bioinformatics. 2020 Jul 24;21(Suppl 12):303. doi: 10.1186/s12859-020-03591-6.
7
MinION™ nanopore sequencing of environmental metagenomes: a synthetic approach.
Gigascience. 2017 Mar 1;6(3):1-10. doi: 10.1093/gigascience/gix007.
8
Illumina error correction near highly repetitive DNA regions improves de novo genome assembly.
BMC Bioinformatics. 2019 Jun 3;20(1):298. doi: 10.1186/s12859-019-2906-2.
9
Benefits of merging paired-end reads before pre-processing environmental metagenomics data.
Mar Genomics. 2022 Feb;61:100914. doi: 10.1016/j.margen.2021.100914. Epub 2021 Dec 2.
10
Joining Illumina paired-end reads for classifying phylogenetic marker sequences.
BMC Bioinformatics. 2020 Mar 14;21(1):105. doi: 10.1186/s12859-020-3445-6.

Cited by

1
Tissue-specific clonal selection and differentiation of CD4 T cells during infection.
bioRxiv. 2025 Aug 28:2025.08.25.672130. doi: 10.1101/2025.08.25.672130.
2
Varying Responses to Heat Stress and Salinization Between Benthic and Pelagic Riverine Microbial Communities.
Environ Microbiol. 2025 Sep;27(9):e70173. doi: 10.1111/1462-2920.70173.
3
Ubiquitin chain variability directs substrates of the Tul1 ubiquitin ligase complex to different degradation pathways.
J Cell Biol. 2025 Sep 1;224(9). doi: 10.1083/jcb.202312133. Epub 2025 Jul 22.
4
Reducing enteric methane emission in dairy goats: impact of dietary inclusions of quebracho tannin extract on ruminal microbiota.
Front Microbiol. 2025 Jul 7;16:1595924. doi: 10.3389/fmicb.2025.1595924. eCollection 2025.
5
Lactic acid bacteria as microbial cell factories for the in vivo delivery of therapeutic proteins as secretable TAT fusion products.
J Biol Eng. 2025 Jul 17;19(1):65. doi: 10.1186/s13036-025-00538-4.
6
Replaying germinal center evolution on a quantified affinity landscape.
bioRxiv. 2025 Jun 5:2025.06.02.656870. doi: 10.1101/2025.06.02.656870.
7
The chemosynthetic biofilm microbiome of deep-sea hydrothermal vents across space and time.
Environ Microbiome. 2025 Jul 14;20(1):88. doi: 10.1186/s40793-025-00738-x.
8
Epitope and HLA specificity of human TCRs against Plasmodium falciparum circumsporozoite protein.
J Exp Med. 2025 Sep 1;222(9). doi: 10.1084/jem.20250044. Epub 2025 Jul 10.
9
The diversity, dynamics, and culturability of bacterial and fungal communities present in warm-season pasture grass seeds.
Front Microbiol. 2025 Jun 25;16:1621463. doi: 10.3389/fmicb.2025.1621463. eCollection 2025.
10
Farming System and Nematodes Affect the Rhizosphere Microbiome of Tropical Banana Plants.
Environ Microbiol Rep. 2025 Aug;17(4):e70155. doi: 10.1111/1758-2229.70155.

References

1
Illumina-based analysis of microbial community diversity.
ISME J. 2012 Jan;6(1):183-94. doi: 10.1038/ismej.2011.74. Epub 2011 Jun 16.
2
Generation of multimillion-sequence 16S rRNA gene libraries from complex microbial communities by assembling paired-end illumina reads.
Appl Environ Microbiol. 2011 Jun;77(11):3846-52. doi: 10.1128/AEM.02772-10. Epub 2011 Apr 1.
3
Microbiome profiling by illumina sequencing of combinatorial sequence-tagged PCR products.
PLoS One. 2010 Oct 26;5(10):e15406. doi: 10.1371/journal.pone.0015406.
4
BIPES, a cost-effective high-throughput method for assessing microbial diversity.
ISME J. 2011 Apr;5(4):741-9. doi: 10.1038/ismej.2010.160. Epub 2010 Oct 21.
5
Unlocking short read sequencing for metagenomics.
PLoS One. 2010 Jul 28;5(7):e11840. doi: 10.1371/journal.pone.0011840.
6
Global patterns of 16S rRNA diversity at a depth of millions of sequences per sample.
Proc Natl Acad Sci U S A. 2011 Mar 15;108 Suppl 1(Suppl 1):4516-22. doi: 10.1073/pnas.1000080107. Epub 2010 Jun 3.
7
The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants.
Nucleic Acids Res. 2010 Apr;38(6):1767-71. doi: 10.1093/nar/gkp1137. Epub 2009 Dec 16.
8
The Ribosomal Database Project: improved alignments and new tools for rRNA analysis.
Nucleic Acids Res. 2009 Jan;37(Database issue):D141-5. doi: 10.1093/nar/gkn879. Epub 2008 Nov 12.
9
The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data.
Nucleic Acids Res. 2007 Jan;35(Database issue):D169-72. doi: 10.1093/nar/gkl889. Epub 2006 Nov 7.
10
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences.
Bioinformatics. 2006 Jul 1;22(13):1658-9. doi: 10.1093/bioinformatics/btl158. Epub 2006 May 26.