A filtering method to generate high quality short reads using Illumina paired-end technology.

Author information

Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, Woods Hole, Massachusetts, USA.

Publication information

PLoS One. 2013 Jun 17;8(6):e66643. doi: 10.1371/journal.pone.0066643. Print 2013.

Abstract

Consensus between independent reads improves the accuracy of genome and transcriptome analyses; however, lack of consensus between very similar sequences in metagenomic studies can, and often does, represent natural variation of biological significance. Machine-assigned quality scores, which are commonly used on next-generation platforms, do not necessarily correlate with accuracy. Here, we describe using the overlap of paired-end, short sequence reads to identify error-prone reads in marker gene analyses and their contribution to spurious OTUs following clustering analysis using QIIME. Our approach can also reduce error in shotgun sequencing data generated from libraries with small, tightly constrained insert sizes. The open-source implementation of this algorithm in the Python programming language, with user instructions, can be obtained from https://github.com/meren/illumina-utils.
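The overlap idea in the abstract can be illustrated with a minimal sketch. This is not the illumina-utils implementation: the function names (`reverse_complement`, `best_overlap`, `passes_filter`), the `min_overlap` and `max_mismatches` parameters, and the rule of choosing the alignment with the lowest mismatch rate are all assumptions made for illustration. The sketch aligns the end of the forward read against the start of the reverse-complemented reverse read, counts disagreements in the overlapping region, and rejects pairs whose overlap disagrees more than a chosen threshold.

```python
# Hypothetical sketch of overlap-based filtering for paired-end reads.
# Names and thresholds are illustrative assumptions, not the illumina-utils API.

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def best_overlap(read1, read2, min_overlap=10):
    """Find the overlap between the 3' end of read1 and the reverse
    complement of read2 that has the lowest mismatch rate.

    Returns (overlap_length, mismatches, mismatch_rate), or None when the
    reads are too short to overlap by at least min_overlap bases.
    """
    r2 = reverse_complement(read2)
    best = None
    # Try the longest candidate overlap first; ties keep the longer overlap.
    for length in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        suffix = read1[-length:]
        prefix = r2[:length]
        mismatches = sum(1 for a, b in zip(suffix, prefix) if a != b)
        rate = mismatches / length
        if best is None or rate < best[2]:
            best = (length, mismatches, rate)
    return best


def passes_filter(read1, read2, max_mismatches=0, min_overlap=10):
    """Keep a pair only if its best overlap has at most max_mismatches."""
    result = best_overlap(read1, read2, min_overlap)
    if result is None:
        return False
    return result[1] <= max_mismatches
```

In this sketch a pair with a perfectly matching overlap is kept, while a pair whose two reads disagree in the overlapping region is treated as error-prone and discarded. A real pipeline would also need to handle FASTQ parsing, ambiguous bases, and chance matches at short overlap lengths.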
