Noise-cancelling repeat finder: uncovering tandem repeats in error-prone long-read sequencing data.

Author information

Department of Biology, The Pennsylvania State University, State College, PA 16802, USA.

Center for Medical Genomics, The Pennsylvania State University, State College, PA 16802, USA.

Publication information

Bioinformatics. 2019 Nov 1;35(22):4809-4811. doi: 10.1093/bioinformatics/btz484.

Abstract

SUMMARY

Tandem DNA repeats can be sequenced with long-read technologies, but cannot be accurately deciphered due to the lack of computational tools taking high error rates of these technologies into account. Here we introduce Noise-Cancelling Repeat Finder (NCRF) to uncover putative tandem repeats of specified motifs in noisy long reads produced by Pacific Biosciences and Oxford Nanopore sequencers. Using simulations, we validated the use of NCRF to locate tandem repeats with motifs of various lengths and demonstrated its superior performance as compared to two alternative tools. Using real human whole-genome sequencing data, NCRF identified long arrays of the (AATGG)n repeat involved in heat shock stress response.
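To make the idea of scoring a noisy read against a specified motif concrete, here is a minimal Python sketch. It is not NCRF's implementation: it is a toy that computes the minimum edit distance between a read segment and any stretch of a perfect tandem repeat of the motif, using standard approximate string matching. The function name `repeat_edit_distance` and the example sequences are hypothetical, chosen only for illustration.

```python
def repeat_edit_distance(segment: str, motif: str) -> int:
    """Minimum edit distance between `segment` and any substring of the
    perfect tandem repeat of `motif` (any phase, both ends free).

    Illustrative sketch only; this is NOT how NCRF is implemented.
    """
    if not motif:
        raise ValueError("motif must be non-empty")

    # An optimal match never spans more than 2 * len(segment) reference
    # bases, so this many copies covers every phase of the repeat.
    copies = (2 * len(segment)) // len(motif) + 2
    reference = motif * copies

    # prev[j] = cost of aligning the segment prefix seen so far so that
    # the alignment ends at reference position j. Row 0 is all zeros,
    # which lets the match start anywhere in the reference.
    prev = [0] * (len(reference) + 1)
    for i, base in enumerate(segment, start=1):
        curr = [i] + [0] * len(reference)
        for j, ref_base in enumerate(reference, start=1):
            curr[j] = min(
                prev[j - 1] + (base != ref_base),  # match or substitution
                prev[j] + 1,                       # extra base in the segment
                curr[j - 1] + 1,                   # base missing from the segment
            )
        prev = curr

    # The match may also end anywhere in the reference.
    return min(prev)


if __name__ == "__main__":
    motif = "AATGG"
    print(repeat_edit_distance("TGGAATGGAATGGAA", motif))   # phase-shifted, error-free: 0
    print(repeat_edit_distance("AATGGAATCGAATGG", motif))   # one substitution: 1
    print(repeat_edit_distance("AATGGAATTGGAATGG", motif))  # one inserted base: 1
```

Dividing the returned distance by the segment length gives a rough per-base error estimate, which could be compared with the error rate expected for the sequencing platform. The quadratic running time makes this sketch suitable only for short segments.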

AVAILABILITY AND IMPLEMENTATION

NCRF is implemented in C, supported by several Python scripts, and is available in Bioconda and at https://github.com/makovalab-psu/NoiseCancellingRepeatFinder.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.


