Development of Computer Algorithm for Editing of Next Generation Sequencing Metagenome Data.

Author information

Khanna Radhika, Mittal Sangeeta, Mohanty Sujata

Affiliations

1 Department of Biotechnology, Jaypee Institute of Information Technology, Noida, India.

2 Department of Computer Science and Information Technology, Jaypee Institute of Information Technology, Noida, India.

Publication information

J Comput Biol. 2017 Sep;24(9):882-894. doi: 10.1089/cmb.2016.0179. Epub 2017 Jun 20.

Abstract

The successful implementation of next generation sequencing (NGS), an advanced sequencing technology, has motivated scientists from diverse fields of biological research, especially genomics and transcriptomics, to generate large genomic data sets that make their analyses more robust and support stronger inferences. However, exploiting these huge data sets is a challenge for molecular biologists. To address this problem, computational software and hardware are being developed in parallel and have become an integral part of life science. While executing the "Genomics project of Indian Drosophila species," we found strings of Ns in the whole genome sequences generated on the Illumina platform. The present article aims to develop computer algorithms (MATLAB and Python based) for editing raw sequences, mainly by eliminating bad residues, before they are submitted to a publicly accessible sequence repository. These algorithms should help life scientists analyze large amounts of biological data in a short span of time.
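The abstract does not spell out the algorithm itself, so the following is only a minimal Python sketch of the kind of editing it describes: deleting runs of N (the standard symbol for an undetermined base) from FASTA-formatted sequences. The function names `remove_n_runs` and `clean_fasta`, the `min_run` parameter, and the choice to delete the runs outright rather than split or mask the sequence are assumptions made for illustration, not the authors' published method.

```python
import re


def remove_n_runs(sequence: str, min_run: int = 1) -> str:
    """Delete every run of at least `min_run` consecutive N/n characters.

    Hypothetical helper: the threshold and the delete-outright policy are
    illustrative assumptions, not taken from the article.
    """
    if min_run < 1:
        raise ValueError("min_run must be at least 1")
    pattern = re.compile(r"[Nn]{%d,}" % min_run)
    return pattern.sub("", sequence)


def clean_fasta(lines, min_run: int = 1):
    """Yield (header, cleaned_sequence) pairs from an iterable of FASTA lines.

    Sequence lines of one record are joined before cleaning, so a run of Ns
    that is wrapped across two lines is treated as a single run. Sequence
    lines that appear before the first header are ignored.
    """
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, remove_n_runs("".join(chunks), min_run)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, remove_n_runs("".join(chunks), min_run)


if __name__ == "__main__":
    example = [">read_1", "ACGTNN", "NNACGT", ">read_2", "ACNGT"]
    for name, seq in clean_fasta(example, min_run=2):
        print(name, seq)
```

Deleting residues shifts every downstream coordinate, so a sketch like this suits cleanup before submission but not analyses that depend on original positions. Raising `min_run` keeps isolated ambiguous bases and removes only the longer stretches.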

