Using growing self-organising maps to improve the binning process in environmental whole-genome shotgun sequencing.

Authors

Chan Chon-Kit Kenneth, Hsu Arthur L, Tang Sen-Lin, Halgamuge Saman K

Affiliation

Dynamic Systems & Control Group, Department of Mechanical Engineering, University of Melbourne, VIC 3010, Australia.

Publication information

J Biomed Biotechnol. 2008;2008:513701. doi: 10.1155/2008/513701.

Abstract

Metagenomic projects using whole-genome shotgun (WGS) sequencing produce many unassembled DNA sequences and small contigs. The step of clustering these sequences, based on biological and molecular features, is called binning. A reported binning strategy that combines oligonucleotide frequency with self-organising maps (SOM) shows high potential. We improve this strategy by identifying suitable training features, implementing a better clustering algorithm, and defining quantitative measures for assessing results. We investigated the suitability of di-, tri-, tetra-, and pentanucleotide frequencies in turn. The results show that dinucleotide frequency, unlike the other three, is not a sufficiently strong signature for binning 10 kb long DNA sequences. Furthermore, we observed that increasing the order of the oligonucleotide frequency may worsen the assignment result in some cases, which indicates that an optimal species-specific oligonucleotide frequency may exist. We replaced the SOM with a growing self-organising map (GSOM), which gave comparable results while running 7%-15% faster.
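
To make the training features concrete, the following is a minimal sketch of how an oligonucleotide frequency vector of order k (k = 2 to 5 for di- to pentanucleotides) might be computed from one sequence. The function name `kmer_frequencies` is hypothetical, and the abstract does not say whether the authors merged reverse-complement counts or applied any other normalisation, so this sketch assumes plain relative frequencies over a sliding window and skips windows containing ambiguous bases.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k):
    """Return relative frequencies of all 4**k k-mers, in lexicographic order.

    Illustrative helper only: not taken from the paper.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    # Fixed ordering so every sequence maps to a vector of the same layout.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        # Assumption: windows with ambiguous bases (e.g. N) are ignored.
        if set(window) <= alphabet:
            counts[window] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Feature vectors for the four orders compared in the abstract.
example = "ACGTACGTTTGACCA"
features = {k: kmer_frequencies(example, k) for k in (2, 3, 4, 5)}
```

The vector length grows as 4^k (16, 64, 256, and 1024 entries for k = 2 to 5), so higher orders give a richer signature but spread a 10 kb sequence's counts over many more dimensions.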

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2af2/2235928/2d983ebe1ca8/JBB2008-513701.001.jpg
