Rapid Bacterial Species Delineation Based on Parameters Derived From Genome Numerical Representations.

Authors

Maderankova Denisa, Jugas Robin, Sedlar Karel, Vitek Martin, Skutkova Helena

Affiliation

Department of Biomedical Engineering, Faculty of Electrical Engineering and Communication, Brno University of Technology, Technicka 12, 61600 Brno, Czech Republic.

Publication

Comput Struct Biotechnol J. 2019 Jan 9;17:118-126. doi: 10.1016/j.csbj.2018.12.006. eCollection 2019.

Abstract

Species delineation based on bacterial genomes is an essential part of prokaryote research. In silico genome-to-genome comparison methods are computationally demanding, but much less tedious and error-prone than wet-lab methods. In this paper, we present a novel method for the delineation of bacterial genomes based on genomic signal processing. The proposed method uses numerical representations of whole bacterial genomes, namely the phase signal and the cumulated phase signal, from which four parameters are derived for each genome. The parameters characterize a genome, and their calculation is independent of the other genomes in the delineation dataset. The delineation itself is performed by calculating the average similarity of the parameters. The method was statistically verified on 1826 bacterial genomes. A similarity threshold of 96% was set based on the receiver operating characteristic curve, yielding a sensitivity of 99.78% and a specificity of 97.25%. Additionally, a comparative analysis with standard delineation tools was conducted on another 33 bacterial genomes, because these tools could not process the dataset of 1826 genomes on a desktop computer. The proposed method achieved delineation results comparable to or better than those of the standard tools. Besides the excellent delineation results, another great advantage of the method is its low computational demand, which enables the delineation of thousands of genomes on a desktop computer. The calculation of the parameters takes tens of minutes for thousands of genomes. Moreover, the parameters can be calculated in advance and stored in a database, so that the delineation itself is then completed in a matter of seconds.
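
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the nucleotide-to-phase mapping is an assumption borrowed from a common complex representation of DNA (A = 1+j, C = -1-j, G = -1+j, T = 1-j), the abstract does not name the four parameters, and the similarity formula in `average_similarity` is a hypothetical stand-in for the "average similarity" it mentions. Only the 96% threshold comes from the text.

```python
import math
from itertools import accumulate

# ASSUMPTION: phase (in radians) of each nucleotide under the complex mapping
# A=1+j, C=-1-j, G=-1+j, T=1-j. The abstract does not state the mapping.
PHASE = {
    "A": math.pi / 4,
    "C": -3 * math.pi / 4,
    "G": 3 * math.pi / 4,
    "T": -math.pi / 4,
}


def phase_signal(seq):
    """Map each nucleotide to its phase; symbols outside ACGT are skipped."""
    return [PHASE[base] for base in seq.upper() if base in PHASE]


def cumulated_phase(seq):
    """Running sum of the phase signal along the sequence."""
    return list(accumulate(phase_signal(seq)))


def average_similarity(params_a, params_b):
    """HYPOTHETICAL similarity: mean relative agreement per parameter, in percent."""
    scores = []
    for a, b in zip(params_a, params_b):
        denom = max(abs(a), abs(b))
        if denom == 0:
            scores.append(1.0)  # both parameters are zero, so they agree fully
        else:
            scores.append(max(0.0, 1.0 - abs(a - b) / denom))
    return 100.0 * sum(scores) / len(scores)


def same_species(params_a, params_b, threshold=96.0):
    """Apply the 96% similarity threshold reported in the abstract."""
    return average_similarity(params_a, params_b) >= threshold
```

Because each genome's parameter vector depends only on that genome, the vectors could be computed once and cached (for example in a dictionary keyed by accession), leaving only the cheap pairwise `same_species` comparison at delineation time.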

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0c55/6352304/d5ca8927ae9b/gr1.jpg
