Mapping Strengths and Weaknesses of Different Clustering Approaches to Deciphering Bacterial Chimerism.

Authors

Burks David J, Azad Rajeev K

Affiliations

Department of Biological Sciences, BioDiscovery Institute, University of North Texas, Denton, Texas, USA.

Department of Mathematics, University of North Texas, Denton, Texas, USA.

Publication information

OMICS. 2022 Aug;26(8):422-439. doi: 10.1089/omi.2022.0062. Epub 2022 Aug 4.

Abstract

Bacterial genomes are chimeras of DNA of different ancestries. Deconstructing chimeric genomes is central to understanding the evolutionary trajectories of their disparate components, and thus of the organisms as a whole, in the light of their evolutionary contexts. Of specific interest is delineating and quantifying the native (vertically inherited) and alien (horizontally acquired) components of bacterial genomes, and specifying the genomic fractions that represent different donor sources. An agglomerative clustering procedure that prioritizes grouping of proximal similar genomic segments has previously been invoked for this purpose in conjunction with a recursive segmentation procedure. Surprisingly, however, the relative strengths and weaknesses of different clustering approaches to deciphering bacterial chimerism have not yet been investigated, despite the need to robustly interpret the tens of thousands of completely sequenced bacterial genomes and nearly complete genome assemblies available in public databases. To bridge this knowledge gap and develop more robust approaches, we assessed different clustering methods, including segment order-based (proximal) clustering, hierarchical clustering, affinity propagation clustering, and a novel network clustering approach, on chimeric genomes modeled after bacterial genomes representing a broad spectrum of compositional complexity. Although segment order-based clustering and network clustering compared favorably with the other approaches in discriminating between native and alien DNA at genome-optimized settings, network clustering performed consistently better than the other methods at parametric settings optimized on all test genomes together. Segment order-based clustering and hierarchical clustering outperformed the other methods in identifying alien DNA while preserving donor identity in the genomes. Our study highlights the strengths and weaknesses of the different approaches and suggests how these can be leveraged to achieve a more robust deconstruction of bacterial chimerism.
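To make the idea of segment order-based (proximal) clustering concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not state the similarity measure, thresholds, or segmentation details, so the dinucleotide profiles, the Jensen-Shannon divergence, the threshold of 0.05, and all function names (`kmer_counts`, `normalize`, `js_divergence`, `proximal_clusters`) are assumptions chosen purely for illustration.

```python
from collections import Counter
from math import log2


def kmer_counts(seq, k=2):
    """Count overlapping k-mers in a DNA segment (assumes len(seq) >= k)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def normalize(counts):
    """Turn raw k-mer counts into a frequency profile that sums to 1."""
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()}


def js_divergence(p, q):
    """Jensen-Shannon divergence (log base 2, range 0..1) of two profiles."""
    keys = set(p) | set(q)
    m = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in keys}

    def kl(a):
        # Kullback-Leibler divergence of profile a from the mixture m.
        return sum(a[x] * log2(a[x] / m[x]) for x in a if a[x] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


def proximal_clusters(segments, threshold=0.05, k=2):
    """Merge each segment into the cluster just before it when compositionally
    similar; otherwise start a new cluster. Only neighbors are compared, which
    is the assumed meaning of "segment order-based" here.

    Returns a list of clusters, each a list of segment indices.
    """
    clusters = []        # segment indices per cluster
    cluster_counts = []  # pooled k-mer counts per cluster
    for i, seg in enumerate(segments):
        counts = kmer_counts(seg, k)
        if clusters:
            d = js_divergence(normalize(cluster_counts[-1]), normalize(counts))
            if d < threshold:
                clusters[-1].append(i)
                cluster_counts[-1] += counts
                continue
        clusters.append([i])
        cluster_counts.append(counts)
    return clusters


if __name__ == "__main__":
    # Toy segments: two AT-rich neighbors, one GC-rich insert, one AT-rich tail.
    toy = ["AT" * 50, "TA" * 50, "GC" * 50, "AT" * 50]
    print(proximal_clusters(toy))  # [[0, 1], [2], [3]]
```

In the toy run, the last AT-rich segment is not rejoined with the first two because only adjacent segments are compared. Recovering such non-adjacent groupings would need a further step that compares all clusters with one another, for example one of the other clustering methods named in the abstract.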

