A Systematic Approach to Bacterial Phylogeny Using Order Level Sampling and Identification of HGT Using Network Science.

Author information

Khaledian Ehdieh, Brayton Kelly A, Broschat Shira L

Affiliations

School of Electrical Engineering and Computer Science, Washington State University, P.O. Box 642752, Pullman, WA 99164, USA.

Department of Veterinary Microbiology and Pathology, Washington State University, P.O. Box 647040, Pullman, WA 99164, USA.

Publication information

Microorganisms. 2020 Feb 24;8(2):312. doi: 10.3390/microorganisms8020312.

Abstract

Reconstructing and visualizing phylogenetic relationships among living organisms is a fundamental challenge because not all organisms share the same genes. As a result, the first phylogenetic visualizations employed a single gene, e.g., an rRNA gene, sufficiently conserved to be present in all organisms but divergent enough to discriminate between groups. As more genome data became available, researchers began concatenating different combinations of genes or proteins to construct phylogenetic trees believed to be more robust because they incorporated more information. However, these genes or proteins were chosen ad hoc. The large number of complete genome sequences available today allows whole genomes, rather than an ad hoc set of genes, to be used to analyze relationships among organisms. We present a systematic approach for constructing a phylogenetic tree based on simultaneously clustering the complete proteomes of 360 bacterial species. From the homologous clusters, we identify 49 protein sequences shared by 99% of the organisms and use them to build a tree. Of the 49 sequences, 47 have homologous sequences in both archaea and eukarya. The clusters are also used to create a network from which bacterial species with genes horizontally transferred from other phyla are identified.
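The selection step described in the abstract (keeping only protein clusters shared by 99% of the organisms) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `select_core_clusters`, the dictionary layout mapping a cluster ID to the set of genome IDs that contain a member of that cluster, and the parameter names are all assumptions made for illustration. The abstract does not say how the clusters are computed or stored.

```python
from typing import Dict, List, Set


def select_core_clusters(
    clusters: Dict[str, Set[str]],
    n_genomes: int,
    min_fraction: float = 0.99,
) -> List[str]:
    """Return the IDs of clusters present in at least `min_fraction` of genomes.

    `clusters` maps a (hypothetical) cluster ID to the set of genome IDs that
    contribute at least one protein to that cluster. The 0.99 default mirrors
    the 99% threshold stated in the abstract; everything else is assumed.
    """
    if n_genomes <= 0:
        raise ValueError("n_genomes must be positive")
    core = [
        cluster_id
        for cluster_id, genomes in clusters.items()
        if len(genomes) / n_genomes >= min_fraction
    ]
    # Sort so the result does not depend on dictionary insertion order.
    return sorted(core)


if __name__ == "__main__":
    # Toy data: four genomes, three clusters, with a looser threshold for the demo.
    toy_clusters = {
        "c1": {"g1", "g2", "g3", "g4"},
        "c2": {"g1", "g2"},
        "c3": {"g1", "g2", "g3"},
    }
    print(select_core_clusters(toy_clusters, n_genomes=4, min_fraction=0.75))
```

Under this reading, with 360 genomes and a 0.99 threshold, a cluster would need members from at least 357 genomes to qualify, since 356/360 falls just below 99%.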

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ed27/7074868/bb8e63374cd8/microorganisms-08-00312-g001.jpg
