Homologous synteny block detection based on suffix tree algorithms.

Author information

Chen Yu-Lun, Chen Chien-Ming, Pai Tun-Wen, Leong Hon-Wai, Chong Ket-Fah

Affiliation

Department of Computer Science and Engineering and Center of Excellence for the Oceans, National Taiwan Ocean University, No. 2 Peining Road, Keelung, Taiwan 20224, Republic of China.

Publication information

J Bioinform Comput Biol. 2013 Dec;11(6):1343004. doi: 10.1142/S021972001343004X. Epub 2013 Dec 2.

Abstract

A synteny block is a set of contiguous genes located on the same chromosome and well conserved among various species. Despite long evolutionary processes and genome rearrangement events, large numbers of synteny blocks remain highly conserved across multiple species. Understanding the distribution of conserved gene blocks helps evolutionary biologists trace the diversity of life, and it also plays an important role in orthologous gene detection and gene annotation in the genomic era. In this work, we focus on collinear synteny detection, in which the order of genes is required to be well conserved among multiple species. To achieve this goal, suffix tree based algorithms for efficiently identifying homologous synteny blocks are proposed. The traditional suffix tree algorithm is modified by treating a chromosome as a string and encoding each gene in the chromosome as a symbol character. Hence, a suffix tree can be built for different query chromosomes from various species. We can then efficiently search for conserved synteny blocks, which are modeled as overlapped contiguous edges in our suffix tree. In addition, we define a novel Synteny Block Conserved Index (SBCI) to evaluate the relationship of synteny block distribution between two species. The SBCI can be applied as an evolutionary indicator for constructing a phylogenetic tree from multiple species, avoiding the large computational requirements of whole genome sequence alignment.
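The encoding described in the abstract (chromosome as a string, gene as a symbol) can be illustrated with a minimal sketch. This is not the authors' implementation: it uses an uncompressed suffix trie built from nested dictionaries instead of a linear-time suffix tree, it compares only two chromosomes, and it ignores strand and reversed gene order. The function names (`build_suffix_trie`, `shared_blocks`), the `min_len` parameter, and the gene labels are hypothetical, and the sketch assumes that homologous genes have already been assigned the same symbol.

```python
from typing import Dict, List, Tuple


def build_suffix_trie(genes: List[str]) -> Dict:
    """Build an uncompressed suffix trie for one chromosome.

    Each gene symbol is one "character"; every suffix of the gene list
    becomes a root-to-node path. Quadratic in size, so illustration only.
    """
    root: Dict = {}
    for start in range(len(genes)):
        node = root
        for symbol in genes[start:]:
            node = node.setdefault(symbol, {})
    return root


def shared_blocks(
    chrom_a: List[str], chrom_b: List[str], min_len: int = 2
) -> List[Tuple[int, Tuple[str, ...]]]:
    """Return maximal runs of genes in chrom_b that occur, in the same
    order and contiguously, somewhere in chrom_a.

    Each result is (start index in chrom_b, tuple of gene symbols).
    """
    trie = build_suffix_trie(chrom_a)
    blocks = []
    prev_len = 0  # match length found at the previous start position
    for start in range(len(chrom_b)):
        node = trie
        length = 0
        # Walk down the trie for as long as the genes keep matching.
        while start + length < len(chrom_b) and chrom_b[start + length] in node:
            node = node[chrom_b[start + length]]
            length += 1
        # A match shorter than the previous one is just its tail, so skip it.
        if length >= min_len and length >= prev_len:
            blocks.append((start, tuple(chrom_b[start:start + length])))
        prev_len = length
    return blocks


# Hypothetical gene orders on one chromosome from each of two species.
species_a = ["g1", "g2", "g3", "g4", "g5", "g6"]
species_b = ["g9", "g2", "g3", "g4", "g8", "g5", "g6"]

print(shared_blocks(species_a, species_b))
# [(1, ('g2', 'g3', 'g4')), (5, ('g5', 'g6'))]
```

A compressed suffix tree or a generalized suffix tree over several chromosomes would be needed for genome-scale input; the trie here only shows how gene order becomes a string-matching problem.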
