Hampson S E, Gaut B S, Baldi P
School of Information and Computer Science, University of California, Irvine, Irvine, CA 92697, USA.
Bioinformatics. 2005 Apr 15;21(8):1339-48. doi: 10.1093/bioinformatics/bti168. Epub 2004 Dec 7.
Over evolutionary time, processes such as point mutations and the insertion, deletion and inversion of variable-sized segments progressively degrade the homology of duplicated chromosomal regions, making those regions correspondingly harder to identify. Existing algorithms that attempt to detect homology are based on shared-gene density and colinearity, and in some cases also on strand information.
Here, we develop CloseUp, a new algorithm for the statistical detection of chromosomal homology. CloseUp uses shared-gene density alone, which fully exploits the observation that relaxing colinearity requirements generally benefits homology detection, while also optimizing computation time. CloseUp has two components: the identification of candidate homologous regions, followed by their statistical evaluation using Monte Carlo methods and data randomization. Using both artificial and real data, we compared CloseUp with two existing programs for chromosomal homology detection (ADHoRe and LineUp) and found that CloseUp generally compares favorably.
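The two-step idea described above (find dense clusters of shared genes, then assess them against randomized data) can be illustrated with a minimal sketch. This is not the published CloseUp implementation: the function names, the square-window density statistic, the brute-force search and the choice to shuffle gene order as the null model are all assumptions made for illustration.

```python
import random


def max_shared_density(a, b, window):
    """Largest number of shared-gene pairs inside any window x window block.

    a and b are gene-family labels in chromosomal order. Order inside the
    block is ignored, so no colinearity is required (illustrative statistic).
    """
    positions_b = {}
    for j, gene in enumerate(b):
        positions_b.setdefault(gene, []).append(j)
    # Every (i, j) with a[i] == b[j] is one point of the dot plot.
    matches = [(i, j) for i, gene in enumerate(a)
               for j in positions_b.get(gene, [])]
    rows = sorted({i for i, _ in matches})
    cols = sorted({j for _, j in matches})
    best = 0
    # Brute force: an optimal block can always start on a match row and column.
    for i0 in rows:
        for j0 in cols:
            count = sum(1 for i, j in matches
                        if i0 <= i < i0 + window and j0 <= j < j0 + window)
            best = max(best, count)
    return best


def monte_carlo_p_value(a, b, window, trials=200, seed=0):
    """Empirical p-value of the observed density under shuffled gene order."""
    rng = random.Random(seed)
    observed = max_shared_density(a, b, window)
    shuffled = list(b)
    at_least_as_dense = 0
    for _ in range(trials):
        rng.shuffle(shuffled)  # assumed null model: random gene order on b
        if max_shared_density(a, shuffled, window) >= observed:
            at_least_as_dense += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (at_least_as_dense + 1) / (trials + 1)


# Toy data: six shared genes sit close together on both chromosomes,
# but in scrambled order on the second one.
chrom_a = [f"g{k}" for k in range(40)]
chrom_b = [f"h{k}" for k in range(40)]
chrom_b[20:26] = ["g12", "g10", "g14", "g11", "g15", "g13"]

observed, p_value = monte_carlo_p_value(chrom_a, chrom_b, window=8)
print(observed, p_value)
```

In this toy case the cluster is found even though the shared genes are not colinear, which is the situation where a density-only statistic is expected to help. A practical tool would need a far more efficient candidate search than the brute-force loop shown here.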
CloseUp and supplementary information are available at http://www.igb.uci.edu/servers/cgss.html