Department of Biochemistry and Molecular Biology and Institute of Bioinformatics, Computational Systems Biology Laboratory, University of Georgia, Athens, GA 30602, USA.
Nucleic Acids Res. 2011 Dec;39(22):e150. doi: 10.1093/nar/gkr766. Epub 2011 Sep 29.
Existing methods for orthologous gene mapping suffer from two general problems: (i) when based on phylogenetic analyses, they are computationally too slow and their results are too difficult to interpret for automated large-scale applications; or (ii) when based on sequence similarity information, they lack a sound basis and are therefore prone to errors in complex situations involving horizontal gene transfers and gene fusions. We present a novel algorithm, Global Optimization Strategy (GOST), for orthologous gene mapping that combines sequence similarity with contextual (working partners) information in a combinatorial optimization framework. Genome-scale applications of GOST show substantial improvements over the predictions of three popular sequence similarity-based orthology mapping programs. Our analysis indicates that the algorithm overcomes the intrinsic issues that sequence similarity-based methods face when orthology mapping involves gene fusions and horizontal gene transfers. Our program runs as efficiently as the most efficient sequence similarity-based algorithm in the public domain. GOST is freely downloadable at http://csbl.bmb.uga.edu/~maqin/GOST.