Nandy Ashesh, Ghosh Ambarnil, Nandy Papiya
School of Environmental Studies, Jadavpur University, Kolkata, India.
In Silico Biol. 2009;9(3):77-87.
We propose a new method for comparing sequences of protein families by generating numerical characterizations from a 20-dimensional (20D) representation. A walk along axes representing the amino acids yields a vector for each sequence; the vector components can be used to derive distance matrices between sequences, and the vector magnitudes can be used to assess the similarity or dissimilarity of different sequences. The distance matrices allow phylogenetic trees to be constructed without multiple alignments or any other model dependencies. In this paper, we test the technique on human globin gene sequences and then apply it to a contemporary issue, the evolutionary relationships of rat and human voltage-gated sodium channel alpha subunits, and compare the results with the published literature. The close match of the results demonstrates the reliability and ease of use of the method.
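The abstract does not spell out how the walk is turned into a vector, so the following Python fragment is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that each residue moves the walk one unit along its amino acid's axis, that the descriptor is the mean of the walk coordinates over all steps, and that sequences are compared by the Euclidean distance between descriptors. The axis order in `AMINO_ACIDS` and the names `walk_descriptor`, `magnitude`, and `distance_matrix` are hypothetical.

```python
import math

# Assumed (arbitrary) assignment of the 20 amino acids to the 20 axes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AXIS = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def walk_descriptor(seq):
    """Return the mean position of a 20D walk over the sequence.

    Assumption: each residue advances the walk by one unit along its own
    axis, and the descriptor is the average of all visited positions.
    """
    position = [0] * 20      # current point of the walk
    running_sum = [0] * 20   # sum of all visited points, per axis
    steps = 0
    for residue in seq.upper():
        axis = AXIS.get(residue)
        if axis is None:
            continue  # skip symbols outside the 20 standard amino acids
        position[axis] += 1
        steps += 1
        for k in range(20):
            running_sum[k] += position[k]
    if steps == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [s / steps for s in running_sum]


def magnitude(vector):
    """Length of a descriptor vector, usable as a single-number index."""
    return math.sqrt(sum(x * x for x in vector))


def distance_matrix(seqs):
    """Pairwise Euclidean distances between the descriptors of seqs."""
    vectors = [walk_descriptor(s) for s in seqs]
    return [[math.dist(u, v) for v in vectors] for u in vectors]
```

A matrix produced this way could be passed to any standard distance-based tree-building method, such as neighbor joining or UPGMA; no alignment step is involved because each sequence is reduced to a fixed-length vector on its own.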