Collins Krista, Gu Hong, Field Chris
Dalhousie University.
Stat Appl Genet Mol Biol. 2006;5:Article23. doi: 10.2202/1544-6115.1231. Epub 2006 Sep 17.
The spectral envelope, a frequency-based technique for analyzing categorical time series, is applied to amino acid sequences to examine their periodicity. The periodic signatures of such sequences are related to the secondary structure of the folding patterns in the gene. For a pair of sequences, we define a spectral envelope covariance that emphasizes the periodicities common to the two sequences. This covariance yields a similarity measure for the two sequences, which can then be used in a neighbour-joining algorithm to construct a phylogeny. We apply the spectral methods to myoglobin sequences from primates and cetaceans. The spectral envelope reflects the structure of this protein, and the tree constructed using spectral methods shows strong agreement with published trees. The spectral envelope can be used to explore similarities between and within different protein families. Since we do not require aligned sequences, the spectral methods can be used to create phylogenies across different protein families. We apply the method to 11 protein families from PANDIT, obtaining a tree in which the families are separated and the relationships among the families are given.
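As an illustration of the spectral envelope itself, the following is a minimal sketch and not the authors' implementation. It assumes one common formulation: the sequence is coded as indicator vectors with one category dropped, and the envelope at each Fourier frequency is the largest eigenvalue of V^{-1/2} Re(I(ω)) V^{-1/2}, where V is the covariance of the indicators and I(ω) is their raw, unsmoothed periodogram. The function name `spectral_envelope` and the choice to skip smoothing are assumptions made for brevity. The spectral envelope covariance and the similarity measure described above are not reproduced here, since the abstract does not give their definitions.

```python
import numpy as np

def spectral_envelope(seq):
    """Raw-periodogram spectral envelope of a categorical sequence (sketch).

    Assumes every symbol occurs at least once and that there are at least
    two distinct symbols, so the indicator covariance is invertible.
    """
    symbols = sorted(set(seq))
    n = len(seq)
    # Indicator coding; the last category is dropped so V is non-singular.
    Y = np.array([[1.0 if s == a else 0.0 for a in symbols[:-1]] for s in seq])
    Y -= Y.mean(axis=0)
    V = (Y.T @ Y) / n
    # Inverse square root of V via its eigendecomposition.
    w, U = np.linalg.eigh(V)
    V_inv_sqrt = U @ np.diag(w ** -0.5) @ U.T
    # Discrete Fourier transform of each indicator column.
    D = np.fft.fft(Y, axis=0)
    idx = np.arange(1, n // 2 + 1)
    freqs = idx / n
    env = np.empty(len(idx))
    for i, j in enumerate(idx):
        d = D[j]
        # Real part of the periodogram matrix at frequency j / n.
        P = np.real(np.outer(d, np.conj(d))) / n
        # Envelope value: largest eigenvalue of the standardized matrix.
        env[i] = np.linalg.eigvalsh(V_inv_sqrt @ P @ V_inv_sqrt)[-1]
    return freqs, env

# Toy usage: a sequence with exact period 3.
freqs, env = spectral_envelope("ABC" * 20)
peak_frequency = freqs[int(np.argmax(env))]
```

For this toy sequence the envelope peaks at frequency 1/3, matching its period of three. In practice the periodogram would normally be smoothed before taking eigenvalues, and a real amino acid sequence would have up to 20 categories rather than three.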