Roach Jeffrey, Sharma Shantanu, Kapustina Maryna, Carter Charles W
Department of Biochemistry and Biophysics, University of North Carolina, Chapel Hill, North Carolina 27599, USA.
Proteins. 2005 Jul 1;60(1):66-81. doi: 10.1002/prot.20479.
A novel protein structure alignment technique has been developed that reduces much of the secondary and tertiary structure to a sequential representation, greatly accelerating many structural computations, including alignment. Constructed from incidence relations in the Delaunay tetrahedralization, alignments of the sequential representation describe structural similarities that cannot be expressed with rigid-body superposition, and they complement existing techniques that minimize root-mean-squared distance through superposition. Restricting the alignment to the largest substructure superimposable by a single rigid-body transformation yields an alignment suitable for root-mean-squared distance comparisons and visualization. Restricted alignments of a test set of histones and histone-like proteins gave superpositions nearly identical to those produced by the established structure alignment routines DaliLite and ProSup. Aligning three increasingly complex proteins (ferredoxin, cytidine deaminase, and carbamoyl phosphate synthetase) to themselves recovered previously identified regions of self-similarity. All-against-all similarity index comparisons performed on a test set of 45 class I and class II aminoacyl-tRNA synthetases closely reproduced the results of established distance matrix methods while requiring 1/16 the time. Principal component analysis of pairwise tetrahedral decomposition similarity of 2300 molecular dynamics snapshots of tryptophanyl-tRNA synthetase revealed discrete microstates within the trajectory, consistent with experimental results. The method is efficient enough for large-scale multiple structure alignment and is well suited to genomic and evolutionary investigations where no geometric model of similarity is known a priori.
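To illustrate why a sequential representation makes alignment fast, the sketch below aligns two label sequences with standard global dynamic programming (Needleman-Wunsch). This is an assumption made for illustration only: the abstract does not state how the per-residue descriptors are encoded or which alignment algorithm and scoring the authors use. The function name `global_alignment`, the single-character labels, and the `match`, `mismatch`, and `gap` values are all hypothetical.

```python
def global_alignment(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment of two label sequences.

    `a` and `b` stand in for hypothetical per-residue descriptors,
    one symbol per residue. Returns the optimal score and the list
    of aligned index pairs (matched or mismatched positions).
    """
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )

    # Trace back from the bottom-right corner to recover aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy label strings; real descriptors would come from the tetrahedralization.
    print(global_alignment("ABCD", "ACD"))
```

The cost is proportional to the product of the two sequence lengths and involves no coordinate superposition, which is consistent with the speed advantage the abstract reports, although the actual method may differ in every detail.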