Knisley Debra J, Knisley Jeff R
Department of Mathematics and Statistics, East Tennessee State University, Johnson City, TN 37614; Institute for Quantitative Biology, East Tennessee State University, Johnson City, TN 37614.
BMC Proc. 2014 Aug 28;8(Suppl 2 Proceedings of the 3rd Annual Symposium on Biologica):S7. doi: 10.1186/1753-6561-8-S2-S7. eCollection 2014.
We represent the protein structure of scTIM with a graph-theoretic model. We construct a hierarchical graph with three layers: a top level, a midlevel, and a bottom level. The top level graph represents the whole protein, and each of its vertices represents a substructure of the protein. In turn, each substructure is represented by a graph whose vertices are amino acids. Finally, each amino acid is represented by a graph whose vertices are atoms. We use this representation to model the effects of a mutation on the protein.
The top level graph has 19 vertices (substructures), so there are 19 distinct graphs at the midlevel. The vertices of each midlevel graph represent amino acids, and each amino acid is in turn represented by a graph whose vertices are the atoms of the residue. All edges are determined by proximity in the protein's 3D structure. Each vertex in the bottom level is labelled with the mass of the atom it represents. We use graph-theoretic measures that incorporate vertex weights to assign graph-based attributes to the amino acid graphs. These attributes then serve as vertex weights for the substructure graphs at the midlevel, and vertex-weighted graph-theoretic measures are calculated for each midlevel graph. Finally, the vertices of the top level graph are weighted with the attributes of the corresponding midlevel substructure graphs.
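The bottom-up weighting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation. The abstract does not name the vertex-weighted measure, so `weighted_edge_sum` (the sum of the products of endpoint weights over all edges) is a hypothetical stand-in. The residue names, coordinates, rounded masses, distance cutoffs, and the use of centroids to place residues are likewise invented for the example.

```python
from itertools import combinations
from math import dist


def proximity_edges(coords, cutoff):
    """Join every pair of vertices whose 3D distance is at most `cutoff`."""
    names = sorted(coords)
    return [(a, b) for a, b in combinations(names, 2)
            if dist(coords[a], coords[b]) <= cutoff]


def weighted_edge_sum(edges, weight):
    """Hypothetical vertex-weighted measure: sum of w(u) * w(v) over edges."""
    return sum(weight[u] * weight[v] for u, v in edges)


def centroid(points):
    """Mean position of a collection of 3D points."""
    pts = list(points)
    return tuple(sum(axis) / len(pts) for axis in zip(*pts))


def level_attribute(coords, weight, cutoff):
    """Build one proximity graph and reduce it to a single attribute."""
    return weighted_edge_sum(proximity_edges(coords, cutoff), weight)


# Toy bottom-level data (assumed): atom name -> (coordinates, rounded mass).
residues = {
    "res1": {"N": ((0.0, 0.0, 0.0), 14),
             "CA": ((1.5, 0.0, 0.0), 12),
             "C": ((3.0, 0.0, 0.0), 12)},
    "res2": {"N": ((4.0, 0.0, 0.0), 14),
             "CA": ((5.5, 0.0, 0.0), 12),
             "O": ((5.5, 1.2, 0.0), 16)},
}

# Bottom level: each amino acid graph yields one attribute.
residue_attr = {}
residue_pos = {}
for name, atoms in residues.items():
    coords = {atom: xyz for atom, (xyz, _) in atoms.items()}
    masses = {atom: mass for atom, (_, mass) in atoms.items()}
    residue_attr[name] = level_attribute(coords, masses, cutoff=1.6)
    residue_pos[name] = centroid(coords.values())

# Midlevel: amino acid attributes become the vertex weights of the
# substructure graph, which is reduced to an attribute in the same way.
substructure_attr = level_attribute(residue_pos, residue_attr, cutoff=4.0)
```

The same `level_attribute` call would be repeated once more to weight the vertices of the top level graph, with substructure attributes as the weights. Changing one atom or residue in the toy data changes every attribute above it, which is the propagation the model relies on.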
We can visualize which mutations are more influential than others by mapping a visual property, such as vertex size, to the increase or decrease in a graph-theoretic measure. Global graph-theoretic measures, such as the number of triangles or the number of spanning trees, can change as a result of a mutation. Hence this method provides a way to visualize the global changes that result from a small, seemingly inconsequential local change.
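To make the two global measures concrete, the following minimal sketch counts triangles directly and counts spanning trees with Kirchhoff's matrix-tree theorem (the determinant of the graph Laplacian with one row and column removed). It assumes a simple undirected graph with no self-loops. The four-vertex graph is a toy example, not data from scTIM, and deleting one edge is only a stand-in for the local change a mutation might cause.

```python
from fractions import Fraction
from itertools import combinations


def adjacency(vertices, edges):
    """Adjacency sets of a simple undirected graph."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def count_triangles(vertices, edges):
    """Number of vertex triples that are pairwise adjacent."""
    adj = adjacency(vertices, edges)
    return sum(1 for a, b, c in combinations(sorted(vertices), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])


def count_spanning_trees(vertices, edges):
    """Matrix-tree theorem: determinant of a reduced Laplacian."""
    order = sorted(vertices)
    if len(order) <= 1:
        return 1
    adj = adjacency(order, edges)
    kept = order[1:]  # drop the first vertex's row and column
    m = [[Fraction(len(adj[u])) if u == v
          else Fraction(-1 if v in adj[u] else 0)
          for v in kept] for u in kept]
    # Exact Gaussian elimination; the determinant is the pivot product.
    det = Fraction(1)
    n = len(m)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return 0  # singular matrix: the graph is disconnected
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return int(det)


# Toy graph: four mutually adjacent vertices, then the same graph
# with a single edge removed to mimic a small local change.
vertices = [0, 1, 2, 3]
before = list(combinations(vertices, 2))
after = [e for e in before if e != (0, 1)]

triangles_before = count_triangles(vertices, before)
triangles_after = count_triangles(vertices, after)
trees_before = count_spanning_trees(vertices, before)
trees_after = count_spanning_trees(vertices, after)
```

In this toy case, removing one edge halves both counts (triangles drop from 4 to 2, spanning trees from 16 to 8), which shows how a single local edit can register clearly in a global measure. Exact `Fraction` arithmetic is used so the spanning tree count comes out as an exact integer.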
This modelling method provides a novel approach to visualizing protein structures and the consequences of amino acid deletions, insertions, or substitutions, and offers a new way to gain insight into the consequences of diseases caused by genetic mutations.