Department of Systems and Computational Biology, Albert Einstein College of Medicine, 1300 Morris Park Avenue, Bronx, New York 10461, United States.
J Chem Inf Model. 2018 Feb 26;58(2):532-542. doi: 10.1021/acs.jcim.7b00580. Epub 2018 Jan 30.
Domains that belong to the immunoglobulin (Ig) fold are extremely abundant in cell surface receptors, which play significant roles in cell-cell adhesion and signaling. Although the structures of domains in the Ig fold share a common topology of β-barrels, the functions of receptors in adhesion and signaling are regulated by highly heterogeneous binding between these domains. Additionally, only a small number of domains are directly involved in the binding between two multidomain receptors. It is challenging and time-consuming to experimentally detect the binding partners of a given receptor and then determine which specific domains in that receptor are responsible for binding. Current knowledge of the binding mechanisms of Ig-fold domains and their impact on cell adhesion and signaling is therefore very limited. A bioinformatics study can shed light on this topic from a systematic point of view. However, there has so far been no computational analysis of the structural and functional characteristics of the entire Ig fold. We constructed nonredundant structural data sets for all domains in the Ig fold, grouped according to their functions in cell adhesion and signaling. We found that the data sets of domains in adhesion receptors show different binding preferences from those of domains in signaling receptors. Using structural alignment, we further built a common structural template for each group within a domain data set. By mapping the protein-protein binding interface of each domain in a group onto the surface of its structural template, we found that binding interfaces overlap extensively within each specific group. These overlapping interfaces, which we call consensus binding interfaces, are distinguishable among the different data sets of domains. Finally, the residue compositions of the consensus interfaces were used as input features for multiple machine learning algorithms to predict whether the domains can form homotypic interactions with each other. In cross-validation tests, the prediction accuracies ranged between 0.6 and 0.8.
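The two quantities the abstract relies on, interface overlap on a shared template and residue composition of an interface, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of a Jaccard index as the overlap measure, and the encoding of composition as 20 amino acid fractions are all assumptions made here for illustration, and the input sequence and position sets are made-up toy values.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed feature order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def residue_composition(interface_residues):
    """Return 20 fractions, one per amino acid, in AMINO_ACIDS order.

    Assumed featurization: the abstract only says that residue
    compositions were used, not how they were encoded.
    """
    counts = Counter(r for r in interface_residues if r in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def interface_overlap(positions_a, positions_b):
    """Jaccard overlap between two sets of template positions.

    Assumed measure: the abstract does not state how overlap between
    mapped interfaces was quantified.
    """
    a, b = set(positions_a), set(positions_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Toy inputs (hypothetical, not taken from the study).
features = residue_composition("YYSTLLKE")
overlap = interface_overlap({10, 11, 12, 30}, {11, 12, 30, 31})
```

A vector such as `features` could then be passed, together with a homotypic or non-homotypic label, to any standard classifier and scored by cross-validation.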