Dept. of Biomedical Sciences, University of Padua, Via Ugo Bassi 58/B, Padua 35121, Italy.
J Mol Biol. 2021 Apr 30;433(9):166900. doi: 10.1016/j.jmb.2021.166900. Epub 2021 Feb 27.
A large fraction of peptides and protein regions are disordered in isolation and fold upon binding. These regions, also called MoRFs, SLiMs or LIPs, are often associated with signaling and regulation processes. Despite their importance, only a limited number of examples are available in public databases, and their automatic detection at the proteome level remains problematic. Here we present FLIPPER, an automatic method for detecting structurally linear sub-regions or peptides that interact with another chain in a protein complex. FLIPPER is a random forest classifier that takes a protein structure as input and outputs the propensity of each amino acid to be part of a LIP region. Models are built from structural features such as intra- and inter-chain contacts, secondary structure, solvent accessibility in both the bound and unbound states, structural linearity and chain length. Evaluated on non-redundant independent datasets, FLIPPER achieves 99% precision and 99% sensitivity on PixelDB-25, and 87% precision and 88% sensitivity on DIBS-25. Finally, we used FLIPPER to process the entire Protein Data Bank and identified different classes of LIPs based on their binding modes and partner molecules. We provide a detailed description of these LIP categories and show that a large fraction of these regions are not detected by disorder predictors. All FLIPPER predictions are integrated in the MobiDB 4.0 database.
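The precision and sensitivity figures quoted above can be illustrated with a minimal Python sketch. It assumes that evaluation is done per residue and that propensities are converted to binary labels with a 0.5 cutoff; neither detail is stated in the abstract. The function names and the toy data are hypothetical and are not part of FLIPPER.

```python
def labels_from_propensity(propensities, threshold=0.5):
    """Convert per-residue propensities to binary labels (1 = LIP residue).

    The 0.5 threshold is an assumption made for illustration only.
    """
    return [1 if p >= threshold else 0 for p in propensities]


def precision_and_sensitivity(true_labels, predicted_labels):
    """Return (precision, sensitivity) for binary per-residue labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    # precision = TP / (TP + FP); sensitivity (recall) = TP / (TP + FN)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, sensitivity


# Toy chain of eight residues; values are invented, not taken from any dataset.
toy_true = [0, 0, 1, 1, 1, 1, 0, 0]
toy_propensity = [0.1, 0.6, 0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
toy_pred = labels_from_propensity(toy_propensity)
precision, sensitivity = precision_and_sensitivity(toy_true, toy_pred)
print(f"precision={precision:.2f} sensitivity={sensitivity:.2f}")
```

In this toy case three of the four predicted LIP residues are correct and three of the four true LIP residues are recovered, so both metrics equal 0.75.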