Université de Paris, Inserm UMR_S 1134 - BIGR, INTS, 6 rue Alexandre Cabanel, 75015 Paris, France; Laboratoire d'Excellence GR-Ex, 75015 Paris, France.
J Mol Biol. 2021 May 28;433(11):166882. doi: 10.1016/j.jmb.2021.166882. Epub 2021 Feb 20.
Information on protein flexibility is essential for understanding crucial molecular mechanisms such as protein stability, interactions with other molecules, and protein function in general. The B-factor obtained from X-ray crystallography experiments is the most common flexibility descriptor, available for the majority of resolved protein structures. Because the gap between the number of resolved protein structures and the number of available protein sequences is continuously growing, it is important to provide computational tools that predict protein flexibility from amino acid sequence. In the current study, we report MEDUSA, a deep-learning-based protein flexibility prediction tool (https://www.dsimb.inserm.fr/MEDUSA). MEDUSA uses evolutionary information extracted from homologous protein sequences, together with amino acid physico-chemical properties, as input to a convolutional neural network that assigns a flexibility class to each position of the protein sequence. Trained on a non-redundant dataset of X-ray structures, MEDUSA provides flexibility predictions in two, three, and five classes. MEDUSA is freely available as a web server that provides a clear visualization of the prediction results, as well as a standalone utility (https://github.com/DSIMB/medusa). Analysis of the MEDUSA output allows a user to identify potentially highly deformable protein regions and the general dynamic properties of the protein.
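To make the notion of a per-position "flexibility class" concrete, the following minimal sketch shows one way raw B-factors could be turned into discrete class labels. The abstract does not state how MEDUSA defines its classes, so everything here is an assumption for illustration only: the per-chain z-score normalization, the equal-frequency cut points, and the hypothetical function names `normalize_b_factors` and `assign_flexibility_classes` are not taken from the MEDUSA code base.

```python
from statistics import mean, pstdev


def normalize_b_factors(b_factors):
    """Z-score normalize the raw B-factors of one protein chain.

    Assumption: per-chain normalization, so values are comparable
    across structures solved at different resolutions.
    """
    mu = mean(b_factors)
    sigma = pstdev(b_factors)
    if sigma == 0:
        # All residues have the same B-factor: no spread to normalize.
        return [0.0 for _ in b_factors]
    return [(b - mu) / sigma for b in b_factors]


def assign_flexibility_classes(b_factors, n_classes=3):
    """Label each residue from 0 (most rigid) to n_classes - 1 (most flexible).

    Assumption: class boundaries are equal-frequency cut points over the
    chain's own normalized values. A value equal to a cut point goes to
    the higher class.
    """
    z_scores = normalize_b_factors(b_factors)
    ranked = sorted(z_scores)
    n = len(ranked)
    # Cut points at the 1/k, 2/k, ... positions of the sorted values.
    thresholds = [ranked[(i * n) // n_classes] for i in range(1, n_classes)]
    # A residue's class is the number of cut points it reaches or exceeds.
    return [sum(value >= t for t in thresholds) for value in z_scores]


# Hypothetical B-factors for a six-residue fragment.
example = [10, 12, 15, 20, 30, 45]
print(assign_flexibility_classes(example, n_classes=3))
```

Calling the same function with `n_classes=2` or `n_classes=5` would give labelings analogous to the two- and five-class settings mentioned above, again under the stated assumptions rather than MEDUSA's actual class definitions.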