College of Information and Electrical Engineering, China Agricultural University, Beijing, 100083, China.
Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing, 100871, China.
BMC Bioinformatics. 2022 Feb 15;23(1):72. doi: 10.1186/s12859-022-04599-w.
The liquid-liquid phase separation (LLPS) of biomolecules in cells underpins the formation of membraneless organelles, which are condensates of protein, nucleic acid, or both, and play critical roles in cellular function. Dysregulation of LLPS is implicated in a number of diseases. Although the LLPS of biomolecules has been investigated intensively in recent years, knowledge of the prevalence and distribution of phase separation proteins (PSPs) still lags behind. Developing computational methods to predict PSPs is therefore of great importance for a comprehensive understanding of the biological function of LLPS.
Based on the PSPs collected in LLPSDB, we developed a sequence-based prediction tool for LLPS proteins (PSPredictor), an attempt at general-purpose PSP prediction that does not depend on specific protein types. Our method combines compositional and sequential information during the protein embedding stage and adopts a machine learning algorithm for the final prediction. The proposed method achieves a tenfold cross-validation accuracy of 94.71% and outperforms previously reported PSP prediction tools. For further applications, we built a user-friendly PSPredictor web server ( http://www.pkumdl.cn/PSPredictor ), which is accessible for prediction of potential PSPs.
PSPredictor can identify novel scaffold proteins for stress granules and predict PSP candidates in the human genome for further study. The PSPredictor web server ( http://www.pkumdl.cn/PSPredictor ) provides valuable information for recognizing potential PSPs.
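The abstract does not specify the embedding or the classifier, so the following is only a minimal sketch of the general idea of combining compositional and sequential information from a protein sequence, plus a tenfold split for cross-validation. Everything here is an assumption for illustration: the function names `embed` and `kfold_indices`, the choice of residue frequencies for composition, and the choice of dipeptide frequencies as a stand-in for sequence-order information. It is not PSPredictor's actual implementation.

```python
import random
from collections import Counter
from itertools import product

# The 20 standard amino acids; non-standard symbols are ignored below.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
AA_SET = set(AMINO_ACIDS)
DIPEPTIDES = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]


def embed(sequence):
    """Hypothetical embedding: 20 residue frequencies (composition)
    followed by 400 dipeptide frequencies (local sequence order)."""
    seq = sequence.upper()
    mono = Counter(aa for aa in seq if aa in AA_SET)
    pairs = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono = sum(mono.values()) or 1
    # Only adjacent pairs made of two standard residues are counted.
    n_pairs = sum(pairs[d] for d in DIPEPTIDES) or 1
    composition = [mono[a] / n_mono for a in AMINO_ACIDS]
    order = [pairs[d] / n_pairs for d in DIPEPTIDES]
    return composition + order


def kfold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test
```

In use, each labelled sequence would be passed through `embed`, and any standard classifier could be trained on the training indices of each fold and scored on the held-out fold; the mean of the ten fold accuracies is the tenfold cross-validation accuracy.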