Beijing Institute of Petrochemical Technology, Beijing 102617, China.
Comput Biol Chem. 2024 Aug;111:108098. doi: 10.1016/j.compbiolchem.2024.108098. Epub 2024 May 17.
Cell-penetrating peptides have attracted much attention for their ability to cross the cell membrane barrier, which can improve drug bioavailability, reduce side effects, and promote the development of gene therapy. Traditional wet-lab methods for identifying them are time-consuming and costly. Computational methods offer a faster, lower-cost alternative, but their accuracy and reliability still need improvement. To address this, this study proposes a feature fusion-based prediction model in which the pre-trained protein language models ProtBERT and ESM-2 serve as feature extractors. The features extracted by the two models are fused to obtain a more comprehensive and effective representation, which is then passed through a linear mapping to produce the prediction. In experiments on public datasets, the method achieves an AUC as high as 0.983 and shows high accuracy and reliability in cell-penetrating peptide prediction.
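The fusion step described above can be illustrated with a minimal sketch. The abstract does not say how the two feature sets are pooled or fused, so everything here is an assumption made for illustration: mean pooling over residues, fusion by concatenation, a sigmoid on top of the linear mapping, toy embedding sizes, and random vectors standing in for real ProtBERT and ESM-2 outputs. All function names are hypothetical.

```python
import math
import random


def mean_pool(per_residue):
    """Average per-residue vectors into one fixed-length peptide vector."""
    n = len(per_residue)
    dim = len(per_residue[0])
    return [sum(row[j] for row in per_residue) / n for j in range(dim)]


def fuse(feat_a, feat_b):
    """Fuse two feature vectors by concatenation (assumed fusion scheme)."""
    return list(feat_a) + list(feat_b)


def linear_predict(features, weights, bias):
    """Linear mapping followed by a sigmoid, giving a score in (0, 1)."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


rng = random.Random(0)
SEQ_LEN = 12          # toy peptide length
DIM_A, DIM_B = 8, 6   # toy sizes, not the real model hidden sizes

# Random stand-ins for per-residue embeddings from the two language models.
emb_a = [[rng.gauss(0, 1) for _ in range(DIM_A)] for _ in range(SEQ_LEN)]
emb_b = [[rng.gauss(0, 1) for _ in range(DIM_B)] for _ in range(SEQ_LEN)]

fused = fuse(mean_pool(emb_a), mean_pool(emb_b))

# Untrained placeholder weights; in practice these would be learned.
weights = [rng.gauss(0, 0.1) for _ in range(len(fused))]
prob = linear_predict(fused, weights, bias=0.0)
```

With trained weights, `prob` would be read as the predicted probability that the peptide is cell-penetrating. Concatenation keeps the two representations side by side and leaves it to the linear layer to weight them.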