Innovative Institute of Chinese Medicine and Pharmacy, Academy for Interdiscipline, Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China.
School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China.
Int J Biol Macromol. 2024 Apr;264(Pt 2):130638. doi: 10.1016/j.ijbiomac.2024.130638. Epub 2024 Mar 7.
The rational modification of siRNA molecules is crucial for ensuring their drug-like properties. Machine learning-based prediction of chemically modified siRNA (cm-siRNA) efficiency can streamline the design of siRNA chemical modifications, saving time and cost in siRNA drug development. However, existing in silico methods are limited by small datasets, inadequate data representation, and a lack of interpretability. In this study, we therefore developed Cm-siRPred, an algorithm based on a multi-view learning strategy. It represents the double-strand sequences, chemical modifications, and physicochemical properties of cm-siRNA as separate views, uses a cross-attention model to globally correlate the resulting representation vectors, and applies a two-layer CNN module to learn local correlation features. The algorithm shows strong performance in cross-validation experiments, on an independent dataset, and in case studies on approved siRNA drugs, demonstrating its robustness and generalization ability. In addition, we developed a user-friendly webserver that enables efficient prediction of cm-siRNA efficiency and assists in the design of chemical modifications for siRNA drugs. In summary, Cm-siRPred is a practical tool that provides technical support for research on siRNA chemical modification and drug efficiency, and that can assist in the development of novel small nucleic acid drugs. Cm-siRPred is freely available at https://cellknowledge.com.cn/sirnapredictor/.
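The pipeline outlined in the abstract (several views of a cm-siRNA, cross-attention to correlate them globally, then a two-layer CNN for local features) can be illustrated with a minimal NumPy sketch. This is not the published Cm-siRPred implementation: the function names, the duplex length of 21, the feature widths, the random untrained weights, and the sigmoid output are all assumptions chosen only to make the data flow concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(query_view, context_view, d_k, rng):
    """Single-head cross-attention: each position of query_view attends to
    every position of context_view. Weights are random (untrained)."""
    w_q = rng.normal(scale=0.1, size=(query_view.shape[1], d_k))
    w_k = rng.normal(scale=0.1, size=(context_view.shape[1], d_k))
    w_v = rng.normal(scale=0.1, size=(context_view.shape[1], d_k))
    q, k, v = query_view @ w_q, context_view @ w_k, context_view @ w_v
    weights = softmax(q @ k.T / np.sqrt(d_k), axis=-1)  # (len_q, len_ctx)
    return weights @ v, weights

def conv1d_relu(x, kernel):
    """Valid 1D convolution along the position axis, followed by ReLU.
    x: (length, in_ch); kernel: (width, in_ch, out_ch)."""
    width = kernel.shape[0]
    out = np.stack([
        np.tensordot(x[i:i + width], kernel, axes=([0, 1], [0, 1]))
        for i in range(x.shape[0] - width + 1)
    ])
    return np.maximum(out, 0.0)

def predict_efficiency(seq_view, mod_view, phys_view, rng):
    """Hypothetical forward pass: fuse views, apply two conv layers, pool."""
    # Global correlation: the sequence view queries the other two views.
    fused_mod, _ = cross_attention(seq_view, mod_view, d_k=8, rng=rng)
    fused_phys, _ = cross_attention(seq_view, phys_view, d_k=8, rng=rng)
    fused = np.concatenate([seq_view, fused_mod, fused_phys], axis=1)
    # Local correlation: two stacked 1D convolutions (the "two-layer CNN").
    h = conv1d_relu(fused, rng.normal(scale=0.1, size=(3, fused.shape[1], 16)))
    h = conv1d_relu(h, rng.normal(scale=0.1, size=(3, 16, 8)))
    pooled = h.max(axis=0)  # global max pooling over positions
    w_out = rng.normal(scale=0.1, size=8)
    # Sigmoid squashes to (0, 1); the real output type is an assumption here.
    return 1.0 / (1.0 + np.exp(-(pooled @ w_out)))

rng = np.random.default_rng(0)
length = 21                                          # assumed duplex length
seq = np.eye(4)[rng.integers(0, 4, size=length)]     # one-hot A/C/G/U
mod = np.eye(5)[rng.integers(0, 5, size=length)]     # one-hot modification type
phys = rng.normal(size=(length, 3))                  # placeholder descriptors
print(predict_efficiency(seq, mod, phys, rng))
```

With untrained weights the printed score is meaningless; the sketch only shows how per-position views could be combined by attention before convolution. A trained model would learn the projection and kernel weights from labelled cm-siRNA efficiency data.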