Protecting your skin: a highly accurate LSTM network integrating conjoint features for predicting chemical-induced skin irritation.

Authors

Duy Huynh Anh, Srisongkram Tarapong

Affiliations

Graduate School in the Program of Research and Development in Pharmaceuticals, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen, 40002, Thailand.

Division of Pharmaceutical Chemistry, Faculty of Pharmaceutical Sciences, Khon Kaen University, Khon Kaen, 40002, Thailand.

Publication information

J Cheminform. 2025 Mar 27;17(1):39. doi: 10.1186/s13321-025-00980-y.

Abstract

Skin irritation is a significant adverse effect associated with chemicals and drug substances. Quantitative structure-activity relationship (QSAR) modeling is an alternative to in vivo assays for filling data gaps in chemical risk assessment. In this study, we developed QSAR models based on recurrent neural networks (RNNs) to classify skin irritation caused by chemical compounds. For model construction, we used chemical language notation, molecular substructures, molecular descriptors, and a combination of these features named conjoint fingerprints. The QSAR models were built with simple RNN, long short-term memory (LSTM), bidirectional long short-term memory (BiLSTM), gated recurrent unit (GRU), and bidirectional gated recurrent unit (BiGRU) architectures. The LSTM trained on a combination of molecular fingerprints and descriptors significantly outperformed the other models, with 80% accuracy, 60% Matthews correlation coefficient (MCC), and 85% area under the curve (AUC) on the external test set. We therefore selected this model for generalizability testing on test sets from outside our study, to check that it can be used with other data sets. Furthermore, an applicability domain was developed for the proposed model, so that trustworthy predictions can be made for a test compound. The model was developed following OECD guidelines for skin irritation assessment and QSAR model development, to ensure compliance with the required standards. The models and source code developed in this study are publicly available, facilitating chemical design and safety evaluation, particularly for assessing the skin irritation potential of chemicals.
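As a reading aid for the reported metrics, the sketch below shows how accuracy and MCC are computed from a binary confusion matrix using their standard definitions. It is not the authors' code: the function names and the toy labels are hypothetical, and the labels are not data from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = irritant, 0 = non-irritant)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient, in [-1, 1]; 0.0 if undefined."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy labels, invented for illustration only (not from the study).
y_true = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
print("accuracy:", accuracy(*counts))
print("MCC:", mcc(*counts))
```

MCC is usually quoted on a scale from -1 to 1, so a value reported as 60% corresponds to 0.60 on that scale. Unlike accuracy, it uses all four cells of the confusion matrix, which makes it more informative when the classes are imbalanced.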

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/eb6d/11951793/42c8f41c168f/13321_2025_980_Fig1_HTML.jpg
