A natural language processing approach based on embedding deep learning from heterogeneous compounds for quantitative structure-activity relationship modeling.

Affiliations

Laboratoire de Synthèse et Biocatalyse Organique, Département de Chimie, Faculté des Sciences, Université Badji Mokhtar Annaba, Annaba, Algeria.

Laboratoire Bioinformatique, Centre de Recherche en Biotechnologie (CRBt), Constantine, Algeria.

Publication information

Chem Biol Drug Des. 2020 Sep;96(3):961-972. doi: 10.1111/cbdd.13742.

DOI: 10.1111/cbdd.13742
PMID: 33058460

Abstract

Over the past decade, rapid development in biological and chemical technologies such as high-throughput screening and parallel synthesis has significantly increased the amount of data, which requires the creation and integration of new analytical methods, especially deep learning models. Recently, interest in using deep learning for computer-aided drug discovery has grown, owing to its exceptionally successful application in many fields. The present work proposes a natural language processing approach based on embedding deep neural networks. Our method aims to transform the Simplified Molecular Input Line Entry System (SMILES) format into word embedding vectors that represent the semantics of compounds. These vectors are fed into supervised machine learning algorithms, such as a convolutional long short-term memory neural network, a support vector machine, and a random forest, to build quantitative structure-activity relationship (QSAR) models on toxicity data sets. The results obtained on toxicity data for the ciliate Tetrahymena pyriformis (IGC50) and on acute rat toxicity data expressed as the median lethal dose of treated rats (LD50) show that our approach can eventually be used to predict the activities of chemical compounds efficiently. All material used in this study is available online through the GitHub portal (https://github.com/BoukeliaAbdelbasset/NLPDeepQSAR.git).
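
The SMILES-to-vector step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the linked repository). The regex tokenizer, the function names `tokenize`, `build_embedding_table`, and `embed`, the embedding dimension, and the mean pooling are all assumptions made for illustration. The embedding table here is filled with seeded random numbers as a stand-in; in a real pipeline it would be learned from a compound corpus, for example with a word2vec-style model.

```python
import random
import re

# Assumed tokenizer: bracket atoms, two-letter halogens, two-digit ring
# closures, single-letter atoms, ring digits, then bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|%\d{2}|[A-Za-z]|\d|[=#+\-()/\\.@:~*$]"
)


def tokenize(smiles):
    """Split a SMILES string into 'words' (atoms, bonds, ring marks)."""
    tokens = SMILES_TOKEN.findall(smiles)
    # Refuse strings containing characters the pattern does not cover.
    if "".join(tokens) != smiles:
        raise ValueError(f"unrecognized characters in SMILES: {smiles!r}")
    return tokens


def build_embedding_table(corpus, dim=8, seed=0):
    """Map each token in the corpus to a vector.

    Placeholder only: vectors are seeded random numbers, standing in
    for embeddings that would normally be learned from the corpus.
    """
    rng = random.Random(seed)
    vocab = sorted({tok for smi in corpus for tok in tokenize(smi)})
    return {tok: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for tok in vocab}


def embed(smiles, table):
    """Mean-pool token vectors into one fixed-length compound vector."""
    vectors = [table[tok] for tok in tokenize(smiles) if tok in table]
    if not vectors:
        raise ValueError("no known tokens in SMILES string")
    dim = len(vectors[0])
    return [sum(vec[i] for vec in vectors) / len(vectors) for i in range(dim)]


if __name__ == "__main__":
    corpus = ["CC(=O)Cl", "c1ccccc1O", "CCO"]  # toy compounds, not study data
    table = build_embedding_table(corpus)
    print(tokenize("CC(=O)Cl"))
    print(len(embed("c1ccccc1O", table)))
```

The resulting fixed-length vectors are the kind of input that a support vector machine or random forest regressor accepts directly. A sequence model such as a convolutional LSTM would instead consume the unpooled per-token vectors.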

