Exploring chemical space for lead identification by propagating on chemical similarity network.

Authors

Yi Jungseob, Lee Sangseon, Lim Sangsoo, Cho Changyun, Piao Yinhua, Yeo Marie, Kim Dongkyu, Kim Sun, Lee Sunho

Affiliations

Interdisciplinary Program in Artificial Intelligence, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea.

Institute of Computer Technology, Seoul National University, Gwanak-ro 1, Gwanak-gu, Seoul, 08826, South Korea.

Publication information

Comput Struct Biotechnol J. 2023 Aug 25;21:4187-4195. doi: 10.1016/j.csbj.2023.08.016. eCollection 2023.

DOI:10.1016/j.csbj.2023.08.016
PMID:37680266
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10480321/
Abstract

MOTIVATION

Lead identification is a fundamental step in prioritizing candidate compounds for the downstream drug discovery process. Machine learning (ML) and deep learning (DL) approaches are widely used to identify lead compounds using both chemical property and experimental information. However, ML and DL methods rarely consider compound similarity information directly, since these models are built on abstract representations of molecules. Alternatively, data mining approaches are used to explore the chemical space of drug candidates by screening out undesirable compounds. A major challenge for such approaches is to search a large chemical space efficiently for desirable lead compounds while keeping the false positive rate low.

RESULTS

In this work, we developed a network propagation (NP) based data mining method for lead identification that searches an ensemble of chemical similarity networks. We compiled 14 fingerprint-based similarity networks. Given a target protein of interest, we use a deep learning-based drug-target interaction model to narrow down compound candidates, and then use network propagation to prioritize drug candidates that are highly correlated with a drug activity score such as IC50. In an extensive experiment with BindingDB, we showed that our approach successfully discovered intentionally unlabeled compounds for given targets. To further demonstrate the predictive power of our approach, we identified 24 candidate leads for CLK1. Two out of five synthesizable candidates were experimentally validated in binding assays. In conclusion, our framework can be useful for lead identification from very large compound databases such as ZINC.
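
The propagation step described in the abstract can be illustrated with a minimal sketch. It assumes random walk with restart as the propagation rule, Tanimoto similarity on toy fingerprints given as sets of on-bits, and a plain average over networks as the ensemble; the abstract does not specify any of these, so they may differ from the authors' formulation. All function names, thresholds, and the toy data are hypothetical, and the deep learning-based drug-target interaction filter is not modeled.

```python
from typing import Dict, List, Set

def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0

def similarity_network(fps: List[Set[int]], threshold: float) -> List[List[float]]:
    """Weighted adjacency matrix; pairs below the threshold get no edge."""
    n = len(fps)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            s = tanimoto(fps[i], fps[j])
            if s >= threshold:
                w[i][j] = w[j][i] = s
    return w

def propagate(w: List[List[float]], seeds: Dict[int, float],
              restart: float = 0.3, tol: float = 1e-9,
              max_iter: int = 1000) -> List[float]:
    """Random walk with restart: p <- (1 - r) * W_norm @ p + r * p0 (assumed rule)."""
    n = len(w)
    col_sums = [sum(w[i][j] for i in range(n)) for j in range(n)]
    total = sum(seeds.values())
    p0 = [seeds.get(i, 0.0) / total for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = []
        for i in range(n):
            # Mass flowing into node i from its neighbours (column-normalized).
            flow = sum(w[i][j] / col_sums[j] * p[j]
                       for j in range(n) if col_sums[j] > 0)
            nxt.append((1 - restart) * flow + restart * p0[i])
        if sum(abs(x - y) for x, y in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p

def ensemble_scores(networks: List[List[List[float]]],
                    seeds: Dict[int, float]) -> List[float]:
    """Average the propagation scores over several similarity networks."""
    runs = [propagate(w, seeds) for w in networks]
    return [sum(r[i] for r in runs) / len(runs) for i in range(len(runs[0]))]

# Toy data: five compounds under two hypothetical fingerprint types.
fps_a = [{1, 2, 3, 4}, {1, 2, 3, 5}, {2, 3, 5, 6}, {7, 8, 9}, {8, 9, 10}]
fps_b = [{1, 2}, {1, 2, 3}, {3, 4}, {5, 6}, {6, 7}]
networks = [similarity_network(f, threshold=0.3) for f in (fps_a, fps_b)]

seeds = {0: 1.0}  # compound 0 is the only known active for the target
scores = ensemble_scores(networks, seeds)
# Rank the unlabeled compounds by propagated score, highest first.
ranked = sorted((i for i in range(len(scores)) if i not in seeds),
                key=lambda i: -scores[i])
print(ranked)
```

In this toy setup, compounds 1 and 2 are connected to the seed in at least one network and therefore rank ahead of compounds 3 and 4, which form a separate component and receive no score. Averaging over networks is only one way to combine fingerprint views; rank aggregation or weighted sums would fit the same skeleton.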

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/b359816e6181/gr001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/79446ea08d12/gr002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/d9f5d954d898/gr003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/e0bf0ff13fdc/gr004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/7079ef38390f/gr005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/3ade40d73e90/gr006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2dd1/10480321/1b6ad9edc760/gr007.jpg
