LINGO-DL: a text-based approach for molecular similarity searching.

Affiliation

Universite de Lille, Villeneuve d'Ascq cedex, France.

Publication information

J Comput Aided Mol Des. 2021 May;35(5):657-665. doi: 10.1007/s10822-021-00383-9. Epub 2021 Apr 2.

DOI: 10.1007/s10822-021-00383-9
PMID: 33797669
Abstract

Line notations of chemical structures are more compact than graphs and connection tables, which makes them useful for storing and transferring large numbers of molecular structures. The simplified molecular-input line-entry system (SMILES) is the most widely used line notation: it is easier to use and understand than the alternatives, and it can be generated automatically from connection tables. A SMILES string encodes the structure of a molecule. An existing method, LINGO, uses SMILES to calculate molecular similarities and to predict structure-related properties. LINGO decomposes a canonical SMILES into a set of four-character substrings, referred to as LINGOs, and measures the similarity between a pair of molecules by comparing the LINGOs that occur in each. This paper introduces LINGO-DL, an alternative version of the method that uses LINGOs of different lengths: it fragments canonical SMILES into substrings of three different lengths rather than the single length used by LINGO. Retrospective virtual screening experiments on the MDDR, DUD, and MUV datasets show that LINGO-DL outperforms LINGO, especially when the active molecules being sought are structurally heterogeneous.
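The fragmentation and comparison steps described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation. The abstract does not say which three lengths LINGO-DL uses, how the lengths are combined, or which similarity coefficient is applied. The sketch therefore assumes overlapping substrings, hypothetical lengths of 3, 4, and 5 pooled into one profile, and a multiset Tanimoto coefficient. It also assumes the inputs are already canonical SMILES, since no canonicalisation is performed. All function names are hypothetical.

```python
from collections import Counter


def lingos(smiles: str, q: int = 4) -> Counter:
    """Count every overlapping q-character substring of a SMILES string."""
    # A string shorter than q yields an empty range, hence an empty profile.
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def tanimoto(a: Counter, b: Counter) -> float:
    """Multiset Tanimoto (an assumed coefficient): shared / combined counts."""
    shared = sum((a & b).values())    # per-substring minimum counts
    combined = sum((a | b).values())  # per-substring maximum counts
    return shared / combined if combined else 0.0


def lingo_similarity(smiles_a: str, smiles_b: str, q: int = 4) -> float:
    """Single-length comparison, with q=4 as stated in the abstract."""
    return tanimoto(lingos(smiles_a, q), lingos(smiles_b, q))


def multi_length_similarity(smiles_a: str, smiles_b: str,
                            lengths=(3, 4, 5)) -> float:
    """Multi-length variant; the lengths and the pooling are assumptions."""
    profile_a, profile_b = Counter(), Counter()
    for q in lengths:
        profile_a.update(lingos(smiles_a, q))
        profile_b.update(lingos(smiles_b, q))
    return tanimoto(profile_a, profile_b)


if __name__ == "__main__":
    # Two five-character SMILES strings that differ in the last atom.
    print(lingo_similarity("CCCCO", "CCCCN"))
    print(multi_length_similarity("CCCCO", "CCCCN"))
```

With four-character substrings, `CCCCO` and `CCCCN` share only `CCCC`, so the single-length score depends on one common fragment. Adding shorter substrings gives the two profiles more shared fragments to match on, which is one plausible intuition for why varying the length might help with structurally diverse actives.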
