Visualization of molecular fingerprints.

Affiliation

Nonlinearity and Complexity Research Group, Aston University, Aston Triangle, Birmingham B4 7ET, United Kingdom.

Publication information

J Chem Inf Model. 2011 Jul 25;51(7):1552-63. doi: 10.1021/ci1004042. Epub 2011 Jul 8.

DOI:10.1021/ci1004042
PMID:21696145
Abstract

A visualization plot of a data set of molecular data is a useful tool for gaining insight into a set of molecules. In chemoinformatics, most visualization plots are of molecular descriptors, and the statistical model most often used to produce a visualization is principal component analysis (PCA). This paper takes PCA, together with four other statistical models (NeuroScale, GTM, LTM, and LTM-LIN), and evaluates their ability to produce clustering in visualizations not of molecular descriptors but of molecular fingerprints. Two different tasks are addressed: understanding structural information (particularly combinatorial libraries) and relating structure to activity. The quality of the visualizations is compared both subjectively (by visual inspection) and objectively (with global distance comparisons and local k-nearest-neighbor predictors). On the data sets used to evaluate clustering by structure, LTM is found to perform significantly better than the other models. In particular, the clusters in LTM visualization space are consistent with the relationships between the core scaffolds that define the combinatorial sublibraries. On the data sets used to evaluate clustering by activity, LTM again gives the best performance but by a smaller margin. The results of this paper demonstrate the value of using both a nonlinear projection map and a Bernoulli noise model for modeling binary data.
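
The objective evaluation described in the abstract (a local k-nearest-neighbor predictor applied in the visualization space) can be illustrated with a minimal sketch. This is not the authors' code or data: the fingerprints are synthetic, the three classes stand in for hypothetical sublibraries, the function names are invented for illustration, and only the PCA baseline is shown (NeuroScale, GTM, LTM, and LTM-LIN are not implemented here).

```python
import numpy as np

def make_fingerprints(n_per_class=40, n_bits=64, seed=0):
    """Synthetic binary fingerprints (an assumption, not real molecules):
    each class draws its bits from its own per-bit 'on' probabilities."""
    rng = np.random.default_rng(seed)
    X, y = [], []
    for label in range(3):
        probs = rng.uniform(0.05, 0.95, size=n_bits)
        X.append((rng.random((n_per_class, n_bits)) < probs).astype(float))
        y.extend([label] * n_per_class)
    return np.vstack(X), np.array(y)

def pca_project(X, n_components=2):
    """Project mean-centered data onto its leading principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T

def knn_loo_accuracy(Z, y, k=5):
    """Leave-one-out accuracy of a k-nearest-neighbor majority vote
    computed from Euclidean distances in the projected space Z."""
    d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # a point may not vote for itself
    neighbors = np.argsort(d, axis=1)[:, :k]
    pred = np.array([np.bincount(v).argmax() for v in y[neighbors]])
    return float((pred == y).mean())

X, y = make_fingerprints()
Z = pca_project(X)                       # 2-D visualization coordinates
acc = knn_loo_accuracy(Z, y)
print(f"projected shape: {Z.shape}, k-NN leave-one-out accuracy: {acc:.2f}")
```

A higher accuracy means that points lying close together in the 2-D plot tend to share a label, which is the sense in which a visualization "clusters" by structure or activity. Comparing models would amount to swapping `pca_project` for another projection and recomputing the same score.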

Similar articles

1
Visualization of molecular fingerprints.
J Chem Inf Model. 2011 Jul 25;51(7):1552-63. doi: 10.1021/ci1004042. Epub 2011 Jul 8.
2
Nonlinear dimensionality reduction and mapping of compound libraries for drug discovery.
J Mol Graph Model. 2012 Apr;34:108-17. doi: 10.1016/j.jmgm.2011.12.006. Epub 2012 Jan 2.
3
Comparison of combinatorial clustering methods on pharmacological data sets represented by machine learning-selected real molecular descriptors.
J Chem Inf Model. 2011 Dec 27;51(12):3036-49. doi: 10.1021/ci2000083. Epub 2011 Dec 9.
4
Data visualization during the early stages of drug discovery.
J Chem Inf Model. 2006 Jul-Aug;46(4):1806-18. doi: 10.1021/ci050471a.
5
A comprehensive support vector machine binary hERG classification model based on extensive but biased end point hERG data sets.
Chem Res Toxicol. 2011 Jun 20;24(6):934-49. doi: 10.1021/tx200099j. Epub 2011 May 6.
6
Combinatorial QSAR modeling of specificity and subtype selectivity of ligands binding to serotonin receptors 5HT1E and 5HT1F.
J Chem Inf Model. 2008 May;48(5):997-1013. doi: 10.1021/ci700404c. Epub 2008 May 10.
7
Visualization of high-dimensional combinatorial catalysis data.
J Comb Chem. 2009 May-Jun;11(3):385-92. doi: 10.1021/cc800194j.
8
Generative topographic mapping applied to clustering and visualization of motor unit action potentials.
Biosystems. 2005 Dec;82(3):273-84. doi: 10.1016/j.biosystems.2005.09.004. Epub 2005 Oct 19.
9
Supervised self-organizing maps in drug discovery. 1. Robust behavior with overdetermined data sets.
J Chem Inf Model. 2005 Nov-Dec;45(6):1749-58. doi: 10.1021/ci0500839.
10
Statistical analysis and compound selection of combinatorial libraries for soluble epoxide hydrolase.
J Chem Inf Model. 2011 Jul 25;51(7):1582-92. doi: 10.1021/ci200123y. Epub 2011 Jun 20.

Cited by

1
Discovery of Active Ingredient of Yinchenhao Decoction Targeting TLR4 for Hepatic Inflammatory Diseases Based on Deep Learning Approach.
Interdiscip Sci. 2025 Jun;17(2):293-305. doi: 10.1007/s12539-024-00670-7. Epub 2024 Nov 19.
2
Scaffold and Structural Diversity of the Secondary Metabolite Space of Medicinal Fungi.
ACS Omega. 2023 Jan 10;8(3):3102-3113. doi: 10.1021/acsomega.2c06428. eCollection 2023 Jan 24.
3
Discovery of novel chemical reactions by deep generative recurrent neural network.
Sci Rep. 2021 Feb 4;11(1):3178. doi: 10.1038/s41598-021-81889-y.
4
Chemistry in Times of Artificial Intelligence.
Chemphyschem. 2020 Oct 16;21(20):2233-2242. doi: 10.1002/cphc.202000518. Epub 2020 Sep 28.
5
VAE-Sim: A Novel Molecular Similarity Measure Based on a Variational Autoencoder.
Molecules. 2020 Jul 29;25(15):3446. doi: 10.3390/molecules25153446.
6
A Novel Discovery: Holistic Efficacy at the Special Organ Level of Pungent Flavored Compounds from Pungent Traditional Chinese Medicine.
Int J Mol Sci. 2019 Feb 11;20(3):752. doi: 10.3390/ijms20030752.
7
Distributed Representation of Chemical Fragments.
ACS Omega. 2018 Mar 31;3(3):2825-2836. doi: 10.1021/acsomega.7b02045. Epub 2018 Mar 8.
8
Machine learning in chemoinformatics and drug discovery.
Drug Discov Today. 2018 Aug;23(8):1538-1546. doi: 10.1016/j.drudis.2018.05.010. Epub 2018 May 8.
9
Cheminformatic characterization of natural products from Panama.
Mol Divers. 2017 Nov;21(4):779-789. doi: 10.1007/s11030-017-9781-4. Epub 2017 Aug 22.
10
Predictive cartography of metal binders using generative topographic mapping.
J Comput Aided Mol Des. 2017 Aug;31(8):701-714. doi: 10.1007/s10822-017-0033-6. Epub 2017 Jul 7.