A novel descriptor based on atom-pair properties.

Author information

Kuroda Masataka

Affiliation

Discovery Technology Laboratories, Innovative Research Division, Mitsubishi Tanabe Pharma Corporation, 1000 Kamoshida, Aoba-ku, Yokohama, 227-0033 Japan.

Publication information

J Cheminform. 2017 Jan 5;9:1. doi: 10.1186/s13321-016-0187-6. eCollection 2017.

DOI:10.1186/s13321-016-0187-6
PMID:28316652
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5270600/
Abstract

BACKGROUND

Molecular descriptors have been widely used to predict biological activities and physicochemical properties or to analyze chemical libraries on the basis of similarity. Although fingerprints and properties are generally used as descriptors, neither is perfect for these purposes. In certain cases a fingerprint can distinguish between molecules that a property cannot, and vice versa. When the training set is especially small, constructing good predictive models is difficult. Herein, a novel descriptor that integrates the mutually compensating characteristics of fingerprints and properties is described. The format of this descriptor is unconventional: each molecule is represented in two dimensions, one of which has variable length. Because machine learning methods cannot accept this format directly, a distance between molecules was newly defined so that such techniques could be applied. The descriptor was evaluated on classification tasks using a support vector machine after its features had been optimized by a genetic algorithm.
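The idea of replacing a fixed-length vector with a distance between variable-length descriptors can be illustrated with a minimal sketch. The abstract does not give the paper's distance definition, so the symmetric mean-of-minimum distance below is an assumed stand-in, and the toy descriptors, `molecule_distance`, `kernel_matrix`, and `gamma` are all hypothetical names and values chosen for illustration.

```python
import math

# Hypothetical toy descriptors: each molecule is a variable-length list of rows,
# and every row holds the same fixed number of property values (here 2).
mol_a = [[0.1, 1.0], [0.4, 2.0], [0.9, 3.0]]
mol_b = [[0.1, 1.0], [0.5, 2.0]]
mol_c = [[5.0, 9.0]]


def row_distance(r, s):
    """Euclidean distance between two rows of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(r, s)))


def directed(p, q):
    """Average, over the rows of p, of the distance to the closest row of q."""
    return sum(min(row_distance(r, s) for s in q) for r in p) / len(p)


def molecule_distance(p, q):
    """Symmetric mean-of-minimum distance (assumed stand-in, not the paper's definition)."""
    return 0.5 * (directed(p, q) + directed(q, p))


def kernel_matrix(mols, gamma=1.0):
    """Turn pairwise distances into similarities via exp(-gamma * d**2)."""
    return [[math.exp(-gamma * molecule_distance(p, q) ** 2) for q in mols]
            for p in mols]


K = kernel_matrix([mol_a, mol_b, mol_c])
```

A square similarity matrix of this kind could in principle be handed to a support vector machine that accepts precomputed kernels, for example scikit-learn's `SVC(kernel="precomputed")`. A kernel derived from an arbitrary distance is not guaranteed to be positive semidefinite, so this remains a sketch rather than a validated method.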

RESULTS

Because feature optimization is time-intensive owing to the complicated calculation of distances between molecules, the optimization had to be stopped before it was completed. As a result, the new descriptor showed no remarkable improvement in classification results over the other descriptors on any evaluation set used in this work. However, it also did not yield extremely low accuracy on any set.
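Stopping a genetic algorithm before convergence, as described here, can be sketched as a feature-mask search with a wall-clock budget. The abstract does not describe the encoding or operators used, so the binary mask, one-point crossover, elitism, and every name and parameter below (`evolve`, `time_budget_s`, `toy_fitness`, `target`) are assumptions for illustration. In a real setting the fitness would be something like cross-validated classifier accuracy, which is where the expensive distance calculations would occur.

```python
import random
import time


def evolve(fitness, n_features, pop_size=20, max_generations=50,
           time_budget_s=1.0, mutation_rate=0.05, seed=0):
    """Search for a boolean feature mask; stop early when the time budget runs out."""
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)] for _ in range(pop_size)]
    start = time.monotonic()
    best = max(pop, key=fitness)
    history = [fitness(best)]
    for _ in range(max_generations):
        if time.monotonic() - start > time_budget_s:
            break  # budget exhausted: return the best mask found so far
        scored = sorted(pop, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]
        children = [scored[0][:]]  # elitism: the current best always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = children
        best = max(pop, key=fitness)
        history.append(fitness(best))
    return best, history


# Hypothetical toy objective: number of positions agreeing with a fixed target mask.
target = [True, False, True, True, False, False, True, False]


def toy_fitness(mask):
    return sum(m == t for m, t in zip(mask, target))


best, history = evolve(toy_fitness, n_features=len(target))
```

Because the best individual is carried over unchanged each generation, the recorded best fitness never decreases, so an interrupted run still returns a usable, if suboptimal, mask.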

CONCLUSIONS

The novel descriptor proposed in this work can potentially be used to make highly accurate predictive models. This new concept in descriptors is expected to be useful for developing novel predictive methods with quick training and high accuracy.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a854/5270600/7ebf2393f85d/13321_2016_187_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a854/5270600/e2cf6ac10c5f/13321_2016_187_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a854/5270600/f33c5d079c22/13321_2016_187_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a854/5270600/3a046adcb4de/13321_2016_187_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a854/5270600/f9a9418cae5c/13321_2016_187_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a854/5270600/2f8c4a7103cb/13321_2016_187_Fig5_HTML.jpg
