Distinguishing between natural products and synthetic molecules by descriptor Shannon entropy analysis and binary QSAR calculations.

Authors

Stahura F L, Godden J W, Xue L, Bajorath J

Affiliation

Computer-Aided Drug Discovery, New Chemical Entities, Inc., Bothell, Washington 98011, USA.

Publication information

J Chem Inf Comput Sci. 2000 Sep-Oct;40(5):1245-52. doi: 10.1021/ci0003303.

DOI:10.1021/ci0003303
PMID:11045820
Abstract

Molecular descriptors were identified by Shannon entropy analysis that correctly distinguished, in binary QSAR calculations, between naturally occurring molecules and synthetic compounds. The Shannon entropy concept was first used in digital communication theory and has only very recently been applied to descriptor analysis. Binary QSAR methodology was originally developed to correlate structural features and properties of compounds with a binary formulation of biological activity (i.e., active or inactive) and has here been adapted to correlate molecular features with chemical source (i.e., natural or synthetic). We have identified a number of molecular descriptors with significantly different Shannon entropy and/or "entropic separation" in natural and synthetic compound databases. Different combinations of such descriptors and variably distributed structural keys were applied to learning sets consisting of natural and synthetic molecules and used to derive predictive binary QSAR models. These models were then applied to predict the source of compounds in different test sets consisting of randomly collected natural and synthetic molecules, or, alternatively, sets of natural and synthetic molecules with specific biological activities. On average, greater than 80% prediction accuracy was achieved with our best models. For the test case consisting of molecules with specific activities, greater than 90% accuracy was achieved. From our analysis, some chemical features were identified that systematically differ in many naturally occurring versus synthetic molecules.
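As a rough illustration of the descriptor analysis described in the abstract, the sketch below computes the Shannon entropy of one descriptor's value distribution in two compound sets. The abstract does not give the binning scheme, so the fixed-width histogram, the bin count, the function name `shannon_entropy`, and the toy values are all assumptions made for illustration; they are not the authors' implementation or data.

```python
import math
from collections import Counter


def shannon_entropy(values, lo, hi, bins=10):
    """Shannon entropy (in bits) of a descriptor's histogram over [lo, hi].

    Assumption: values are binned into `bins` equal-width intervals, and
    values outside [lo, hi] are clamped into the first or last bin.
    """
    if hi <= lo or bins < 1 or not values:
        raise ValueError("need hi > lo, bins >= 1, and at least one value")
    width = (hi - lo) / bins
    counts = Counter(
        max(0, min(int((v - lo) / width), bins - 1)) for v in values
    )
    n = len(values)
    # H = -sum(p_i * log2(p_i)) over the occupied bins only
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


# Toy descriptor values (made up): one set spreads widely, the other is narrow.
natural = [0, 1, 2, 3, 4, 5, 6, 7]
synthetic = [2, 2, 3, 3, 3, 2, 3, 2]

# Use the same range and bin count for both sets so the entropies are comparable.
h_natural = shannon_entropy(natural, lo=0, hi=8, bins=8)
h_synthetic = shannon_entropy(synthetic, lo=0, hi=8, bins=8)
print(f"natural: {h_natural:.3f} bits, synthetic: {h_synthetic:.3f} bits")
```

A descriptor whose entropy differs markedly between the two sets is variably distributed in one and narrowly distributed in the other, which is the kind of difference the abstract says was used to choose descriptors for the binary QSAR models.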

Similar articles

1
Distinguishing between natural products and synthetic molecules by descriptor Shannon entropy analysis and binary QSAR calculations.
J Chem Inf Comput Sci. 2000 Sep-Oct;40(5):1245-52. doi: 10.1021/ci0003303.
2
Differential Shannon entropy analysis identifies molecular property descriptors that predict aqueous solubility of synthetic compounds with high accuracy in binary QSAR calculations.
J Chem Inf Comput Sci. 2002 May-Jun;42(3):550-8. doi: 10.1021/ci010243q.
3
Differential Shannon Entropy as a sensitive measure of differences in database variability of molecular descriptors.
J Chem Inf Comput Sci. 2001 Jul-Aug;41(4):1060-6. doi: 10.1021/ci0102867.
4
Evaluation of descriptors and mini-fingerprints for the identification of molecules with similar activity.
J Chem Inf Comput Sci. 2000 Sep-Oct;40(5):1227-34. doi: 10.1021/ci000327j.
5
Rank order entropy: why one metric is not enough.
J Chem Inf Model. 2011 Sep 26;51(9):2302-19. doi: 10.1021/ci200170k. Epub 2011 Aug 29.
6
Predictive QSAR modeling workflow, model applicability domains, and virtual screening.
Curr Pharm Des. 2007;13(34):3494-504. doi: 10.2174/138161207782794257.
7
Combinatorial QSAR of ambergris fragrance compounds.
J Chem Inf Comput Sci. 2004 Mar-Apr;44(2):582-95. doi: 10.1021/ci034203t.
8
Chemical descriptors with distinct levels of information content and varying sensitivity to differences between selected compound databases identified by SE-DSE analysis.
J Chem Inf Comput Sci. 2002 Jan-Feb;42(1):87-93. doi: 10.1021/ci0103065.
9
Using kernel alignment to select features of molecular descriptors in a QSAR study.
IEEE/ACM Trans Comput Biol Bioinform. 2011 Sep-Oct;8(5):1373-84. doi: 10.1109/TCBB.2011.31.
10
Variability of molecular descriptors in compound databases revealed by Shannon entropy calculations.
J Chem Inf Comput Sci. 2000 May;40(3):796-800. doi: 10.1021/ci000321u.

Cited by

1
Natural product drug discovery in the artificial intelligence era.
Chem Sci. 2021 Dec 13;13(6):1526-1546. doi: 10.1039/d1sc04471k. eCollection 2022 Feb 9.
2
Chemoinformatics View on Bitter Taste Receptor Agonists in Food.
J Agric Food Chem. 2021 Nov 24;69(46):13916-13924. doi: 10.1021/acs.jafc.1c05057. Epub 2021 Nov 11.
3
NPASS: natural product activity and species source database for natural product research, discovery and tool development.
Nucleic Acids Res. 2018 Jan 4;46(D1):D1217-D1222. doi: 10.1093/nar/gkx1026.
4
Discovery and resupply of pharmacologically active plant-derived natural products: A review.
Biotechnol Adv. 2015 Dec;33(8):1582-1614. doi: 10.1016/j.biotechadv.2015.08.001. Epub 2015 Aug 15.
5
Prediction of multi-target networks of neuroprotective compounds with entropy indices and synthesis, assay, and theoretical study of new asymmetric 1,2-rasagiline carbamates.
Int J Mol Sci. 2014 Sep 24;15(9):17035-64. doi: 10.3390/ijms150917035.
6
Information properties of naturally-occurring proteins: Fourier analysis and complexity phase plots.
Protein J. 2012 Oct;31(7):550-63. doi: 10.1007/s10930-012-9432-7.
7
Physiochemical property space distribution among human metabolites, drugs and toxins.
BMC Bioinformatics. 2009 Dec 3;10 Suppl 15(Suppl 15):S10. doi: 10.1186/1471-2105-10-S15-S10.
8
On the information expressed in enzyme primary structure: lessons from Ribonuclease A.
Mol Divers. 2010 Nov;14(4):673-86. doi: 10.1007/s11030-009-9211-3. Epub 2009 Nov 17.
9
A natural history of botanical therapeutics.
Metabolism. 2008 Jul;57(7 Suppl 1):S3-9. doi: 10.1016/j.metabol.2008.03.001.
10
Virtual screening for the discovery of bioactive natural products.
Prog Drug Res. 2008;65:211, 213-49. doi: 10.1007/978-3-7643-8117-2_6.