Fast rule-based bioactivity prediction using associative classification mining.

Affiliation

Indiana University School of Informatics and Computing, Bloomington, IN, 47408, USA.

Publication information

J Cheminform. 2012 Nov 23;4(1):29. doi: 10.1186/1758-2946-4-29.

DOI:10.1186/1758-2946-4-29

PMID:23176548

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3515428/

Abstract

Relating chemical features to bioactivities is critical in molecular design and is used extensively in the lead discovery and optimization process. A variety of techniques from statistics, data mining and machine learning have been applied to this process. In this study, we utilize a collection of methods, called associative classification mining (ACM), which are popular in the data mining community, but so far have not been applied widely in cheminformatics. More specifically, classification based on predictive association rules (CPAR), classification based on multiple association rules (CMAR) and classification based on association rules (CBA) are employed on three datasets using various descriptor sets. Experimental evaluations on anti-tuberculosis (antiTB), mutagenicity and hERG (the human Ether-a-go-go-Related Gene) blocker datasets show that these three methods are computationally scalable and appropriate for high speed mining. Additionally, they provide comparable accuracy and efficiency to the commonly used Bayesian and support vector machines (SVM) methods, and produce highly interpretable models.
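To make the idea of associative classification concrete, the following is a minimal sketch in Python that loosely follows the CBA idea: mine class association rules of the form "feature set -> class" that pass support and confidence thresholds, rank them, and classify a new compound with the first rule that matches. It is an illustration only, not the implementation evaluated in the study. The function names (`mine_rules`, `predict`), the thresholds, and the toy substructure labels are all assumptions made for this example, and the brute-force enumeration stands in for the more efficient rule generation and pruning used by real CBA, CMAR and CPAR implementations.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.3, min_confidence=0.7, max_len=2):
    """Brute-force class association rules (itemset -> label).

    Hypothetical helper for illustration; real ACM tools avoid
    enumerating every itemset.
    """
    n = len(rows)
    itemset_counts = Counter()  # rows containing the itemset
    rule_counts = Counter()     # rows containing the itemset with a given label
    for features, label in zip(rows, labels):
        items = sorted(features)
        for k in range(1, max_len + 1):
            for combo in combinations(items, k):
                itemset_counts[combo] += 1
                rule_counts[(combo, label)] += 1

    rules = []
    for (combo, label), hits in rule_counts.items():
        support = hits / n
        confidence = hits / itemset_counts[combo]
        if support >= min_support and confidence >= min_confidence:
            rules.append((combo, label, support, confidence))

    # Rank rules: higher confidence, then higher support, then shorter rule.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def predict(rules, features, default):
    """Return the label of the first (highest-ranked) rule that matches."""
    for combo, label, _support, _confidence in rules:
        if set(combo) <= features:
            return label
    return default


# Toy data: invented substructure flags and activity labels (assumptions).
rows = [
    {"nitro", "aromatic_ring"},
    {"nitro", "amine"},
    {"nitro", "aromatic_ring", "halogen"},
    {"aromatic_ring", "hydroxyl"},
    {"hydroxyl", "amine"},
    {"aromatic_ring", "halogen"},
]
labels = ["active", "active", "active", "inactive", "inactive", "inactive"]

rules = mine_rules(rows, labels)
for combo, label, support, confidence in rules:
    print(f"{' & '.join(combo)} -> {label} "
          f"(support={support:.2f}, confidence={confidence:.2f})")
```

Each mined rule reads directly as "compounds containing these features tend to be in this class", which is the kind of interpretability the abstract refers to. A new compound can be classified with, for example, `predict(rules, {"nitro", "halogen"}, default="inactive")`.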
