On the Unreported-Profile-is-Negative Assumption for Predictive Cheminformatics

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2020 Jul-Aug;17(4):1352-1363. doi: 10.1109/TCBB.2019.2913855. Epub 2019 Apr 30.

DOI:10.1109/TCBB.2019.2913855

Abstract

In cheminformatics, compound-target binding profiles have been a main source of data for research. For data repositories that provide only positive profiles, a popular assumption is that all unreported profiles are negative. In this paper, we caution the audience not to take this assumption for granted, and we present empirical evidence of its ineffectiveness from a machine learning perspective. Our examination is based on a setting where binding profiles are used as features to train predictive models. We show that (1) prediction performance degrades when the assumption fails and (2) explicit recovery of unreported profiles improves prediction performance. In particular, we propose a framework that jointly recovers profiles and learns a predictive model, and we show that it achieves further performance improvement. The presented study not only suggests applying matrix recovery methods to recover unreported profiles, but also initiates a new missing-feature problem, which we call Learning with Positive and Unknown Features.
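To make the idea of recovering unreported profiles concrete, here is a minimal sketch on invented toy data. It is not the authors' framework: truncated SVD is swapped in as a simple stand-in for a matrix recovery method, and the matrix, the hidden entries, and the helper `low_rank_scores` are all assumptions made for illustration.

```python
import numpy as np

# Hypothetical toy data: 6 compounds x 6 targets forming two binding families.
truth = np.zeros((6, 6))
truth[:3, :3] = 1.0
truth[3:, 3:] = 1.0

# Simulate a positive-only repository: two true positives go unreported.
hidden = [(0, 0), (4, 5)]
reported = truth.copy()
for i, j in hidden:
    reported[i, j] = 0.0  # the "unreported is negative" assumption stores these as 0


def low_rank_scores(matrix, rank):
    """Rank-`rank` approximation via truncated SVD (a stand-in for matrix recovery)."""
    u, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank, :]


scores = low_rank_scores(reported, rank=2)

# Compare recovered scores at the unreported positives with those at true negatives.
hidden_scores = [scores[i, j] for i, j in hidden]
negative_scores = scores[truth == 0]
print("scores at unreported positives:", np.round(hidden_scores, 3))
print("largest score at a true negative:", np.round(negative_scores.max(), 3))
```

In this toy setting the low-rank structure assigns the unreported positives clearly higher scores than the true negatives, whereas the raw repository matrix stores both as 0. Real binding data is noisier, and the rank would need to be chosen by validation.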


Similar articles

On the Unreported-Profile-is-Negative Assumption for Predictive Cheminformatics.

IEEE/ACM Trans Comput Biol Bioinform. 2020 Jul-Aug;17(4):1352-1363. doi: 10.1109/TCBB.2019.2913855. Epub 2019 Apr 30.

Cheminformatics in Drug Discovery, an Industrial Perspective.

Mol Inform. 2018 Sep;37(9-10):e1800041. doi: 10.1002/minf.201800041. Epub 2018 May 18.

A Deep-Learning Approach toward Rational Molecular Docking Protocol Selection.

Molecules. 2020 May 27;25(11):2487. doi: 10.3390/molecules25112487.

PREFER: A New Predictive Modeling Framework for Molecular Discovery.

J Chem Inf Model. 2023 Aug 14;63(15):4497-4504. doi: 10.1021/acs.jcim.3c00523. Epub 2023 Jul 24.

HMMPred: Accurate Prediction of DNA-Binding Proteins Based on HMM Profiles and XGBoost Feature Selection.

Comput Math Methods Med. 2020 Mar 28;2020:1384749. doi: 10.1155/2020/1384749. eCollection 2020.

Effectively Identifying Compound-Protein Interactions by Learning from Positive and Unlabeled Examples.

IEEE/ACM Trans Comput Biol Bioinform. 2018 Nov-Dec;15(6):1832-1843. doi: 10.1109/TCBB.2016.2570211. Epub 2016 May 18.

Learning to Predict Drug Target Interaction From Missing Not at Random Labels.

IEEE Trans Nanobioscience. 2019 Jul;18(3):353-359. doi: 10.1109/TNB.2019.2909293. Epub 2019 Apr 9.

DeepChemStable: Chemical Stability Prediction with an Attention-Based Graph Convolution Network.

J Chem Inf Model. 2019 Mar 25;59(3):1044-1049. doi: 10.1021/acs.jcim.8b00672. Epub 2019 Feb 21.

Facilitating prediction of adverse drug reactions by using knowledge graphs and multi-label learning models.

Brief Bioinform. 2019 Jan 18;20(1):190-202. doi: 10.1093/bib/bbx099.

Computational prediction of drug-target interactions using chemogenomic approaches: an empirical survey.

Brief Bioinform. 2019 Jul 19;20(4):1337-1357. doi: 10.1093/bib/bby002.

Cited by

SeEn: Sequential enriched datasets for sequence-aware recommendations.

Sci Data. 2022 Aug 4;9(1):478. doi: 10.1038/s41597-022-01598-7.
