Hybrid semantic recommender system for chemical compounds in large-scale datasets.

Authors

Barros Marcia, Moitinho Andre, Couto Francisco M

Affiliations

LASIGE, Departamento de Informática, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisboa, Portugal.

CENTRA, Departamento de Física, Faculdade de Ciências, Universidade de Lisboa, 1749-016, Lisboa, Portugal.

Publication

J Cheminform. 2021 Feb 23;13(1):15. doi: 10.1186/s13321-021-00495-2.

Abstract

The large and growing number of chemical compounds makes such datasets challenging to explore. In this work, we propose using recommender systems to identify compounds of interest to scientific researchers. Our approach is a hybrid recommender model suited to implicit feedback datasets and focused on retrieving a list of items ranked by relevance. The model integrates collaborative-filtering algorithms for implicit feedback (Alternating Least Squares and Bayesian Personalized Ranking) with a new content-based algorithm that uses the semantic similarity between chemical compounds in the ChEBI ontology. The algorithms were assessed on CheRM-20, an implicit feedback dataset with more than 16,000 items (chemical compounds). The hybrid model improved the results of the collaborative-filtering algorithms by more than ten percentage points on most of the evaluation metrics assessed.
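The abstract does not say how the collaborative-filtering and content-based scores are combined, so the following is only a minimal sketch of one plausible hybrid, not the authors' method. It assumes a weighted sum of min-max normalised scores. Here `cf_scores` stands in for the output of any implicit-feedback model (such as ALS or BPR), `item_sim` stands in for a precomputed semantic similarity matrix between compounds, and the function names, the `weight` parameter, and the toy data are all hypothetical.

```python
import numpy as np

def content_scores(interactions, item_sim):
    """Score each item by its mean similarity to the items a user interacted with."""
    counts = interactions.sum(axis=1, keepdims=True)
    counts = np.where(counts == 0, 1, counts)  # avoid division by zero for empty users
    return interactions @ item_sim / counts

def min_max(scores):
    """Rescale each user's scores to [0, 1] so the two sources are comparable."""
    lo = scores.min(axis=1, keepdims=True)
    hi = scores.max(axis=1, keepdims=True)
    span = np.where(hi - lo == 0, 1.0, hi - lo)
    return (scores - lo) / span

def hybrid_rank(cf_scores, interactions, item_sim, weight=0.5, top_k=2):
    """Rank unseen items by a weighted sum of CF and content-based scores (assumed scheme)."""
    cf = min_max(cf_scores)
    cb = min_max(content_scores(interactions, item_sim))
    combined = weight * cf + (1 - weight) * cb
    # Items the user already interacted with are never recommended again.
    combined = np.where(interactions > 0, -np.inf, combined)
    return np.argsort(-combined, axis=1, kind="stable")[:, :top_k]

# Toy data: 2 users x 4 compounds (all values are made up for illustration).
interactions = np.array([[1, 0, 0, 0],
                         [0, 0, 1, 0]], dtype=float)
item_sim = np.array([[1.0, 0.9, 0.1, 0.2],
                     [0.9, 1.0, 0.2, 0.1],
                     [0.1, 0.2, 1.0, 0.8],
                     [0.2, 0.1, 0.8, 1.0]])
cf_scores = np.array([[0.0, 0.2, 0.9, 0.4],
                      [0.3, 0.1, 0.0, 0.5]])

ranking = hybrid_rank(cf_scores, interactions, item_sim)
print(ranking.tolist())
```

With these toy values, the first user's top recommendation is compound 1, which is semantically close to the compound they already interacted with, even though the collaborative-filtering score alone would favour compound 2. That is the kind of re-ranking a content-based component can contribute.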

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2e73/7903631/1182e603209d/13321_2021_495_Fig1_HTML.jpg
