DILI_C: An AI-Based Classifier to Search for Drug-Induced Liver Injury Literature.

Author information

Rathee Sanjay, MacMahon Meabh, Liu Anika, Katritsis Nicholas M, Youssef Gehad, Hwang Woochang, Wollman Lilly, Han Namshik

Affiliations

Milner Therapeutics Institute, University of Cambridge, Cambridge, United Kingdom.

LifeArc, Stevenage, United Kingdom.

Publication information

Front Genet. 2022 Jun 29;13:867946. doi: 10.3389/fgene.2022.867946. eCollection 2022.

Abstract

Drug-induced liver injury (DILI) is a class of adverse drug reactions (ADRs) that causes problems in both clinical and research settings. It is the most frequent cause of acute liver failure in most Western countries and a major cause of attrition of novel drug candidates. Information on DILI is derived from research studies mainly by manually trawling the literature, an inefficient process that is prone to human error. An automated artificial intelligence (AI) model capable of retrieving DILI-related articles from the vast body of literature could therefore be invaluable to the drug discovery community. In this study, we built an AI model that combines natural language processing (NLP) and machine learning (ML) to address this problem. The model uses NLP to filter out uninformative text (e.g., stop words) and customized functions to extract relevant keywords as singletons, pairs, and triplets. These keywords are processed by an apriori pattern mining algorithm to extract relevant patterns, which are used to estimate initial weightings for an ML classifier. Along with pattern importance and frequency, an FDA-approved drug list mentioning DILI adds extra confidence to the classification. Together, these methods yield a DILI classifier (DILI_C) with 94.91% cross-validation accuracy and 94.14% external validation accuracy. To make DILI_C as accessible as possible, including to researchers without coding experience, we developed an R Shiny app that can classify single or multiple entries for DILI; it is available at https://researchmind.co.uk/diliclassifier/. The app source code (https://github.com/sanjaysinghrathi/DILI-Classifier) and an extended ISMB video talk (https://www.youtube.com/watch?v=j305yIVi_f8) are available as supplementary materials.
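
The keyword and pattern-mining steps described in the abstract can be illustrated with a small sketch. This is not the authors' code (their app is written in R and linked above); it is a minimal Python illustration under stated assumptions: the stop-word list, the toy sentences, the 0.6 support threshold, and the helper names `tokens` and `apriori` are all hypothetical. It treats each text as a set of non-stop-word tokens and runs a level-wise Apriori search for frequent singletons, pairs, and triplets. How such patterns are turned into classifier weightings is not specified in the abstract and is not shown here.

```python
import re
from itertools import combinations

# Tiny illustrative stop-word list (an assumption, not the list used in the paper).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "was", "with", "to", "by", "after"}

def tokens(text):
    """Lowercase the text, keep alphabetic words, and drop stop words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS}

def apriori(transactions, min_support, max_size=3):
    """Level-wise Apriori: map each frequent itemset (frozenset) to its support."""
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1: frequent singletons.
    items = {item for t in transactions for item in t}
    current = {}
    for item in items:
        s = support(frozenset([item]))
        if s >= min_support:
            current[frozenset([item])] = s
    frequent = dict(current)

    size = 2
    while current and size <= max_size:
        prev = set(current)
        # Join step: unions of frequent (size-1)-itemsets that have exactly `size` items.
        candidates = {a | b for a in prev for b in prev if len(a | b) == size}
        # Prune step: every (size-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(sub) in prev for sub in combinations(c, size - 1))
        }
        current = {}
        for c in candidates:
            s = support(c)
            if s >= min_support:
                current[c] = s
        frequent.update(current)
        size += 1
    return frequent

# Toy corpus (hypothetical sentences, for illustration only).
docs = [
    "Acetaminophen overdose caused acute liver injury.",
    "Liver injury was reported after acetaminophen treatment.",
    "The drug improved kidney function in mice.",
]
transactions = [tokens(d) for d in docs]
patterns = apriori(transactions, min_support=0.6)

for itemset, s in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
```

In this toy run, only the words shared by the two liver-related sentences survive the support threshold, so the frequent patterns are the singletons, pairs, and triplet built from "acetaminophen", "liver", and "injury".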

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/85f7/9277181/2eda065abfaa/fgene-13-867946-g001.jpg
