Combining Literature Mining and Machine Learning for Predicting Biomedical Discoveries.

Affiliations

DRDO-BU Center for Life Sciences, Bharathiar University Campus, Coimbatore, Tamilnadu, India.

Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA.

Publication information

Methods Mol Biol. 2022;2496:123-140. doi: 10.1007/978-1-0716-2305-3_7.

Abstract

The major outcomes and insights of scientific research and clinical studies end up as publications or clinical records in unstructured text. Owing to advances in biomedical research, the volume of published literature has grown tremendously in recent years. Scientists and clinical researchers face a major challenge in staying current and in extracting hidden information from millions of published biomedical articles. Biomedical literature mining is a potential one-stop automated solution to this problem. One of the long-standing goals in biology is to discover disease-causing genes and their specific roles in personalized precision medicine and drug repurposing. However, empirical approaches and clinical confirmation are expensive and time-consuming. An in silico approach that uses text mining to identify disease-causing genes can contribute to biomarker discovery. This chapter presents a protocol that combines literature mining and machine learning to predict biomedical discoveries, with special emphasis on gene-disease relations. The protocol is presented as a literature-based discovery (LBD) pipeline and includes our web-based tools: (1) DNER (Disease Named Entity Recognizer) for disease entity recognition, (2) BCCNER (Bidirectional, Contextual clues Named Entity Tagger) for gene/protein entity recognition, (3) DisGeReExT (Disease-Gene Relation Extractor) for statistically validated results and visualization, and (4) a newly introduced deep learning-based method for association discovery. The proposed deep learning-based method can be generalized to other important biomedical discoveries involving entities such as drugs/chemicals or miRNAs.
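To make the shape of such a pipeline concrete, the following is a minimal sketch of its first stages: tagging genes and diseases, then counting how often each pair is mentioned in the same sentence. It is not the authors' implementation. DNER and BCCNER are trained recognizers and DisGeReExT applies statistical validation, whereas this sketch assumes tiny hand-written lexicons (`DISEASES`, `GENES`) and plain co-occurrence counting; all function names here are hypothetical.

```python
import re
from collections import Counter
from itertools import product

# Hypothetical toy lexicons (assumption): real recognizers such as DNER and
# BCCNER are trained taggers, not dictionary lookups.
DISEASES = {"breast cancer", "alzheimer's disease"}
GENES = {"BRCA1", "APOE", "TP53"}


def split_sentences(text):
    """Naive sentence splitter: break on whitespace after ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tag_entities(sentence):
    """Return (genes, diseases) found in a sentence by lexicon lookup."""
    lowered = sentence.lower()
    diseases = {d for d in DISEASES if d in lowered}
    # Gene symbols are matched case-sensitively as whole words.
    genes = {g for g in GENES if re.search(rf"\b{re.escape(g)}\b", sentence)}
    return genes, diseases


def cooccurrence_counts(abstracts):
    """Count sentence-level gene-disease co-mentions across abstracts."""
    counts = Counter()
    for text in abstracts:
        for sentence in split_sentences(text):
            genes, diseases = tag_entities(sentence)
            counts.update(product(sorted(genes), sorted(diseases)))
    return counts


if __name__ == "__main__":
    abstracts = [
        "BRCA1 mutations raise the risk of breast cancer. "
        "TP53 is also altered in breast cancer.",
        "APOE is linked to Alzheimer's disease.",
    ]
    for (gene, disease), n in cooccurrence_counts(abstracts).most_common():
        print(gene, disease, n)
```

Raw co-occurrence counts are only candidate associations. A fuller pipeline would score each pair statistically and pass the supporting sentences to a learned relation classifier, which is the role the abstract assigns to the later stages.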

