Dataset of miRNA-disease relations extracted from textual data using transformer-based neural networks.

Affiliations

Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), Schloss Birlinghoven, 53757 Sankt Augustin, Germany.

Knowledge Management, German National Library of Medicine (ZB MED)-Information Centre for Life Sciences, Friedrich-Hirzebruch-Allee 4, Bonn 53115, Germany.

Publication information

Database (Oxford). 2024 Aug 5;2024. doi: 10.1093/database/baae066.

Abstract

MicroRNAs (miRNAs) play important roles in post-transcriptional processes and regulate major cellular functions. Abnormal regulation of miRNA expression has been linked to numerous human diseases, such as respiratory diseases, cancer, and neurodegenerative diseases. The latest miRNA-disease associations are predominantly found in unstructured biomedical literature. Retrieving these associations manually is cumbersome and time-consuming because the number of publications keeps growing. We propose a deep learning-based text mining approach that extracts normalized miRNA-disease associations from biomedical literature. To train the deep learning models, we build a new training corpus that is extended by distant supervision using multiple external databases. A quantitative evaluation shows that the workflow achieves an area under the receiver operating characteristic curve (AUROC) of 98% on a holdout test set for the detection of miRNA-disease associations. We demonstrate the applicability of the approach by extracting new miRNA-disease associations from biomedical literature (PubMed and PubMed Central). Quantitative analysis and evaluation on three neurodegenerative diseases show that the approach can effectively extract miRNA-disease associations not yet available in public databases. Database URL: https://zenodo.org/records/10523046.
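
For readers unfamiliar with the reported metric, the area under the receiver operating characteristic curve can be read as the probability that a randomly chosen positive example (a sentence that does express a miRNA-disease association) receives a higher classifier score than a randomly chosen negative one. The following is a minimal sketch of that definition only, not the authors' evaluation code; the function name `auroc` and the toy labels and scores are made up for illustration.

```python
def auroc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Returns the fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): 1 = association present, 0 = absent.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(auroc(labels, scores))
```

On this toy input, eight of the nine positive-negative pairs are ranked correctly, so the value is 8/9. The pairwise loop is quadratic in the number of examples; for large test sets a rank-based computation gives the same value more efficiently.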
