
Text mining and portal development for gene-specific publications on Alzheimer's disease and other neurodegenerative diseases.

Affiliations

Department of BioHealth Informatics, Indiana University School of Informatics & Computing, Indianapolis, IN, 46202, USA.

Health Services Administration & Policy, Temple University College of Public Health, Philadelphia, PA, 19122, USA.

Publication information

BMC Med Inform Decis Mak. 2024 Apr 17;24(Suppl 3):98. doi: 10.1186/s12911-024-02501-7.

Abstract

BACKGROUND

Tremendous research efforts have been made in the Alzheimer's disease (AD) field to understand the etiology and progression of the disease and to discover treatments for it. Many mechanistic hypotheses, therapeutic targets, and treatment strategies have been proposed over the last few decades. Reviewing previous work and staying current with this ever-growing body of AD publications is an essential yet difficult task for AD researchers.

METHODS

In this study, we designed and implemented a natural language processing (NLP) pipeline to extract gene-specific, neurodegenerative disease (ND)-focused information from the PubMed database. The collected publication information was filtered and cleaned to construct AD-related gene-specific publication profiles. Six categories of AD-related information were extracted from the processed publication data: publication trend by year, dementia type occurrence, brain region occurrence, mouse model information, keyword occurrence, and co-occurring genes. A user-friendly web portal was then developed using the Django framework to provide gene query functions and data visualizations for the generalized and summarized publication information.
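
The profile-building step described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the records, the term lists `DEMENTIA_TYPES` and `BRAIN_REGIONS`, and the helpers `mentions` and `build_profile` are all hypothetical, and simple whole-phrase matching stands in for whatever NLP methods the study actually used. It covers only three of the six categories (publication trend by year, dementia type occurrence, brain region occurrence).

```python
import re
from collections import Counter

# Hypothetical records; real input would be abstracts retrieved from PubMed.
RECORDS = [
    {"pmid": "0000001", "year": 2019,
     "abstract": "APOE modulates amyloid pathology in the hippocampus in Alzheimer's disease."},
    {"pmid": "0000002", "year": 2021,
     "abstract": "APOE and TREM2 variants in frontotemporal dementia and Alzheimer's disease affect the cortex."},
    {"pmid": "0000003", "year": 2021,
     "abstract": "TREM2 expression in microglia of the hippocampus."},
]

# Illustrative term lists (assumptions, not the vocabularies used in the study).
DEMENTIA_TYPES = ["alzheimer's disease", "frontotemporal dementia", "vascular dementia"]
BRAIN_REGIONS = ["hippocampus", "cortex", "amygdala"]


def mentions(text, term):
    """Return True if term occurs in text as a whole word or phrase (case-insensitive)."""
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def build_profile(gene, records):
    """Summarize the publications whose abstract mentions the given gene symbol."""
    hits = [r for r in records if mentions(r["abstract"], gene)]
    profile = {
        "gene": gene,
        "n_publications": len(hits),
        "by_year": Counter(r["year"] for r in hits),  # publication trend by year
        "dementia_types": Counter(),                  # one count per publication
        "brain_regions": Counter(),                   # one count per publication
    }
    for r in hits:
        for term in DEMENTIA_TYPES:
            if mentions(r["abstract"], term):
                profile["dementia_types"][term] += 1
        for term in BRAIN_REGIONS:
            if mentions(r["abstract"], term):
                profile["brain_regions"][term] += 1
    return profile


if __name__ == "__main__":
    print(build_profile("APOE", RECORDS))
```

Each term is counted at most once per publication, so the counters report how many abstracts mention a term rather than how often it appears in total. That choice is an assumption of this sketch.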

RESULTS

By implementing the NLP pipeline, we extracted gene-specific ND-related publication information from the abstracts of publications in the PubMed database. The results are summarized and visualized through an interactive web query portal. Multiple visualization windows display ND publication trends, mouse models used, dementia types, brain regions involved, keywords related to major AD-related biological processes, and co-occurring genes. Direct links to PubMed are provided for all recorded publications on the query result page of the web portal.
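
Two of the displayed items, co-occurring genes and direct PubMed links, lend themselves to a short sketch. The gene list `GENE_SYMBOLS` and the helpers `genes_in`, `co_occurring_genes`, and `pubmed_url` are hypothetical names introduced for illustration. The only external fact relied on is PubMed's public record URL pattern, `https://pubmed.ncbi.nlm.nih.gov/<PMID>/`. How the portal itself computes co-occurrence is not stated in the abstract.

```python
import re
from collections import Counter

# Illustrative gene symbols (an assumption; the study's gene list is not given here).
GENE_SYMBOLS = ["APOE", "TREM2", "MAPT", "APP"]


def genes_in(text):
    """Return the set of known gene symbols that appear in text as whole words."""
    # Matching is case-sensitive because gene symbols are conventionally uppercase.
    return {g for g in GENE_SYMBOLS if re.search(r"\b" + re.escape(g) + r"\b", text)}


def co_occurring_genes(query_gene, abstracts):
    """Count, per abstract, the other genes mentioned alongside query_gene."""
    counts = Counter()
    for text in abstracts:
        found = genes_in(text)
        if query_gene in found:
            counts.update(found - {query_gene})
    return counts


def pubmed_url(pmid):
    """Build the PubMed record URL for a PMID."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pmid}/"


if __name__ == "__main__":
    sample = [
        "APOE and TREM2 interact in microglia.",
        "MAPT pathology correlates with APOE genotype and TREM2 expression.",
        "APP processing in neurons.",
    ]
    print(co_occurring_genes("APOE", sample))
    print(pubmed_url("12345"))
```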

CONCLUSION

The resulting portal is a valuable tool and data source for quickly querying and displaying AD publications tailored to users' research areas and gene targets of interest, which is especially convenient for users without informatics or text-mining skills. Our study will not only keep AD researchers updated on the progress of AD research and help them conduct preliminary examinations efficiently, but will also offer additional support for hypothesis generation and validation, contributing significantly to the communication, dissemination, and progress of AD research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c67c/11025191/f51f1c5ffe74/12911_2024_2501_Fig1_HTML.jpg
