Scientific evidence based rare disease research discovery with research funding data in knowledge graph.

Author information

Division of Pre-Clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD, 20850, USA.

ICF International Inc, Rockville, MD, USA.

Publication information

Orphanet J Rare Dis. 2021 Nov 18;16(1):483. doi: 10.1186/s13023-021-02120-9.

Abstract

BACKGROUND

Limited knowledge and unclear underlying biology of many rare diseases pose significant challenges to patients, clinicians, and scientists. To address these challenges, there is an urgent need to encourage scientists to propose and pursue innovative research studies that aim to uncover the genetic and molecular causes of more rare diseases and ultimately to identify effective therapeutic solutions. A clear understanding of current research efforts, knowledge and research gaps, and funding patterns, used as scientific evidence, is crucial to systematically accelerating the pace of research discovery in rare diseases, which is the overarching goal of this study.

METHODS

To semantically represent NIH funding data for rare diseases and advance its use in promoting rare disease research, we identified NIH-funded projects for rare diseases by mapping GARD diseases to projects based on project titles. We then represented and managed the identified projects in a knowledge graph using Neo4j software, hosted at NCATS, based on a predefined data model that captures the semantics among the data. With this knowledge graph, we performed several case studies to demonstrate scientific evidence generation in support of rare disease research discovery.
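The title-based mapping step can be illustrated with a minimal sketch. It assumes plain whole-phrase matching of disease names against normalized project titles, which is a simplified stand-in for the semantic annotation the study describes, not the authors' implementation. The identifiers, sample rows, and function names below are hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical sample data: real inputs would be GARD disease names and
# NIH ExPORTER project records; these IDs and rows are invented for illustration.
DISEASES = {
    "GARD:0000001": "cystic fibrosis",
    "GARD:0000002": "Huntington disease",
}
PROJECTS = [
    {"id": "P1", "title": "Airway inflammation in cystic fibrosis"},
    {"id": "P2", "title": "Modeling Huntington Disease progression"},
    {"id": "P3", "title": "Deep learning for cardiac imaging"},
]


def normalize(text):
    """Lowercase and collapse non-alphanumeric runs to single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def map_diseases_to_projects(diseases, projects):
    """Return {disease_id: [project_id, ...]} for whole-phrase title matches."""
    edges = defaultdict(list)
    for project in projects:
        # Pad with spaces so a disease name only matches on word boundaries.
        title = f" {normalize(project['title'])} "
        for disease_id, name in diseases.items():
            if f" {normalize(name)} " in title:
                edges[disease_id].append(project["id"])
    return dict(edges)


edges = map_diseases_to_projects(DISEASES, PROJECTS)
```

Each resulting disease-project pair corresponds to one candidate relationship that could then be loaded into a graph database such as Neo4j under a predefined data model.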

RESULTS

Of 5001 rare diseases belonging to 32 distinct disease categories, we mapped 1294 diseases to 45,647 distinct NIH-funded projects obtained from NIH ExPORTER by semantically annotating project titles. To capture the semantic relationships among the mapped research funding data, we defined a data model comprising seven primary classes and their corresponding object and data properties. We developed a Neo4j knowledge graph based on this predefined data model and performed multiple case studies over it to demonstrate its use in directing and promoting rare disease research.

CONCLUSION

We developed an integrative knowledge graph with rare disease funding data and demonstrated its use as a source from which scientific evidence supporting rare disease research can be effectively identified and generated. Building on the success of this preliminary study, we plan to implement advanced computational approaches to analyze more funding-related data, e.g., project abstracts and PubMed article abstracts, and to link them to other types of biomedical data in order to perform more sophisticated research gap analysis and identify opportunities for future research in rare diseases.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/37e4/8600882/d3ca42c2fb53/13023_2021_2120_Fig1_HTML.jpg
