

Leveraging generative AI to assist biocuration of medical actions for rare disease.

Author information

Niyonkuru Enock, Caufield J Harry, Carmody Leigh C, Gargano Michael A, Toro Sabrina, Whetzel Patricia L, Blau Hannah, Soto Gomez Mauricio, Casiraghi Elena, Chimirri Leonardo, Reese Justin T, Valentini Giorgio, Haendel Melissa A, Mungall Christopher J, Robinson Peter N

Affiliations

Trinity College, Hartford, CT 06106, United States.

The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, United States.

Publication information

Bioinform Adv. 2025 Jun 12;5(1):vbaf141. doi: 10.1093/bioadv/vbaf141. eCollection 2025.

Abstract

MOTIVATION

Structured representations of clinical data can support computational analysis of individuals and cohorts, and ontologies representing disease entities and phenotypic abnormalities are now commonly used for translational research. The Medical Action Ontology (MAxO) provides a computational representation of treatments and other actions taken for clinical management. Currently, manual biocuration is used to annotate MAxO terms to rare diseases. However, it is challenging to scale manual curation to comprehensively capture information about medical actions for the more than 10 000 rare diseases.

RESULTS

We present AutoMAxO, a semi-automated workflow that leverages Large Language Models (LLMs) to streamline MAxO biocuration. AutoMAxO first uses LLMs to retrieve candidate curations from abstracts of relevant publications. Next, the candidate curations are matched to ontology terms from MAxO, Human Phenotype Ontology (HPO), and MONDO disease ontology via a combination of LLMs and post-processing techniques. Finally, the matched terms are presented in a structured form to a human curator for approval. We used this approach to process abstracts related to 37 rare genetic diseases and identified 958 novel treatment annotations that were transferred to the MAxO annotation dataset.
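The three stages described above (candidate extraction, ontology matching, curator approval) can be illustrated with a minimal sketch. This is not AutoMAxO's actual code or API: every name here (`Candidate`, `extract_candidates`, `ground`, `review`, `stub_llm`, `TOY_LABELS`) is hypothetical, the LLM is replaced by a stub function, the matching step is reduced to an exact label lookup rather than the combination of LLMs and post-processing the authors describe, and the identifiers are placeholders rather than real MAxO, HPO, or MONDO terms.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

# Free-text (medical action, phenotype, disease) triple, as an LLM might return it.
Triple = Tuple[str, str, str]


@dataclass
class Candidate:
    """One candidate curation plus the ontology IDs it was matched to (if any)."""
    action: str
    phenotype: str
    disease: str
    source: str  # identifier of the abstract the candidate came from
    action_id: Optional[str] = None
    phenotype_id: Optional[str] = None
    disease_id: Optional[str] = None


def extract_candidates(source: str, abstract: str,
                       llm: Callable[[str], List[Triple]]) -> List[Candidate]:
    """Stage 1: turn the triples returned by an LLM-like callable into candidates."""
    return [Candidate(a, p, d, source) for a, p, d in llm(abstract)]


def ground(candidate: Candidate, labels: Dict[str, str]) -> Candidate:
    """Stage 2 (simplified): match free text to term IDs by exact lowercase label."""
    def lookup(text: str) -> Optional[str]:
        return labels.get(text.strip().lower())

    candidate.action_id = lookup(candidate.action)
    candidate.phenotype_id = lookup(candidate.phenotype)
    candidate.disease_id = lookup(candidate.disease)
    return candidate


def review(candidates: List[Candidate],
           approve: Callable[[Candidate], bool]) -> List[Candidate]:
    """Stage 3: keep only the candidates the curator callback approves."""
    return [c for c in candidates if approve(c)]


# --- Toy data below: placeholders for illustration, not real ontology content ---

TOY_LABELS = {
    "physical therapy": "MAXO:TOY_0001",
    "muscle weakness": "HP:TOY_0002",
    "example disease": "MONDO:TOY_0003",
}


def stub_llm(abstract: str) -> List[Triple]:
    """Stand-in for an LLM call; returns a fixed triple when a keyword is present."""
    if "physical therapy" in abstract.lower():
        return [("Physical therapy", "Muscle weakness", "Example disease")]
    return []


def fully_grounded(c: Candidate) -> bool:
    """Stand-in for the human curator: approve only fully matched candidates."""
    return None not in (c.action_id, c.phenotype_id, c.disease_id)


candidates = extract_candidates(
    "abstract-1", "Physical therapy improved muscle weakness.", stub_llm)
grounded = [ground(c, TOY_LABELS) for c in candidates]
approved = review(grounded, fully_grounded)
```

In a real workflow the `approve` callback would be a person inspecting the structured output, which is the semi-automated part of the design: the model proposes and a curator decides.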

AVAILABILITY AND IMPLEMENTATION

AutoMAxO is a Python package freely available at https://github.com/monarch-initiative/automaxo.


