ERNIE-UIE: Advancing information extraction in Chinese medical knowledge graph.

Authors

Li Bei, Li Changbiao, Sun Jianwei, Zeng Xu, Chen Xiaofan, Zheng Jing

Affiliations

Department of Biomedical Informatics, School of Life Science, Central South University, Changsha, Hunan, China.

Shenzhen Health Development Research and Data Management Center, Shenzhen, Guangdong, China.

Publication information

PLoS One. 2025 May 29;20(5):e0325082. doi: 10.1371/journal.pone.0325082. eCollection 2025.

DOI:10.1371/journal.pone.0325082
PMID:40440330
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12121792/
Abstract

BACKGROUND

The field of information extraction (IE) is exploring more versatile and efficient methods that reduce reliance on large annotated datasets and integrate knowledge across tasks and domains.

OBJECTIVE

We aim to evaluate and refine the application of the universal IE (UIE) technology in the field of Chinese medical expertise in terms of processing accuracy and efficiency.

METHODS

Our model integrates ontology modeling, web scraping, UIE, fine-tuning strategies, and graph databases, thereby covering knowledge modeling, extraction, and storage techniques. The Enhanced Representation through Knowledge Integration-UIE (ERNIE-UIE) model is fine-tuned and optimized using a small amount of annotated data. A medical knowledge graph is then constructed, followed by validating the graph and conducting knowledge mining on the data stored within it.
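The extraction-to-storage step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the nested result layout is an assumption modeled on schema-based UIE output, the sample entities are invented, and the Cypher-style query with its `Entity` label and `RELATED` relationship type is hypothetical (the abstract does not name the graph database used).

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical parameterized query; the label and relationship names are
# assumptions for illustration, not taken from the paper.
MERGE_QUERY = (
    "MERGE (h:Entity {name: $head}) "
    "MERGE (t:Entity {name: $tail}) "
    "MERGE (h)-[:RELATED {type: $relation}]->(t)"
)


def uie_result_to_triples(result: Dict[str, list]) -> List[Triple]:
    """Flatten one nested extraction result into (head, relation, tail) triples.

    Assumed layout: {entity_type: [{"text": ..., "relations": {rel: [{"text": ...}]}}]}
    """
    triples: List[Triple] = []
    for spans in result.values():
        for span in spans:
            head = span["text"]
            # Entities without extracted relations contribute no triples.
            for relation, tails in span.get("relations", {}).items():
                for tail in tails:
                    triples.append((head, relation, tail["text"]))
    return triples


def triple_to_cypher(triple: Triple) -> Tuple[str, Dict[str, str]]:
    """Pair the query template with parameters instead of inlining strings."""
    head, relation, tail = triple
    return MERGE_QUERY, {"head": head, "relation": relation, "tail": tail}


# Invented sample data, for illustration only.
sample_result = {
    "disease": [
        {
            "text": "hypertension",
            "relations": {
                "symptom": [{"text": "headache"}, {"text": "dizziness"}],
            },
        }
    ]
}

for triple in uie_result_to_triples(sample_result):
    query, params = triple_to_cypher(triple)
    print(params)
```

Passing values as query parameters rather than formatting them into the query string avoids quoting problems with free-text entity names.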

RESULTS

Incorporating the characteristics of whole-course management, we implemented a comprehensive medical knowledge graph-construction model and methodology. Entities and relationships were jointly extracted using the pretrained language model, resulting in 8,525 entity data points and 9,522 triple data points. The accuracy of the knowledge graph was verified using graph algorithms.
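The abstract does not say which graph algorithms were used for verification. As one possible sanity check, and purely as an assumption, the sketch below groups the entities of a triple set into connected components, so that small fragments detached from the main graph can be flagged for manual review. The function name and the sample triples are hypothetical.

```python
from collections import defaultdict, deque
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def connected_components(triples: Iterable[Triple]) -> List[Set[str]]:
    """Return entity groups that are linked when edge direction is ignored."""
    adjacency: Dict[str, Set[str]] = defaultdict(set)
    for head, _relation, tail in triples:
        adjacency[head].add(tail)
        adjacency[tail].add(head)

    seen: Set[str] = set()
    components: List[Set[str]] = []
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            current = queue.popleft()
            for neighbor in adjacency[current]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)

    # Largest component first; small trailing ones are candidates for review.
    return sorted(components, key=len, reverse=True)


# Invented sample triples, for illustration only.
sample_triples = [
    ("hypertension", "symptom", "headache"),
    ("hypertension", "treated_by", "amlodipine"),
    ("asthma", "symptom", "wheezing"),
]

for component in connected_components(sample_triples):
    print(len(component), sorted(component))
```

A check like this only reveals structural anomalies; it does not by itself confirm that an individual triple is factually correct.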

CONCLUSION

We optimized the construction process of a Chinese medical knowledge graph with minimal annotated data by utilizing a generative extraction paradigm, validating the graph's efficacy and achieving commendable results. This approach addresses the challenge of insufficient annotated training corpora in low-resource knowledge graph construction, thereby contributing to cost savings in the development of knowledge graphs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/2998de835856/pone.0325082.g011.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/6fae1d98ea4d/pone.0325082.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/bd296b3e35b1/pone.0325082.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/8849a30f175d/pone.0325082.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/2def4138c1b9/pone.0325082.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/5499afbc8753/pone.0325082.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/cf19cede8f21/pone.0325082.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/0b4a9fbda841/pone.0325082.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/64ed4d250bcf/pone.0325082.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/1b99fbbe649e/pone.0325082.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4a67/12121792/4bc135ac64f1/pone.0325082.g010.jpg

Similar articles

1
ERNIE-UIE: Advancing information extraction in Chinese medical knowledge graph.
PLoS One. 2025 May 29;20(5):e0325082. doi: 10.1371/journal.pone.0325082. eCollection 2025.
2
Automatic knowledge extraction from Chinese electronic medical records and rheumatoid arthritis knowledge graph construction.
Quant Imaging Med Surg. 2023 Jun 1;13(6):3873-3890. doi: 10.21037/qims-22-1158. Epub 2023 May 8.
3
An Automatic and End-to-End System for Rare Disease Knowledge Graph Construction Based on Ontology-Enhanced Large Language Models: Development Study.
JMIR Med Inform. 2024 Dec 18;12:e60665. doi: 10.2196/60665.
4
Prompt Framework for Extracting Scale-Related Knowledge Entities from Chinese Medical Literature: Development and Evaluation Study.
J Med Internet Res. 2025 Mar 18;27:e67033. doi: 10.2196/67033.
5
TCMSF: A Construction Framework of Traditional Chinese Medicine Syndrome Ancient Book Knowledge Graph.
Methods Inf Med. 2024 Dec;63(5-06):183-194. doi: 10.1055/a-2590-6348. Epub 2025 Apr 17.
6
Chinese Clinical Named Entity Recognition From Electronic Medical Records Based on Multisemantic Features by Using Robustly Optimized Bidirectional Encoder Representation From Transformers Pretraining Approach Whole Word Masking and Convolutional Neural Networks: Model Development and Validation.
JMIR Med Inform. 2023 May 10;11:e44597. doi: 10.2196/44597.
7
Large Language Model-Driven Knowledge Graph Construction in Sepsis Care Using Multicenter Clinical Databases: Development and Usability Study.
J Med Internet Res. 2025 Mar 27;27:e65537. doi: 10.2196/65537.
8
Automatic extraction of protein-protein interactions using grammatical relationship graph.
BMC Med Inform Decis Mak. 2018 Jul 23;18(Suppl 2):42. doi: 10.1186/s12911-018-0628-4.
9
A Chinese Knowledge Graph Dataset in the Field of Scientific Fitness.
Sci Data. 2025 Feb 4;12(1):205. doi: 10.1038/s41597-025-04519-6.
10
ARCH: Large-scale knowledge graph via aggregated narrative codified health records analysis.
J Biomed Inform. 2025 Feb;162:104761. doi: 10.1016/j.jbi.2024.104761. Epub 2025 Jan 23.
