Information Extraction from the Text Data on Traditional Chinese Medicine: A Review on Tasks, Challenges, and Methods from 2010 to 2021

Authors

Zhang Tingting, Huang Zonghai, Wang Yaqiang, Wen Chuanbiao, Peng Yangzhi, Ye Ying

Affiliations

Chengdu University of Traditional Chinese Medicine, Chengdu, China.

Chengdu University of Information Technology, Chengdu, China.

Publication information

Evid Based Complement Alternat Med. 2022 May 13;2022:1679589. doi: 10.1155/2022/1679589. eCollection 2022.

DOI: 10.1155/2022/1679589
PMID: 35600940
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9122692/
Abstract

BACKGROUND

The practice of traditional Chinese medicine (TCM) began several thousand years ago, and the knowledge of practitioners is recorded in paper and electronic versions of case notes, manuscripts, and books in multiple languages. Developing a method of information extraction (IE) from these sources to generate a cohesive data set would be a great contribution to the medical field. The goal of this study was to perform a systematic review of the status of IE from TCM sources over the last 10 years.

METHODS

We conducted a search of four literature databases for articles published from 2010 to 2021 that focused on the use of natural language processing (NLP) methods to extract information from unstructured TCM text data. Two reviewers and one adjudicator contributed to article search, article selection, data extraction, and synthesis processes.

RESULTS

We retrieved 1234 records, 49 of which met our inclusion criteria. We used the articles to (i) assess the key tasks of IE in the TCM domain, (ii) summarize the challenges to extracting information from TCM text data, and (iii) identify effective frameworks, models, and key findings of TCM IE through classification.

CONCLUSIONS

Our analysis showed that IE from TCM text data has improved over the past decade. However, it still faces challenges, including the lack of gold standard corpora, nonstandardized expressions, and multiple types of relations. Future IE work should extract more of the existing entities and relations, construct gold standard data sets, and explore IE methods that rely on a small amount of labeled data. Fine-grained and interpretable IE technologies also warrant further exploration.
