Automatic symptom name normalization in clinical records of traditional Chinese medicine.

Affiliation

Department of Computer Science, Sichuan University, Chengdu, Sichuan, PR China.

Publication information

BMC Bioinformatics. 2010 Jan 20;11:40. doi: 10.1186/1471-2105-11-40.

DOI:10.1186/1471-2105-11-40
PMID:20089162
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3098075/
Abstract

BACKGROUND

In recent years, data mining has been applied more widely than ever in traditional Chinese medicine (TCM) to discover regularities in the experience accumulated over thousands of years in China. Electronic medical records (clinical records) of TCM contain more information, such as details of the medical treatment process, than the well-structured prescription data extracted manually from TCM literature, and could therefore be an important source for discovering valuable regularities of TCM. However, they are collected by TCM doctors on a day-to-day basis without the support of an authoritative editorial board, and because doctors differ in experience and background, the same concept may be described by several different terms. Clinical records of TCM therefore cannot be used directly for data mining and knowledge discovery. This paper focuses on the phenomenon of "one symptom with different names" and investigates a series of metrics for automatically normalizing symptom names in clinical records of TCM.

RESULTS

A series of extensive experiments was performed to validate the proposed metrics. They show that hybrid similarity metrics, which integrate literal similarity and remedy-based similarity, are more accurate than metrics based on literal similarity or remedy-based similarity alone. The highest F-measure of all the metrics (65.62%) is achieved by the hybrid similarity metric VSM+TFIDF+SWD.
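The idea of a hybrid metric can be illustrated with a minimal sketch. The abstract does not define the components of VSM+TFIDF+SWD or how they are weighted, so everything below is an assumption made for illustration: a character-overlap (Dice) score stands in for literal similarity, a cosine over TF-IDF herb vectors stands in for remedy-based similarity, and the two are blended with a hypothetical weight `alpha`. The symptom names and herb lists in `TOY_REMEDIES` are toy data, not taken from the paper.

```python
from __future__ import annotations

import math
from collections import Counter


def literal_similarity(a: str, b: str) -> float:
    """Dice overlap of characters; an assumed stand-in for a literal metric."""
    ca, cb = Counter(a), Counter(b)
    overlap = sum((ca & cb).values())  # multiset intersection size
    total = len(a) + len(b)
    return 2 * overlap / total if total else 0.0


def tfidf_vectors(remedies: dict[str, list[str]]) -> dict[str, dict[str, float]]:
    """Build one TF-IDF vector over herbs for each symptom name."""
    n = len(remedies)
    # Document frequency: in how many symptom "documents" each herb appears.
    df = Counter(h for herbs in remedies.values() for h in set(herbs))
    vectors = {}
    for name, herbs in remedies.items():
        tf = Counter(herbs)
        # Smoothed IDF so that no weight collapses to zero.
        vectors[name] = {
            h: c * (math.log((1 + n) / (1 + df[h])) + 1.0) for h, c in tf.items()
        }
    return vectors


def cosine(u: dict[str, float], v: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(h, 0.0) for h, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def hybrid_similarity(a: str, b: str, vectors: dict[str, dict[str, float]],
                      alpha: float = 0.5) -> float:
    """Weighted blend of literal and remedy-based similarity (alpha is hypothetical)."""
    remedy = cosine(vectors[a], vectors[b])
    return alpha * literal_similarity(a, b) + (1 - alpha) * remedy


# Toy data: herbs prescribed alongside each recorded symptom name (illustrative only).
TOY_REMEDIES = {
    "headache": ["chuanxiong", "baizhi", "chuanxiong"],
    "head pain": ["chuanxiong", "baizhi"],
    "cough": ["xingren", "jiegeng"],
}
VECTORS = tfidf_vectors(TOY_REMEDIES)

if __name__ == "__main__":
    for other in ("head pain", "cough"):
        score = hybrid_similarity("headache", other, VECTORS)
        print(f"headache vs {other}: {score:.3f}")
```

In this toy setting, "headache" and "head pain" score higher than "headache" and "cough" because they share both characters and prescribed herbs, which is the intuition behind combining the two signals.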

CONCLUSIONS

Automatic symptom name normalization is an essential task for discovering knowledge from clinical data of TCM, and this paper introduces the problem for the first time. The results verify that the investigated metrics are reasonable and accurate, and that the hybrid similarity metrics perform much better than metrics based on literal similarity or remedy-based similarity alone.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/8f37de9c21d8/1471-2105-11-40-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/63465bf718ad/1471-2105-11-40-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/e983f429118a/1471-2105-11-40-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/79258a8e3540/1471-2105-11-40-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/79e33188e903/1471-2105-11-40-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/d2e0692fec0b/1471-2105-11-40-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/795ed1d587f7/1471-2105-11-40-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3114/3098075/c7a22d8dbcfe/1471-2105-11-40-7.jpg
