Unsupervised method for extracting machine understandable medical knowledge from a large free text collection.

Authors

Xu Rong, Das Amar K, Garber Alan M

Affiliation

Center for Biomedical Informatics Research.

Publication

AMIA Annu Symp Proc. 2009 Nov 14;2009:709-13.

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC2815389/

Abstract

Definitions of medical concepts (e.g., diseases and drugs) are essential background knowledge for researchers, clinicians, and health care consumers. However, the rapid growth of biomedical research means that such knowledge needs continual updating. To address this problem, we have developed an unsupervised pattern learning approach that extracts disease and drug definitions from automatically structured randomized clinical trial (RCT) abstracts. In addition, each extracted definition is semantically classified without relying on external medical knowledge. When used to identify definitions from 100 manually annotated RCT abstracts, our medical definition knowledge base has a precision of 0.97, recall of 0.93, F1 of 0.94, and semantic classification accuracy of 0.96.
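As a rough illustration of the kind of pipeline the abstract describes, the sketch below pulls a definition out of a sentence with one lexical pattern and then scores extractions against a gold standard using precision, recall, and F1. The appositive pattern ("X, a/an Y,"), the function names `extract_definitions` and `precision_recall_f1`, and the toy data are all assumptions made for illustration. The abstract does not list the patterns the authors learned, and their patterns are learned automatically rather than hand-written as here.

```python
import re

# Hypothetical hand-written pattern: "<Term>, a/an <definition>,".
# The paper learns its patterns without supervision; this single regex
# only illustrates what applying one such pattern could look like.
APPOSITIVE = re.compile(
    r"\b(?P<term>[A-Z][a-z]+), (?:a|an) (?P<definition>[a-z][a-z -]*?),"
)


def extract_definitions(sentence):
    """Return (term, definition) pairs matched by the appositive pattern."""
    return [
        (m.group("term"), m.group("definition"))
        for m in APPOSITIVE.finditer(sentence)
    ]


def precision_recall_f1(extracted, gold):
    """Score a collection of extracted items against gold-standard items."""
    extracted, gold = set(extracted), set(gold)
    true_positives = len(extracted & gold)
    precision = true_positives / len(extracted) if extracted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy sentence and toy gold standard (not taken from the paper).
pairs = extract_definitions(
    "Metformin, an oral antidiabetic drug, was compared with placebo."
)
gold = {
    ("Metformin", "oral antidiabetic drug"),
    ("Warfarin", "oral anticoagulant"),
}
precision, recall, f1 = precision_recall_f1(pairs, gold)
print(pairs, precision, recall, f1)
```

F1 here is the harmonic mean of precision and recall, so a system that extracts few but correct definitions (high precision, low recall) still receives a low F1.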
