Human Language Technology Research Institute, University of Texas at Dallas, Richardson, Texas 75083-0688, USA.
J Am Med Inform Assoc. 2011 Sep-Oct;18(5):594-600. doi: 10.1136/amiajnl-2011-000153.
A supervised machine learning approach to discover relations between medical problems, treatments, and tests mentioned in electronic medical records.
A single support vector machine classifier was used to identify relations between concepts and to assign their semantic type. The classifier was informed by several resources, including Wikipedia, WordNet, and General Inquirer, as well as a relation similarity metric.
The techniques reported in this paper were evaluated in the 2010 i2b2 Challenge and obtained the highest F1 score for the relation extraction task. F1 is defined as 2 × Precision × Recall / (Precision + Recall). When gold standard data for concepts and assertions were available, F1 was 73.7, precision was 72.0, and recall was 75.3. When concepts and assertions were instead discovered automatically, F1 was 48.4, precision was 57.6, and recall was 41.7.
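The F1 definition above is the harmonic mean of precision and recall. A minimal sketch of that calculation follows; the function name `f1_score` is illustrative and not taken from the paper. Values recomputed from the rounded precision and recall in the abstract may differ from the reported F1 in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same scale as the inputs)."""
    if precision + recall == 0:
        # Avoid division by zero when both inputs are zero.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision/recall pairs as reported in the abstract (percentages).
gold = f1_score(72.0, 75.3)   # gold-standard concepts and assertions
auto = f1_score(57.6, 41.7)   # automatically discovered concepts and assertions

print(f"gold-standard setting: F1 = {gold:.2f}")
print(f"automatic setting:     F1 = {auto:.2f}")
```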
Although a rich set of features was developed for the classifiers presented in this paper, little knowledge mining was performed from medical ontologies such as those found in UMLS. Future studies should incorporate features extracted from such knowledge sources, which we expect to further improve the results. Moreover, each relation discovery was treated independently; joint classification of relations may further improve the quality of results. Joint learning of the discovery of concepts, assertions, and relations may also improve the results of automatic relation extraction.
Lexical and contextual features proved to be very important in relation extraction from medical texts. When they are not available to the classifier, the F1 score decreases by 3.7%. Similarly, when the similarity-based features are not available, the F1 score decreases by 1.1%.