Rosales Rómer, Farooq Faisal, Krishnapuram Balaji, Yu Shipeng, Fung Glenn
Knowledge Solutions, Siemens Healthcare, Malvern, PA, USA.
AMIA Annu Symp Proc. 2010 Nov 13;2010:682-6.
This paper describes a machine-learning text-processing approach for extracting key medical information from unstructured text in Electronic Medical Records. The approach uses a novel text representation that shares the simplicity of the widely used bag-of-words representation but can also capture some of the semantic information in the text. The high dimensionality of this type of learning model is controlled with ℓ1 regularization, which favors parsimonious models. Experimental results demonstrate the accuracy of the approach in extracting medical assertions that can be associated with polarity and relevance detection.
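To make the combination of a bag-of-words representation and ℓ1 regularization concrete, here is a minimal sketch of the general idea. It is not the paper's method. It uses a plain bag-of-words (the novel semantic representation is not specified in this abstract), a logistic-regression classifier, and a proximal-gradient solver with soft-thresholding; all three are assumptions chosen for illustration. The sentences, labels, function names, and hyperparameters (`lam`, `step`, `iters`) are likewise hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy assertions: label 1 = affirmed, 0 = negated (polarity).
DOCS = [
    ("patient denies chest pain", 0),
    ("no evidence of pneumonia", 0),
    ("patient reports chest pain", 1),
    ("evidence of pneumonia", 1),
    ("patient denies fever", 0),
    ("patient reports fever", 1),
]

def tokenize(text):
    return text.lower().split()

def bag_of_words(text, vocab):
    """Count occurrences of each vocabulary word; unseen words are ignored."""
    counts = Counter(tokenize(text))
    return [float(counts[word]) for word in vocab]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def soft_threshold(value, threshold):
    """Proximal operator of the l1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def score(x, weights, bias):
    return sum(w * xj for w, xj in zip(weights, x)) + bias

def fit_l1_logistic(X, y, lam=0.05, step=0.5, iters=3000):
    """Minimize mean logistic loss + lam * ||w||_1 by proximal gradient descent.

    The bias term is left unpenalized (an assumption of this sketch).
    """
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(score(xi, weights, bias)) - yi
            grad_b += err / n
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj / n
        # Gradient step on the smooth loss, then soft-threshold each weight.
        weights = [soft_threshold(w - step * g, step * lam)
                   for w, g in zip(weights, grad_w)]
        bias -= step * grad_b
    return weights, bias

def predict(text, vocab, weights, bias):
    return 1 if score(bag_of_words(text, vocab), weights, bias) > 0 else 0

vocab = sorted({tok for text, _ in DOCS for tok in tokenize(text)})
X = [bag_of_words(text, vocab) for text, _ in DOCS]
y = [label for _, label in DOCS]
weights, bias = fit_l1_logistic(X, y)

# Words whose weight survived the l1 penalty (all others are exactly zero).
selected = {word: round(w, 2) for word, w in zip(vocab, weights) if w != 0.0}
print(selected)
```

On this toy data the penalty drives the weights of words that carry no polarity signal (such as "patient" or "pneumonia") to exactly zero and keeps only cue words such as "denies", "no", and "reports". This is the sense in which ℓ1 regularization favors parsimonious models in a high-dimensional feature space.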