Lin Ko-Wei, Hsieh Alexander, Farzaneh Seena, Doan Son, Kim Hyeoneui
Division of Biomedical Informatics, University of California, San Diego, La Jolla, CA.
AMIA Jt Summits Transl Sci Proc. 2013 Mar 18;2013:110. eCollection 2013.
This paper describes an information-model-based approach to standardizing phenotype variables in dbGaP. Our attempt to reuse the existing information models of Clinical Element Models (CEM) was not successful, although CEM provided a robust means of representing clinical data. We therefore developed information models derived from phenotype variable descriptions and standardized the phenotype variables by fitting them into these models using a simple Natural Language Processing (NLP) algorithm. We report our experience of applying this approach to findings-related variables, which tend to be more idiosyncratic and thus pose more challenges to standardization.