Verduijn Marion, Sacchi Lucia, Peek Niels, Bellazzi Riccardo, de Jonge Evert, de Mol Bas A J M
Department of Medical Informatics, Academic Medical Center, P.O. Box 22700, 1100 DE Amsterdam, The Netherlands.
Artif Intell Med. 2007 Sep;41(1):1-12. doi: 10.1016/j.artmed.2007.06.003. Epub 2007 Aug 14.
To compare two temporal abstraction procedures for the extraction of meta features from monitoring data. Feature extraction prior to predictive modeling is a common strategy in prediction from temporal data. A fundamental dilemma in this strategy, however, is the extent to which the extraction should be guided by domain knowledge and the extent to which it should be guided by the available data. The two temporal abstraction procedures compared in this case study differ in this respect.
The first temporal abstraction procedure derives from the data symbolic descriptions that are predefined using existing concepts from the medical language. In the second procedure, a large space of numerical meta features is searched to discover relevant features from the data. These procedures were applied to a prediction problem from intensive care monitoring data. The predictive value of the resulting meta features was compared, and for each type of feature, a class probability tree model was developed.
The numerical meta features extracted by the second procedure were found to be more informative than the symbolic meta features of the first procedure in the case study, and a superior predictive performance was observed for the associated tree model.
The findings indicate that for prediction from monitoring data, induction of numerical meta features from data is preferable to extraction of symbolic meta features using existing clinical concepts.
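As a rough illustration of the contrast between the two procedures compared above, the following Python sketch derives both kinds of meta features from a single monitored series. It is a minimal sketch under stated assumptions, not the procedures used in the study: the function names, the thresholds (`low`, `high`, `trend_tol`), the window lengths, and the example heart-rate values are all hypothetical, and neither the search over the feature space nor the class probability tree is reproduced here.

```python
from statistics import mean, pstdev


def symbolic_abstraction(values, low, high, trend_tol):
    """Knowledge-driven sketch: map a series to predefined state/trend symbols.

    The thresholds are hypothetical stand-ins for clinically defined concepts.
    """
    avg = mean(values)
    if avg < low:
        state = "low"
    elif avg > high:
        state = "high"
    else:
        state = "normal"

    # Trend is judged from the net change between first and last measurement.
    change = values[-1] - values[0]
    if change > trend_tol:
        trend = "increasing"
    elif change < -trend_tol:
        trend = "decreasing"
    else:
        trend = "steady"
    return {"state": state, "trend": trend}


def least_squares_slope(values):
    """Slope of the least-squares line through equally spaced measurements."""
    n = len(values)
    if n < 2:
        return 0.0  # a slope is undefined for a single point; 0.0 is a convention
    t_mean = (n - 1) / 2
    v_mean = mean(values)
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def numerical_meta_features(values, windows):
    """Data-driven sketch: enumerate candidate numerical features.

    Each trailing window contributes a mean, a standard deviation, and a slope.
    A later selection step (not shown) would keep only the informative ones.
    """
    features = {}
    for w in windows:
        segment = values[-w:]
        features[f"mean_last_{w}"] = mean(segment)
        features[f"sd_last_{w}"] = pstdev(segment)
        features[f"slope_last_{w}"] = least_squares_slope(segment)
    return features


# Hypothetical hourly heart-rate measurements for one patient.
heart_rate = [80, 82, 85, 90, 96, 103]
symbols = symbolic_abstraction(heart_rate, low=60, high=100, trend_tol=10)
candidates = numerical_meta_features(heart_rate, windows=[3, 6])
```

In this sketch the symbolic route compresses the series into two labels fixed in advance, whereas the numerical route yields a larger set of candidate values from which a learning algorithm could select.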