Bellazzi Riccardo, Zupan Blaz
Dipartimento di Informatica e Sistemistica, Università di Pavia, via Ferrata 1, 27100 Pavia, Italy.
Int J Med Inform. 2008 Feb;77(2):81-97. doi: 10.1016/j.ijmedinf.2006.11.006. Epub 2006 Dec 26.
The widespread availability of new computational methods and tools for data analysis and predictive modeling requires medical informatics researchers and practitioners to systematically select the most appropriate strategy to cope with clinical prediction problems. In particular, the collection of methods known as 'data mining' offers methodological and technical solutions for the analysis of medical data and the construction of prediction models. The large variety of these methods calls for general and simple guidelines that help practitioners select appropriate data mining tools, construct and validate predictive models, and disseminate those models within clinical environments.
The goal of this review is to discuss the extent and role of the research area of predictive data mining and to propose a framework to cope with the problems of constructing, assessing and exploiting data mining models in clinical medicine.
We review recent relevant work published in the area of predictive data mining in clinical medicine, highlighting critical issues and summarizing the approaches as a set of lessons learned.
The paper provides a comprehensive review of the state of the art of predictive data mining in clinical medicine and gives guidelines to carry out data mining studies in this field.
Predictive data mining is becoming an essential instrument for researchers and clinical practitioners in medicine. Understanding the main issues underlying these methods and applying agreed, standardized procedures are mandatory for their deployment and for the dissemination of results. Thanks to the integration of molecular and clinical data taking place within genomic medicine, the area has recently gained not only fresh impetus but also a new set of complex problems to address.