Tortajada Salvador, Robles Montserrat, García-Gómez Juan Miguel
IBIME, Instituto de Aplicaciones de las Tecnologías de la Información y de las Comunicaciones Avanzadas (ITACA), Universitat Politècnica de València, Valencia, Spain
Methods Mol Biol. 2015;1246:57-78. doi: 10.1007/978-1-4939-1985-7_4.
In recent decades, following new trends in medicine, statistical learning techniques have been used to develop automatic diagnostic models that aid clinical experts through Clinical Decision Support Systems. Developing these models requires a large, representative dataset, which is commonly obtained from one hospital or a group of hospitals after expensive and time-consuming gathering, preprocessing, and validation of cases. Once developed, a model has to pass an external validation that is often carried out in a different hospital or health center. Experience shows that the models perform below expectations in this setting. Furthermore, sending and storing patient data requires ethical approval and patient consent. For these reasons, we introduce an incremental learning algorithm based on the Bayesian inference approach that may allow us to build an initial model with a smaller number of cases and update it incrementally when new data are collected, or even to recalibrate a model from a different center using a reduced number of cases. The performance of our algorithm is demonstrated on different benchmark datasets and a real brain tumor dataset. We compare it with a previous incremental algorithm and a non-incremental Bayesian model, showing that the algorithm is independent of the data model, iterative, and has good convergence.
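The general idea of incremental Bayesian updating mentioned in the abstract can be illustrated with a toy conjugate model. The sketch below is not the algorithm of this chapter: it assumes a single Bernoulli parameter with a Beta prior, and the class name `BetaPosterior` and the two batches of binary outcomes are hypothetical. It only shows that using the posterior from one batch as the prior for the next gives the same result as fitting all cases at once, which is what makes updating without storing earlier data possible in this simple setting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaPosterior:
    """Beta(alpha, beta) belief about a Bernoulli success probability.

    Hypothetical illustration only; not the model used in the chapter.
    """
    alpha: float = 1.0  # prior pseudo-count of positive cases
    beta: float = 1.0   # prior pseudo-count of negative cases

    def update(self, outcomes):
        """Return the posterior after observing a batch of 0/1 outcomes."""
        positives = sum(outcomes)
        negatives = len(outcomes) - positives
        return BetaPosterior(self.alpha + positives, self.beta + negatives)

    @property
    def mean(self):
        """Posterior mean of the success probability."""
        return self.alpha / (self.alpha + self.beta)


# Two made-up batches, e.g. cases arriving from two collection rounds.
batch_a = [1, 0, 1, 1]
batch_b = [0, 0, 1]

# Incremental: the posterior after batch_a acts as the prior for batch_b.
incremental = BetaPosterior().update(batch_a).update(batch_b)

# Non-incremental: all cases pooled and fitted in one step.
pooled = BetaPosterior().update(batch_a + batch_b)

print(incremental == pooled)
```

For non-conjugate models, such as most diagnostic classifiers, the posterior generally has no closed form, so an incremental scheme would need an approximation at each step; the equality shown here is specific to this toy case.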