Department of Biomedical Informatics, University of Utah, Suite 140, 421 Wakara Way, Salt Lake City, UT 84108 USA.
Health Inf Sci Syst. 2015 Sep 28;3:3. doi: 10.1186/s13755-015-0011-0. eCollection 2015.
Predictive modeling is fundamental for extracting value from large clinical data sets, or "big clinical data," advancing clinical research, and improving healthcare. Machine learning is a powerful approach to predictive modeling. Two factors make machine learning challenging for healthcare researchers. First, before a machine learning model is trained, the values of one or more model parameters called hyper-parameters must typically be specified. Because healthcare researchers often lack experience with machine learning, it is hard for them to choose an appropriate algorithm and hyper-parameter values. Second, many clinical data are stored in a special format. These data must be iteratively transformed into the relational table format before predictive modeling can be conducted. This transformation is time-consuming and requires computing expertise.
This paper presents our vision for and design of MLBCD (Machine Learning for Big Clinical Data), a new software system aiming to address these challenges and facilitate building machine learning predictive models using big clinical data.
The paper describes MLBCD's design in detail.
By making machine learning accessible to healthcare researchers, MLBCD will open up the use of big clinical data and increase researchers' ability to foster biomedical discovery and improve care.
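To make the hyper-parameter challenge concrete, the following is a minimal sketch of hyper-parameter selection by exhaustive search with leave-one-out validation. It is an illustration only and is not taken from MLBCD: the choice of a k-nearest-neighbor classifier, the function names (`knn_predict`, `loo_accuracy`, `select_k`), the candidate values, and the toy data are all assumptions made for this example.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict a label for x by majority vote among the k nearest training points."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(data, k):
    """Leave-one-out accuracy of k-NN for a given value of the hyper-parameter k."""
    correct = 0
    for i, (x, y) in enumerate(data):
        # Hold out point i and train on all remaining points.
        train = data[:i] + data[i + 1:]
        if knn_predict(train, x, k) == y:
            correct += 1
    return correct / len(data)


def select_k(data, candidates):
    """Score every candidate k and return (best_k, scores)."""
    scores = {k: loo_accuracy(data, k) for k in candidates}
    # On ties, max() keeps the first candidate with the highest score.
    best_k = max(candidates, key=lambda k: scores[k])
    return best_k, scores


# Toy data (made up for illustration): (feature value, class label).
data = [(1.0, 0), (1.2, 0), (1.4, 0), (1.6, 0),
        (3.0, 1), (3.2, 1), (3.4, 1), (3.6, 1)]

best_k, scores = select_k(data, [1, 3, 5, 7])
print(best_k, scores)
```

On this toy data, a value of k that is too large makes every held-out point's neighborhood dominated by the other class, which is the kind of poor choice an inexperienced user could make without an automated search.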