Feature engineering with clinical expert knowledge: A case study assessment of machine learning model complexity and performance.

Affiliations

Johns Hopkins Malone Center for Engineering in Healthcare, Johns Hopkins University, Baltimore, MD, United States of America.

The Institute of Clinical and Translational Research, Johns Hopkins University, Baltimore, MD, United States of America.

Publication information

PLoS One. 2020 Apr 23;15(4):e0231300. doi: 10.1371/journal.pone.0231300. eCollection 2020.

Abstract

Incorporating expert knowledge at the time machine learning models are trained holds promise for producing models that are easier to interpret. The main objectives of this study were to use a feature engineering approach to incorporate clinical expert knowledge prior to applying machine learning techniques, and to assess the impact of the approach on model complexity and performance. Four machine learning models were trained to predict mortality in a severe asthma case study. Experiments to select fewer input features based on a discriminative score showed low to moderate precision for discovering clinically meaningful triplets, indicating that discriminative score alone cannot replace clinical input. When compared to baseline machine learning models, we found a decrease in model complexity with use of fewer features informed by discriminative score and filtering of laboratory features with clinical input. We also found a small difference in performance for the mortality prediction task when comparing baseline ML models to models that used filtered features. Encoding demographic and triplet information in ML models with filtered features appeared to show performance improvements from the baseline. These findings indicated that the use of filtered features may reduce model complexity with little impact on performance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7210/7179831/9a450fff34eb/pone.0231300.g001.jpg
