Computational Simulation of Virtual Patients Reduces Dataset Bias and Improves Machine Learning-Based Detection of ARDS from Noisy Heterogeneous ICU Datasets.

Authors

Sharafutdinov Konstantin, Fritsch Sebastian Johannes, Iravani Mina, Ghalati Pejman Farhadi, Saffaran Sina, Bates Declan G, Hardman Jonathan G, Polzin Richard, Mayer Hannah, Marx Gernot, Bickenbach Johannes, Schuppert Andreas

Affiliations

Institute for Computational Biomedicine, RWTH Aachen University, 52062 Aachen, Germany.

Joint Research Center for Computational Biomedicine, RWTH Aachen University, 52062 Aachen, Germany.

Publication information

IEEE Open J Eng Med Biol. 2023 Feb 8;5:611-620. doi: 10.1109/OJEMB.2023.3243190. eCollection 2024.

Abstract

Machine learning (ML) technologies that leverage large-scale patient data are promising tools for predicting disease evolution in individual patients. However, the limited generalizability of ML models developed on single-center datasets, and their unproven performance in real-world settings, remain significant constraints on their widespread adoption in clinical practice. One approach to tackling this issue is to base learning on large multi-center datasets. However, such heterogeneous datasets can introduce further biases driven by data origin, as data structures and patient cohorts may differ between hospitals. In this paper, we demonstrate how mechanistic virtual patient (VP) modeling can be used to capture specific features of patients' states and dynamics while reducing biases introduced by heterogeneous datasets. We show how VP modeling can be used for data augmentation by identifying individualized model parameters that approximate the disease states of patients with suspected acute respiratory distress syndrome (ARDS) from observational data of mixed origin. We compare the results of an unsupervised learning method (clustering) in two cases: learning based on the original patient data, and learning based on data derived by matching the VP model to real patient data. Clustering on the model-derived data yielded more robust cluster configurations. VP model-based clustering also reduced biases introduced by the inclusion of data from different hospitals and discovered an additional cluster with significant ARDS enrichment. Our results indicate that mechanistic VP modeling can significantly reduce biases introduced by learning from heterogeneous datasets and can improve the discovery of patient cohorts driven exclusively by medical conditions.
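The abstract does not say how hospital-driven bias in the clusters was quantified. As a rough illustration only, one common option (an assumption here, not the authors' stated method) is to measure the association between cluster labels and hospital of origin with Cramér's V, where values near 1 mean the clusters mostly reproduce the hospital split and values near 0 mean they are independent of it. The function name `cramers_v` and all labels below are hypothetical toy values, not data from the study.

```python
from collections import Counter
from math import sqrt


def cramers_v(labels_a, labels_b):
    """Cramér's V between two categorical label sequences.

    Returns 0.0 for independent labels and 1.0 for perfectly associated ones.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    k = min(len(count_a), len(count_b))
    if n == 0 or k < 2:
        # Association is undefined with fewer than two categories; report 0.
        return 0.0
    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for a, n_a in count_a.items():
        for b, n_b in count_b.items():
            expected = n_a * n_b / n
            observed = joint.get((a, b), 0)
            chi2 += (observed - expected) ** 2 / expected
    return sqrt(chi2 / (n * (k - 1)))


# Toy, made-up labels for eight patients from two hypothetical hospitals.
hospital = ["A", "A", "A", "A", "B", "B", "B", "B"]
clusters_raw = [0, 0, 0, 0, 1, 1, 1, 1]    # clusters mirror the hospital split
clusters_model = [0, 1, 0, 1, 0, 1, 0, 1]  # clusters independent of hospital

v_raw = cramers_v(clusters_raw, hospital)
v_model = cramers_v(clusters_model, hospital)
print(f"raw-data clustering:      V = {v_raw:.2f}")
print(f"model-derived clustering: V = {v_model:.2f}")
```

In this toy setup, a lower value for the model-derived clustering would correspond to the kind of reduced data-origin bias the abstract describes; on real data the statistic would need to be read alongside cluster sizes and a significance test.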

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/811e/11342939/95e2feb42c3c/shara1-3243190.jpg
