Modern Learning from Big Data in Critical Care: Primum Non Nocere.

Affiliations

Department of Public Health, Erasmus University Medical Center, Doctor Molewaterplein 40, 3015 GD, Rotterdam, Netherlands.

Department of Biomedical Data Sciences, Leiden University Medical Center, Leiden, Netherlands.

Publication information

Neurocrit Care. 2022 Aug;37(Suppl 2):174-184. doi: 10.1007/s12028-022-01510-6. Epub 2022 May 5.

Abstract

Large and complex data sets are increasingly available for research in critical care. To analyze these data, researchers use techniques commonly referred to as statistical learning or machine learning (ML). The latter is known for large successes in the field of diagnostics, for example, by identification of radiological anomalies. In other research areas, such as clustering and prediction studies, there is more discussion regarding the benefit and efficiency of ML techniques compared with statistical learning. In this viewpoint, we aim to explain commonly used statistical learning and ML techniques and provide guidance for responsible use in the case of clustering and prediction questions in critical care. Clustering studies have been increasingly popular in critical care research, aiming to inform how patients can be characterized, classified, or treated differently. An important challenge for clustering studies is to ensure and assess generalizability. This limits the application of findings in these studies toward individual patients. In the case of predictive questions, there is much discussion as to what algorithm should be used to most accurately predict outcome. Aspects that determine usefulness of ML, compared with statistical techniques, include the volume of the data, the dimensionality of the preferred model, and the extent of missing data. There are areas in which modern ML methods may be preferred. However, efforts should be made to implement statistical frameworks (e.g., for dealing with missing data or measurement error, both omnipresent in clinical data) in ML methods. To conclude, there are important opportunities but also pitfalls to consider when performing clustering or predictive studies with ML techniques. We advocate careful valuation of new data-driven findings. 
More interaction is needed between the engineer mindset of experts in ML methods, the insight in bias of epidemiologists, and the probabilistic thinking of statisticians to extract as much information and knowledge from data as possible, while avoiding harm.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c210/9343283/4255f8fdb635/12028_2022_1510_Fig1_HTML.jpg
