Dimitris Spathis, Ignacio Perez-Pozuelo, Laia Marques-Fernandez, Cecilia Mascolo
Department of Computer Science and Technology, University of Cambridge, CB3 0FD Cambridge, UK.
MRC Epidemiology Unit, School of Clinical Medicine, University of Cambridge, CB2 0SL Cambridge, UK.
Patterns (N Y). 2022 Feb 11;3(2):100410. doi: 10.1016/j.patter.2021.100410.
Medicine is undergoing an unprecedented digital transformation, as massive amounts of health data are being produced, gathered, and curated, ranging from in-hospital data (e.g., from the intensive care unit [ICU]) to person-generated data (e.g., from wearables). Annotating all of these data to train deep learning models for pattern recognition is impractical. Here, we discuss some exciting recent results of self-supervised learning (SSL) applied to high-resolution health signals. These examples leverage unlabeled data to learn meaningful representations that can generalize to situations where ground truth is inadequate or simply infeasible to collect because of the high burden or associated costs. The most prominent bottleneck of deep learning today is access to labeled, carefully curated datasets, and self-supervision on health signals opens up new possibilities to eliminate data silos through general-purpose models that can transfer to low-resource environments and tasks.