Wang Weinan, Mohseni Pedram, Kilgore Kevin L, Najafizadeh Laleh
Integrated Systems and NeuroImaging Laboratory, Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ, United States.
Department of Electrical, Computer, and Systems Engineering, Case Western Reserve University, Cleveland, OH, United States.
Front Digit Health. 2023 Feb 8;4:1090854. doi: 10.3389/fdgth.2022.1090854. eCollection 2022.
There has been growing interest in developing cuff-less blood pressure (BP) estimation methods to enable continuous BP monitoring from electrocardiogram (ECG) and/or photoplethysmogram (PPG) signals. The majority of these methods have been evaluated using publicly available datasets; however, there are significant discrepancies across studies in the size, the number of subjects, and the pre-processing steps applied to the data that are eventually used for training and testing the models. Such differences make performance comparisons across models largely unfair and mask the generalization capability of the various BP estimation methods. To fill this gap, this paper presents "PulseDB," the largest cleaned dataset to date for benchmarking BP estimation models, which also fulfills the requirements of standardized testing protocols. PulseDB contains 1) 5,245,454 high-quality 10-s segments of ECG, PPG, and arterial BP (ABP) waveforms from 5,361 subjects, retrieved from the MIMIC-III waveform database matched subset and the VitalDB database; 2) subjects' identification and demographic information, which can be used as additional input features to improve the performance of BP estimation models or to evaluate the generalizability of the models to data from unseen subjects; and 3) positions of the characteristic points of the ECG/PPG signals, making PulseDB directly usable for training deep learning models with minimal data pre-processing. Additionally, using this dataset, we conduct the first study to provide insights into the performance gap between calibration-based and calibration-free testing approaches for evaluating the generalizability of BP estimation models. We expect PulseDB, as a user-friendly, large, comprehensive, and multi-functional dataset, to serve as a reliable source for the evaluation of cuff-less BP estimation methods.
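The distinction between calibration-free and calibration-based testing can be illustrated with a minimal sketch. It assumes that "calibration-free" means test subjects are entirely absent from training, and that "calibration-based" means each test subject also contributes some segments to training. The abstract does not spell out these definitions, so both are assumptions. The function names and the toy subject labels are hypothetical and are not part of PulseDB or its tooling.

```python
import random
from collections import defaultdict


def calibration_free_split(subject_ids, test_fraction=0.2, seed=0):
    """Subject-disjoint split (assumed meaning of 'calibration-free'):
    no subject in the test set contributes any segment to training."""
    rng = random.Random(seed)
    subjects = sorted(set(subject_ids))
    rng.shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])
    train_idx = [i for i, s in enumerate(subject_ids) if s not in test_subjects]
    test_idx = [i for i, s in enumerate(subject_ids) if s in test_subjects]
    return train_idx, test_idx


def calibration_based_split(subject_ids, test_fraction=0.2, seed=0):
    """Within-subject split (assumed meaning of 'calibration-based'):
    each subject's segments are divided between training and test.
    A subject with a single segment ends up in the test set only."""
    rng = random.Random(seed)
    by_subject = defaultdict(list)
    for i, s in enumerate(subject_ids):
        by_subject[s].append(i)
    train_idx, test_idx = [], []
    for s in sorted(by_subject):
        idx = by_subject[s][:]
        rng.shuffle(idx)
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Toy example: 5 hypothetical subjects with 5 segments each.
subject_ids = [f"subj{k}" for k in range(5) for _ in range(5)]

free_train, free_test = calibration_free_split(subject_ids)
based_train, based_test = calibration_based_split(subject_ids)

free_train_subjects = {subject_ids[i] for i in free_train}
free_test_subjects = {subject_ids[i] for i in free_test}
based_train_subjects = {subject_ids[i] for i in based_train}
based_test_subjects = {subject_ids[i] for i in based_test}
```

In the first split the training and test subject sets do not overlap, so test error reflects performance on unseen subjects. In the second, every test subject has also been seen during training, which typically yields more optimistic error figures.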