Ajirak Marzieh, Heiselman Cassandra, Quirk J Gerald, Djurić Petar M
Department of Electrical and Computer Engineering, Stony Brook University, Stony Brook, NY 11794, USA.
Department of Obstetrics, Gynecology and Reproductive Medicine, Stony Brook University, Stony Brook, NY 11794, USA.
Proc IEEE Int Conf Acoust Speech Signal Process. 2022 May;2022:1316-1320. doi: 10.1109/icassp43922.2022.9746503. Epub 2022 Apr 27.
During childbirth, fetal distress caused by hypoxia can lead to various abnormalities. Cardiotocography (CTG), the continuous recording of the fetal heart rate (FHR) and uterine contractions (UC), is routinely used to classify fetuses as hypoxic or non-hypoxic. In practice, the data are highly imbalanced, with hypoxic fetuses significantly underrepresented. We propose to address this problem with boost ensemble learning, in which learning is guided by the distribution of the classification error over the dataset. We iteratively select the most informative majority-class samples according to this distribution. In addition to addressing the class imbalance problem, we experimented with features that are not commonly used in obstetrics. We extracted a large number of statistical features from fetal heart tracings and uterine activity signals and used only the most informative ones. For classification, we implemented several methods: Random Forest, AdaBoost, k-Nearest Neighbors, Support Vector Machine, and Decision Trees. The paper compares the performance of these methods on fetal heart rate tracings available from a public database. Our results show that most of the applied methods improved their performance considerably when the boost ensemble was used.
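The iterative selection of majority-class samples can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every detail in it is an assumption made for illustration. The function names (`fit_ensemble`, `predict_ensemble`, `weighted_subset`) are hypothetical, a nearest-centroid classifier stands in for the base learners named in the abstract, and the weight update (add one to a majority sample's weight each time the current ensemble misclassifies it) is only one plausible way to turn the classification error into a sampling distribution. Labels are assumed to be 1 for the minority (hypoxic) class and 0 for the majority class.

```python
import random


def fit_centroids(X, y):
    """Stand-in base learner: store the mean feature vector of each class."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids


def predict_one(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def predict_ensemble(models, x):
    """Majority vote over the base learners; ties go to the majority class 0."""
    votes = sum(predict_one(m, x) for m in models)
    return 1 if 2 * votes > len(models) else 0


def weighted_subset(indices, weights, k, rng):
    """Draw k indices without replacement, favouring large weights.

    Uses random keys u ** (1 / w) (Efraimidis-Spirakis weighted sampling).
    """
    ranked = sorted(indices,
                    key=lambda i: rng.random() ** (1.0 / weights[i]),
                    reverse=True)
    return ranked[:k]


def fit_ensemble(X, y, n_rounds=5, seed=0):
    """Train n_rounds base learners on balanced, error-guided subsets."""
    rng = random.Random(seed)
    minority = [i for i, t in enumerate(y) if t == 1]
    majority = [i for i, t in enumerate(y) if t == 0]
    weights = {i: 1.0 for i in majority}  # start from a uniform distribution
    models = []
    for _ in range(n_rounds):
        # Balanced training set: all minority samples plus an equally sized,
        # weight-driven draw from the majority class.
        chosen = weighted_subset(majority, weights, len(minority), rng)
        train = minority + chosen
        models.append(fit_centroids([X[i] for i in train],
                                    [y[i] for i in train]))
        # Assumed update rule: majority samples the current ensemble gets
        # wrong become more likely to be selected in the next round.
        for i in majority:
            if predict_ensemble(models, X[i]) != y[i]:
                weights[i] += 1.0
    return models


# Synthetic, imbalanced toy data (90 majority vs. 10 minority samples).
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 0.5), data_rng.gauss(0, 0.5)] for _ in range(90)]
X += [[data_rng.gauss(3, 0.5), data_rng.gauss(3, 0.5)] for _ in range(10)]
y = [0] * 90 + [1] * 10
models = fit_ensemble(X, y, n_rounds=5)
```

Each round trains on a balanced subset, so no single base learner is dominated by the majority class, while the weights steer later rounds toward the majority samples that earlier rounds handled poorly. Any classifier with fit and predict steps could replace the nearest-centroid stand-in.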