Albuquerque Tomé, Cruz Ricardo, Cardoso Jaime S
Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal.
Faculty of Engineering of the University of Porto, Porto, Portugal.
PeerJ Comput Sci. 2021 Apr 23;7:e457. doi: 10.7717/peerj-cs.457. eCollection 2021.
Cervical cancer is the fourth leading cause of cancer-related deaths in women, especially in low- to middle-income countries. Despite a surge of recent scientific advances, there is no fully effective treatment, especially when the disease is diagnosed at an advanced stage. Screening tests, such as cytology or colposcopy, have been responsible for a substantial decrease in cervical cancer deaths. Automatic cervical cancer screening via Pap smear is a highly valuable cell imaging-based detection tool, in which cells must be classified into one of several ordinal classes, ranging from abnormal to normal. Current approaches to ordinal inference for neural networks either do not take sufficient advantage of the ordinal structure of the problem or are too rigid. A non-parametric ordinal loss for neural networks is proposed that encourages the output probabilities to follow a unimodal distribution. This is done by imposing a set of different constraints over all pairs of consecutive labels, which allows for a more flexible decision boundary than approaches from the literature. The proposed loss is compared against other methods from the literature using a range of deep architectures. A first conclusion is the benefit of non-parametric ordinal losses over parametric losses in cervical cancer risk prediction. Additionally, the proposed loss is found to be the top performer in several cases. The best-performing model achieves an accuracy of 75.6% for seven classes and 81.3% for four classes.
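The idea of promoting unimodal output probabilities through constraints on consecutive label pairs can be illustrated with a minimal sketch. It assumes a hinge-style penalty added to cross-entropy: probabilities should not decrease when moving toward the true class and should not increase when moving away from it. The function names and the `weight` and `margin` parameters are hypothetical choices made for illustration, and the sketch is not claimed to reproduce the exact loss proposed in the paper.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def unimodal_penalty(probs, target, margin=0.0):
    """Hinge penalty over all pairs of consecutive labels.

    Left of the true class, probabilities should rise toward it;
    from the true class onward, they should fall. `margin` is an
    assumed slack parameter, not a value taken from the paper.
    """
    penalty = 0.0
    for k in range(len(probs) - 1):
        if k < target:
            # Pair lies left of the true class: want probs[k+1] >= probs[k] + margin.
            penalty += max(0.0, margin + probs[k] - probs[k + 1])
        else:
            # Pair starts at or right of the true class: want probs[k] >= probs[k+1] + margin.
            penalty += max(0.0, margin + probs[k + 1] - probs[k])
    return penalty


def ordinal_loss(logits, target, weight=1.0, margin=0.0):
    """Cross-entropy plus the weighted unimodality penalty (illustrative only)."""
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    return cross_entropy + weight * unimodal_penalty(probs, target, margin)
```

With `margin=0.0`, a distribution that peaks at the true class and decays on both sides incurs no penalty, while a second peak elsewhere is penalized in proportion to how far it violates the ordering. Because the constraint acts only on consecutive pairs rather than forcing the outputs into a fixed parametric shape, the network keeps some freedom in how sharply the probabilities fall off.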