Automated location of orofacial landmarks to characterize airway morphology in anaesthesia via deep convolutional neural networks.

Affiliations

Basque Center for Applied Mathematics (BCAM) - Bilbao, Basque Country, Spain.

Basque Center for Applied Mathematics (BCAM) - Bilbao, Basque Country, Spain; IE University, School of Science and Technology - Madrid, Madrid, Spain.

Publication information

Comput Methods Programs Biomed. 2023 Apr;232:107428. doi: 10.1016/j.cmpb.2023.107428. Epub 2023 Feb 25.

Abstract

BACKGROUND

Reliable anticipation of a difficult airway may notably enhance safety during anaesthesia. In current practice, clinicians screen patients at the bedside using manual measurements of their morphology.

OBJECTIVE

To develop and evaluate algorithms for the automated extraction of orofacial landmarks, which characterize airway morphology.

METHODS

We defined 27 frontal + 13 lateral landmarks. We collected n=317 pairs of pre-surgery photos from patients undergoing general anaesthesia (140 females, 177 males). As the ground-truth reference for supervised learning, landmarks were independently annotated by two anaesthesiologists. We trained two ad hoc deep convolutional neural network architectures based on InceptionResNetV2 (IRNet) and MobileNetV2 (MNet) to predict, simultaneously: (a) whether each landmark is visible or not (occluded, out of frame); (b) its 2D coordinates (x, y). We implemented successive stages of transfer learning, combined with data augmentation. We added custom top layers to these networks, whose weights were fully tuned for our application. Performance in landmark extraction was evaluated by 10-fold cross-validation (CV) and compared against 5 state-of-the-art deformable models.
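The dual-output architecture described above can be sketched as follows: a pretrained convolutional backbone with a custom top that jointly predicts per-landmark visibility and normalized (x, y) coordinates. This is a hypothetical illustration in Keras; the layer sizes, dropout rate, and loss weights are assumptions, not the authors' exact configuration, and in practice the backbone would be loaded with ImageNet weights (`weights="imagenet"`) for the transfer-learning stages.

```python
# Hedged sketch of a dual-output landmark network in the spirit of the
# paper's IRNet model: InceptionResNetV2 backbone + custom top layers.
# All hyperparameters here are illustrative assumptions.
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import InceptionResNetV2

N_LANDMARKS = 27  # frontal view; the lateral model would use 13

# weights=None keeps this sketch self-contained; for actual transfer
# learning one would pass weights="imagenet" and freeze the backbone
# during the first training stage.
backbone = InceptionResNetV2(include_top=False, weights=None,
                             input_shape=(299, 299, 3), pooling="avg")
backbone.trainable = False  # first stage: train only the custom top

x = layers.Dense(512, activation="relu")(backbone.output)
x = layers.Dropout(0.3)(x)
# Output (a): per-landmark visibility (visible vs occluded/out of frame)
visibility = layers.Dense(N_LANDMARKS, activation="sigmoid",
                          name="visibility")(x)
# Output (b): normalized (x, y) coordinates for each landmark
coords = layers.Dense(2 * N_LANDMARKS, activation="sigmoid",
                      name="coords")(x)

model = Model(backbone.input, [visibility, coords])
model.compile(optimizer="adam",
              loss={"visibility": "binary_crossentropy", "coords": "mse"},
              loss_weights={"visibility": 1.0, "coords": 1.0})
```

Later transfer-learning stages would progressively unfreeze backbone layers and fine-tune with a lower learning rate, combined with data augmentation as the abstract describes.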

RESULTS

With the annotators' consensus as the 'gold standard', our IRNet-based network performed comparably to humans in the frontal view: median CV loss L=1.277·10⁻², inter-quartile range (IQR) [1.001, 1.660]; versus median 1.360, IQR [1.172, 1.651], and median 1.352, IQR [1.172, 1.619], for each annotator against consensus, respectively. MNet yielded slightly worse results: median 1.471, IQR [1.139, 1.982]. In the lateral view, both networks attained performances statistically poorer than humans: median CV loss L=2.141·10⁻², IQR [1.676, 2.915], and median 2.611, IQR [1.898, 3.535], respectively; versus median 1.507, IQR [1.188, 1.988], and median 1.442, IQR [1.147, 2.010] for the two annotators. However, standardized effect sizes in CV loss were small: 0.0322 and 0.0235 (non-significant) for IRNet, and 0.1431 and 0.1518 (p<0.05) for MNet; therefore quantitatively similar to humans. The best-performing state-of-the-art model (a deformable regularized Supervised Descent Method, SDM) behaved comparably to our DCNNs in the frontal scenario, but markedly worse in the lateral view.
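The kind of comparison reported above can be illustrated with a short statistical sketch: per-image CV losses of a network are compared against an annotator's losses with a Mann-Whitney U test, and a standardized effect size r = |Z|/√N is derived from the normal approximation. The data below are synthetic and the exact test and effect-size definition used in the paper may differ; this only shows the general technique.

```python
# Hedged sketch: comparing two loss distributions (model vs annotator)
# with a rank test and a standardized effect size. Synthetic data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Synthetic per-image CV losses for n=317 images (illustrative only)
model_loss = rng.lognormal(mean=0.30, sigma=0.3, size=317)
human_loss = rng.lognormal(mean=0.25, sigma=0.3, size=317)

u, p = stats.mannwhitneyu(model_loss, human_loss, alternative="two-sided")
n1, n2 = len(model_loss), len(human_loss)
# Normal approximation to the U statistic, then r = |Z| / sqrt(N)
z = (u - n1 * n2 / 2) / np.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
r = abs(z) / np.sqrt(n1 + n2)

print(f"median model loss: {np.median(model_loss):.3f}, "
      f"median human loss: {np.median(human_loss):.3f}")
print(f"p = {p:.4f}, standardized effect size r = {r:.4f}")
```

A small r (as in the abstract, on the order of 0.03 to 0.15) indicates that even a statistically detectable difference between network and human losses is of little practical magnitude.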

CONCLUSIONS

We successfully trained two DCNN models for the recognition of 27 + 13 orofacial landmarks pertaining to the airway. Using transfer learning and data augmentation, they were able to generalize without overfitting, reaching expert-like performance in CV. Our IRNet-based methodology achieved satisfactory identification and location of landmarks, particularly in the frontal view, at the level of anaesthesiologists. In the lateral view its performance declined, although with a non-significant effect size. Independent authors have also reported lower lateral performance, as certain landmarks may not be clearly salient points, even to a trained human eye.

