From the Division of Plastic Surgery, Keck School of Medicine, and Department of Computer Science, University of Southern California; Division of Plastic Surgery, Children's Hospital of Los Angeles; Operation Smile; and Department of Plastic and Reconstructive Surgery, Shriners Hospital for Children.
Plast Reconstr Surg. 2021 Jul 1;148(1):162-169. doi: 10.1097/PRS.0000000000008063.
Despite the wide range of cleft lip morphology, consistent scales to categorize preoperative severity do not exist. Machine learning has been used to increase accuracy and efficiency in detection and rating of multiple conditions, yet it has not been applied to cleft disease. The authors tested a machine learning approach to automatically detect and measure facial landmarks and assign severity grades using preoperative photographs.
Preoperative images were collected from 800 unilateral cleft lip patients, manually annotated for cleft-specific landmarks, and rated using a previously validated severity scale by eight expert reviewers. Five convolutional neural network models were trained for landmark detection and severity grade assignment. Mean squared error loss and Pearson correlation coefficient for cleft width ratio, nostril width ratio, and severity grade assignment were calculated.
All five models performed well in landmark detection and severity grade assignment, with the largest and most complex model, Residual Network, performing best (mean squared error, 24.41; cleft width ratio correlation, 0.943; nostril width ratio correlation, 0.879; severity correlation, 0.892). The mobile device-compatible network, MobileNet, also showed a high degree of accuracy (mean squared error, 36.66; cleft width ratio correlation, 0.901; nostril width ratio correlation, 0.705; severity correlation, 0.860).
Machine learning models demonstrate the ability to accurately measure facial features and assign severity grades according to validated scales. Such models hold promise for the creation of a simple, automated approach to classifying cleft lip morphology. Further potential exists for a mobile telephone-based application to provide real-time feedback to improve clinical decision making and patient counseling.