IEEE Trans Vis Comput Graph. 2023 Aug;29(8):3617-3629. doi: 10.1109/TVCG.2022.3166159. Epub 2023 Jun 29.
In this article, we present OrthoAligner, a novel method to predict the visual outcome of orthodontic treatment in a portrait image. Unlike the state-of-the-art method, which relies on a 3D teeth model obtained from dental scanning, our method generates realistic alignment effects in images without requiring additional 3D information as input, making our system readily available to average users. The key to our approach is to exploit the 3D geometric information encoded in an unsupervised generative model, i.e., StyleGAN in this article. Instead of directly conducting translation in the image space, we embed the teeth region extracted from a given portrait into the latent space of the StyleGAN generator and propose a novel latent editing method to discover a geometrically meaningful editing path that yields the alignment process in the image space. To blend the edited mouth region with the original portrait image, we further introduce a BlendingNet to remove boundary artifacts and correct color inconsistency. We also extend our method to short video clips by propagating the alignment effects across neighboring frames. We evaluate our method on various orthodontic cases, compare it to the state-of-the-art and competitive baselines, and validate the effectiveness of each component.
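The pipeline described above (GAN inversion of the teeth region, traversal along an editing path in latent space, then compositing back into the portrait) can be sketched in simplified form. This is a minimal illustrative sketch, not the authors' implementation: the latent dimension, the alignment direction, and the naive alpha blending (the paper uses a learned BlendingNet instead) are all assumptions.

```python
import numpy as np

# Hypothetical sketch of a latent-editing pipeline in the spirit of
# OrthoAligner. All names and values here (latent_dim, the editing
# direction, the mask-based blend) are illustrative assumptions.

latent_dim = 512  # typical StyleGAN latent size (assumed)

def edit_latent(w, direction, t):
    """Move an inverted latent code w along an alignment direction.

    t in [0, 1] parameterizes the editing path: t=0 keeps the
    original teeth, t=1 applies the full alignment effect.
    """
    direction = direction / np.linalg.norm(direction)  # unit step
    return w + t * direction

def blend(edited_mouth, portrait, mask):
    """Naive per-pixel alpha blend of the edited mouth region into
    the portrait; the paper replaces this with a BlendingNet that
    also fixes boundary artifacts and color inconsistency."""
    return mask * edited_mouth + (1.0 - mask) * portrait

rng = np.random.default_rng(0)
w = rng.standard_normal(latent_dim)  # latent code from GAN inversion
d = rng.standard_normal(latent_dim)  # hypothetical alignment direction

# Sampling the path at several t values yields the progressive
# alignment sequence used for images and, frame-by-frame, for video.
path = [edit_latent(w, d, t) for t in np.linspace(0.0, 1.0, 5)]
```

Sampling intermediate values of `t` is also what makes the video extension natural: neighboring frames can share the same direction `d` while the inversion is propagated across frames.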