School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, 61005, Republic of Korea.
Artificial Intelligence Graduate School, Gwangju Institute of Science and Technology, Gwangju, 61005, Republic of Korea.
Sci Rep. 2023 Oct 24;13(1):18178. doi: 10.1038/s41598-023-45467-8.
Accurately identifying patients with complex diseases, such as Alzheimer's disease (AD), and predicting disease stages, such as early- versus late-stage cancer, is challenging owing to substantial variability among patients and the limited availability of clinical data. Deep metric learning has emerged as a promising approach for addressing these challenges by improving data representation. In this study, we propose a joint triplet loss model with a semi-hard constraint (JTSC) to learn data representations from a small number of samples. JTSC strictly selects semi-hard samples by switching the anchor and positive samples during triplet embedding learning, and it combines a triplet loss function with an angular loss function. Our results indicate that JTSC significantly increases the number of appropriately represented samples during training when applied to gene expression data for AD prediction and cancer stage prediction tasks. Furthermore, we demonstrate that using the embedding vector from JTSC as input to classifiers for AD and cancer stage prediction significantly improves classification performance by extracting more accurate features. In conclusion, we show that feature embedding through JTSC can aid classification when the number of samples is small relative to the number of features.
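The selection and loss terms named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the code assumes the commonly used semi-hard condition on squared Euclidean distances, d(a,p) < d(a,n) < d(a,p) + margin, interprets "switching anchors and positive samples" as requiring that condition to hold with either sample as the anchor, and uses a common form of the angular loss, max(0, ||a - p||^2 - 4 tan^2(alpha) ||n - c||^2) with c = (a + p) / 2. All function names and the parameters `margin`, `alpha_deg`, and `weight` are hypothetical.

```python
import numpy as np


def sq_dist(x, y):
    """Squared Euclidean distance between two embedding vectors."""
    return float(np.sum((x - y) ** 2))


def is_semi_hard(anchor, positive, negative, margin):
    """Semi-hard condition: the negative is farther than the positive,
    but still inside the margin (assumed definition)."""
    d_ap = sq_dist(anchor, positive)
    d_an = sq_dist(anchor, negative)
    return d_ap < d_an < d_ap + margin


def is_semi_hard_both_ways(a, p, n, margin):
    """Assumed reading of 'switching anchors and positive samples':
    the triplet must be semi-hard with either sample used as the anchor."""
    return is_semi_hard(a, p, n, margin) and is_semi_hard(p, a, n, margin)


def triplet_loss(a, p, n, margin):
    """Standard triplet hinge loss on squared distances."""
    return max(0.0, sq_dist(a, p) - sq_dist(a, n) + margin)


def angular_loss(a, p, n, alpha_deg):
    """A common form of the angular loss; c is the anchor-positive midpoint."""
    c = (a + p) / 2.0
    tan_sq = float(np.tan(np.radians(alpha_deg)) ** 2)
    return max(0.0, sq_dist(a, p) - 4.0 * tan_sq * sq_dist(n, c))


def joint_loss(a, p, n, margin=1.0, alpha_deg=45.0, weight=1.0):
    """Hypothetical combination: triplet loss plus a weighted angular loss."""
    return triplet_loss(a, p, n, margin) + weight * angular_loss(a, p, n, alpha_deg)
```

Under these assumptions, the two-way check is stricter than the one-way check: a negative that is semi-hard relative to the anchor can still lie too close to the positive, in which case the triplet is rejected.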