College of Information and Computer, Taiyuan University of Technology, Taiyuan, Shanxi, China.
Department of Clinical Laboratory, Affiliated People's Hospital of Shanxi Medical University, Shanxi Provincial People's Hospital, Taiyuan, Shanxi, China.
Sci Rep. 2023 Mar 30;13(1):5167. doi: 10.1038/s41598-023-32301-4.
To address the long turnaround time, high cost, invasive sampling damage, and tendency toward drug resistance associated with lung cancer gene detection, a reliable, non-invasive prognostic method is proposed. Under the guidance of weakly supervised learning, deep metric learning and graph clustering are used to learn higher-level abstract features from CT imaging features. Unlabeled data are dynamically updated through a k-nearest label update strategy: unlabeled data are converted into weakly labeled data, which are then progressively updated into strongly labeled data, in order to optimize the clustering results and establish a classification model for predicting new imaging subtypes of lung cancer. Five imaging subtypes are confirmed on a lung cancer dataset containing CT, clinical, and genetic information downloaded from the TCIA lung cancer database. The new model achieves high accuracy for subtype classification (ACC = 0.9793), and CT sequence images, gene expression, DNA methylation, and gene mutation data from the cooperating hospital in Shanxi Province demonstrate the biomedical value of the method. The proposed method can also comprehensively evaluate intratumoral heterogeneity based on the correlation between the final lung CT imaging features and specific molecular subtypes.
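The k-nearest label update idea can be illustrated with a small sketch. This is not the authors' implementation: the function name `knn_label_update`, the `UNLABELED` marker, the Euclidean distance, the majority-vote rule, and the parameters `k`, `min_agreement`, and `max_rounds` are all assumptions made for illustration. The sketch only shows how unlabeled samples in a learned feature space might receive weak labels from their nearest labeled neighbours, round by round, so that newly labeled samples can inform later rounds.

```python
import numpy as np

UNLABELED = -1  # hypothetical marker for samples without a label


def knn_label_update(features, labels, k=3, min_agreement=0.6, max_rounds=10):
    """Assign weak labels to unlabeled samples from their k nearest labeled neighbours.

    Illustrative only: distance metric, voting rule, and thresholds are assumptions.
    """
    labels = labels.copy()
    for _ in range(max_rounds):
        labeled_idx = np.flatnonzero(labels != UNLABELED)
        unlabeled_idx = np.flatnonzero(labels == UNLABELED)
        if unlabeled_idx.size == 0 or labeled_idx.size == 0:
            break

        # Euclidean distances from every unlabeled sample to every labeled sample.
        diffs = features[unlabeled_idx][:, None, :] - features[labeled_idx][None, :, :]
        dists = np.linalg.norm(diffs, axis=2)

        kk = min(k, labeled_idx.size)
        nearest = np.argsort(dists, axis=1)[:, :kk]

        new_labels = labels.copy()
        changed = False
        for row, sample in enumerate(unlabeled_idx):
            neighbour_labels = labels[labeled_idx[nearest[row]]]
            values, counts = np.unique(neighbour_labels, return_counts=True)
            best = counts.argmax()
            # Accept the weak label only if enough neighbours agree on it.
            if counts[best] / kk >= min_agreement:
                new_labels[sample] = values[best]
                changed = True

        labels = new_labels
        if not changed:
            break
    return labels


# Toy feature vectors: two well-separated groups plus two unlabeled points.
features = np.array(
    [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10], [0.5, 0.5], [10.5, 10.5]],
    dtype=float,
)
labels = np.array([0, 0, 0, 1, 1, 1, UNLABELED, UNLABELED])
updated = knn_label_update(features, labels)
print(updated)
```

In a full pipeline of the kind the abstract describes, the features would come from the deep metric learning model rather than from raw coordinates, and the updated labels would feed back into training and clustering; that loop is omitted here.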