meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles.

Author affiliations

Department of Computer Science, Virginia Tech, Blacksburg, USA.

Division of Computer Science, Sookmyung Women's University, Seoul, Republic of Korea.

Publication information

BMC Bioinformatics. 2023 Apr 26;24(1):168. doi: 10.1186/s12859-023-05272-6.

Abstract

BACKGROUND

Identifying the cancer subtype plays a crucial role in providing an accurate diagnosis and proper treatment, and thereby in improving the clinical outcomes of patients. Recent studies have shown that DNA methylation is one of the key factors in tumorigenesis and tumor growth, and that DNA methylation signatures have the potential to serve as cancer subtype-specific markers. However, because DNA methylome data are high-dimensional and few cancer samples carry subtype information, no cancer subtype classification method utilizing DNA methylome datasets has been proposed to date.

RESULTS

In this paper, we present meth-SemiCancer, a semi-supervised cancer subtype classification framework based on DNA methylation profiles. The proposed model was first pre-trained on the methylation datasets with cancer subtype labels. Next, meth-SemiCancer generated pseudo-subtypes for the cancer datasets without subtype information, based on the model's predictions. Finally, the model was fine-tuned using both the labeled and unlabeled datasets.
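The three stages described above (pre-train on labeled samples, assign pseudo-subtypes to unlabeled samples, refit on both) can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid classifier stands in for the neural network, and the function names, the `sharpness` scaling, and the confidence `threshold` used to filter pseudo-labels are all assumptions introduced for illustration only.

```python
import math


def fit_centroids(X, y, n_classes):
    """Return one mean vector per class (a nearest-centroid stand-in model).

    Assumes every class appears at least once in y.
    """
    dim = len(X[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for row, label in zip(X, y):
        counts[label] += 1
        for j, value in enumerate(row):
            sums[label][j] += value
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def predict_proba(X, centroids, sharpness=10.0):
    """Softmax over scaled negative squared distances to each centroid."""
    probs = []
    for row in X:
        scores = [
            -sharpness * sum((a - b) ** 2 for a, b in zip(row, c))
            for c in centroids
        ]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        probs.append([e / total for e in exps])
    return probs


def pseudo_label_train(X_lab, y_lab, X_unlab, n_classes, threshold=0.8):
    """Pre-train, pseudo-label, then refit on labeled plus pseudo-labeled data."""
    # Stage 1: "pre-train" on the labeled samples only.
    centroids = fit_centroids(X_lab, y_lab, n_classes)

    # Stage 2: predict a pseudo-subtype for every unlabeled sample.
    probs = predict_proba(X_unlab, centroids)
    pseudo = [max(range(n_classes), key=p.__getitem__) for p in probs]
    # Hypothetical filter: keep only confident pseudo-labels.
    keep = [i for i, p in enumerate(probs) if max(p) >= threshold]

    # Stage 3: "fine-tune" by refitting on labeled and kept unlabeled samples.
    X_all = X_lab + [X_unlab[i] for i in keep]
    y_all = y_lab + [pseudo[i] for i in keep]
    return fit_centroids(X_all, y_all, n_classes), pseudo, keep


# Toy beta values (methylation levels in [0, 1]) for two made-up subtypes.
X_lab = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
y_lab = [0, 0, 1, 1]
X_unlab = [[0.15, 0.15], [0.85, 0.85], [0.5, 0.55]]

model, pseudo, keep = pseudo_label_train(X_lab, y_lab, X_unlab, n_classes=2)
print("pseudo-subtypes:", pseudo)
print("unlabeled samples used for refitting:", keep)
```

In this toy run the ambiguous third sample lies between the two centroids, so the assumed confidence filter leaves it out of the refit. Whether and how meth-SemiCancer filters pseudo-subtypes is not stated in the abstract.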

CONCLUSIONS

In a performance comparison with standard machine learning-based classifiers, meth-SemiCancer achieved the highest average F1-score and Matthews correlation coefficient. Fine-tuning the model on unlabeled patient samples with appropriate pseudo-subtypes encouraged meth-SemiCancer to generalize better than the supervised neural network-based subtype classification method. meth-SemiCancer is publicly available at https://github.com/cbi-bioinfo/meth-SemiCancer .
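For readers unfamiliar with the two reported metrics, the sketch below computes them from true and predicted subtype labels. It assumes the F1-score is macro-averaged over classes (the abstract does not say how the average is taken) and uses the standard multiclass generalization of the Matthews correlation coefficient. The function names and the toy labels are illustrative.

```python
import math
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for k in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == k and p == k)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != k and p == k)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == k and p != k)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multiclass Matthews correlation coefficient from label counts."""
    s = len(y_true)                                      # total samples
    c = sum(1 for t, p in zip(y_true, y_pred) if t == p)  # correct predictions
    t_k = Counter(y_true)                                # true count per class
    p_k = Counter(y_pred)                                # predicted count per class
    labels = set(t_k) | set(p_k)
    numerator = c * s - sum(t_k[k] * p_k[k] for k in labels)
    denom = math.sqrt(
        (s * s - sum(v * v for v in p_k.values()))
        * (s * s - sum(v * v for v in t_k.values()))
    )
    return numerator / denom if denom else 0.0


# Made-up labels for three subtypes; one sample of subtype 1 is misclassified.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print("macro F1:", round(macro_f1(y_true, y_pred), 4))
print("MCC:", round(multiclass_mcc(y_true, y_pred), 4))
```

Both metrics are less sensitive to class imbalance than plain accuracy, which matters when some subtypes have few labeled samples.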
