meth-SemiCancer: a cancer subtype classification framework via semi-supervised learning utilizing DNA methylation profiles.

Affiliations

Department of Computer Science, Virginia Tech, Blacksburg, USA.

Division of Computer Science, Sookmyung Women's University, Seoul, Republic of Korea.

Publication information

BMC Bioinformatics. 2023 Apr 26;24(1):168. doi: 10.1186/s12859-023-05272-6.

Abstract

BACKGROUND

Identifying the cancer subtype is crucial for providing an accurate diagnosis and proper treatment, and thereby improving patients' clinical outcomes. Recent studies have shown that DNA methylation is one of the key factors in tumorigenesis and tumor growth, and DNA methylation signatures have the potential to serve as cancer subtype-specific markers. However, because DNA methylome data are high-dimensional and few cancer samples come with subtype information, no cancer subtype classification method utilizing DNA methylome datasets has been proposed to date.

RESULTS

In this paper, we present meth-SemiCancer, a semi-supervised cancer subtype classification framework based on DNA methylation profiles. The proposed model was first pre-trained on the methylation datasets with cancer subtype labels. meth-SemiCancer then generated pseudo-subtypes for the cancer datasets without subtype information, based on the model's predictions. Finally, the model was fine-tuned using both the labeled and unlabeled datasets.
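The three steps described above (pre-train on labeled samples, assign pseudo-subtypes to unlabeled samples, fine-tune on both) can be illustrated with a minimal sketch. This is not the authors' implementation: a nearest-centroid classifier stands in for the neural network, "fine-tuning" is approximated by refitting on the combined data, and the confidence `threshold` used to keep a pseudo-subtype is an assumption that the abstract does not mention. All function names are hypothetical.

```python
import math
from collections import defaultdict


def fit_centroids(samples, labels):
    """Stand-in classifier: store one mean methylation vector per subtype."""
    sums, counts = {}, defaultdict(int)
    for x, y in zip(samples, labels):
        if y not in sums:
            sums[y] = [0.0] * len(x)
        sums[y] = [s + v for s, v in zip(sums[y], x)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict_proba(centroids, x):
    """Softmax over negative squared distances to each subtype centroid."""
    scores = {
        y: -sum((a - b) ** 2 for a, b in zip(x, c))
        for y, c in centroids.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    total = sum(exps.values())
    return {y: e / total for y, e in exps.items()}


def pseudo_label_fit(labeled_x, labeled_y, unlabeled_x, threshold=0.8):
    """Pre-train, pseudo-label, then refit on labeled plus pseudo-labeled data."""
    # Step 1: pre-train on samples that carry a subtype label.
    model = fit_centroids(labeled_x, labeled_y)

    # Step 2: assign pseudo-subtypes to unlabeled samples.
    # Keeping only confident predictions is an assumption of this sketch.
    pseudo_x, pseudo_y = [], []
    for x in unlabeled_x:
        proba = predict_proba(model, x)
        label = max(proba, key=proba.get)
        if proba[label] >= threshold:
            pseudo_x.append(x)
            pseudo_y.append(label)

    # Step 3: "fine-tune" (here: refit) on labeled and pseudo-labeled samples.
    model = fit_centroids(labeled_x + pseudo_x, labeled_y + pseudo_y)
    return model, pseudo_y


# Toy beta values in [0, 1]; real methylation profiles have far more features.
labeled_x = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2], [0.1, 0.2, 0.9], [0.2, 0.1, 0.8]]
labeled_y = ["A", "A", "B", "B"]
unlabeled_x = [[0.9, 0.9, 0.1], [0.1, 0.1, 0.9], [0.5, 0.5, 0.5]]

model, pseudo_y = pseudo_label_fit(labeled_x, labeled_y, unlabeled_x)
```

In this toy run the ambiguous third unlabeled sample falls below the threshold and is left out of the refit, which is one simple way to limit the effect of wrong pseudo-subtypes.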

CONCLUSIONS

In a performance comparison with standard machine learning-based classifiers, meth-SemiCancer achieved the highest average F1-score and Matthews correlation coefficient, outperforming the other methods. Fine-tuning the model on the unlabeled patient samples with proper pseudo-subtypes encouraged meth-SemiCancer to generalize better than the supervised neural network-based subtype classification method. meth-SemiCancer is publicly available at https://github.com/cbi-bioinfo/meth-SemiCancer.
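For readers unfamiliar with the two reported metrics, the sketch below computes them for a multi-class subtype prediction. It assumes macro-averaging for the F1-score and the standard multi-class generalization of the Matthews correlation coefficient; the abstract does not state how the average was taken, so this may differ from the authors' evaluation. The function names and toy labels are hypothetical.

```python
import math
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient from class counts."""
    s = len(y_true)                                   # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))   # correctly predicted
    t_k = Counter(y_true)                             # true count per class
    p_k = Counter(y_pred)                             # predicted count per class
    numerator = c * s - sum(t_k[k] * p_k[k] for k in t_k)
    denom = math.sqrt(
        (s * s - sum(v * v for v in p_k.values()))
        * (s * s - sum(v * v for v in t_k.values()))
    )
    return numerator / denom if denom else 0.0


# Toy example: one "B" sample is misclassified as "C".
y_true = ["A", "A", "B", "B", "C", "C"]
y_pred = ["A", "A", "B", "C", "C", "C"]
f1 = macro_f1(y_true, y_pred)
mcc = multiclass_mcc(y_true, y_pred)
```

Unlike accuracy, both metrics penalize a classifier that ignores small subtypes, which matters when subtype labels are scarce and unevenly distributed.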

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c446/10131478/43bb21958239/12859_2023_5272_Fig1_HTML.jpg
