A Hybrid Ensemble Approach for Identifying Robust Differentially Methylated Loci in Pan-Cancers.

Author information

Tian Qi, Zou Jianxiao, Fang Yuan, Yu Zhongli, Tang Jianxiong, Song Ying, Fan Shicai

Affiliations

School of Automation Engineering, University of Electronic Science and Technology of China.

Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.

Publication information

Front Genet. 2019 Sep 5;10:774. doi: 10.3389/fgene.2019.00774. eCollection 2019.

Abstract

DNA methylation is a widely investigated epigenetic mark that plays a vital role in tumorigenesis. Advances in high-throughput assays, such as the Infinium 450K platform, provide genome-scale DNA methylation landscapes at single-CpG resolution, and identifying differentially methylated loci (DML) has become an informative way to deepen our understanding of cancers. However, the extreme imbalance between the number of samples and the number of loci (approximately 1:1,000) makes it difficult to explore differential methylation between diseased and normal samples. In this article, a hybrid approach based on ensemble feature selection for identifying differentially methylated loci (HyDML) is proposed that incorporates instance perturbation and multiple function models. Experiments on data from The Cancer Genome Atlas showed that HyDML not only identified DML effectively but also outperformed single feature selection approaches in classification performance and in the robustness of feature selection. In-depth analysis of the DML indicated that different types of cancer share common patterns, and that the stable DML shared across cancers have great potential as biomarkers, which may strengthen the confidence of domain experts in undertaking biological validation.
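
The abstract does not spell out how HyDML works, so the following is only a minimal sketch of the general idea of ensemble feature selection with instance perturbation and several scoring functions. It is not the authors' implementation. Everything in it is an assumption made for illustration: the stratified bootstrap used as the perturbation, the two scorers (`mean_diff_score`, `welch_style_score`), the rank-sum aggregation, and the toy data generator `make_toy_data`.

```python
import random
import statistics


def mean_diff_score(tumor, normal):
    """Absolute difference in mean methylation between the two groups."""
    return abs(statistics.fmean(tumor) - statistics.fmean(normal))


def welch_style_score(tumor, normal):
    """Mean difference scaled by a pooled standard error (Welch-like).

    Population variance is used and a small constant is added so that a
    resample with zero variance does not cause a division by zero.
    """
    se2 = (statistics.pvariance(tumor) / len(tumor)
           + statistics.pvariance(normal) / len(normal))
    return mean_diff_score(tumor, normal) / (se2 ** 0.5 + 1e-9)


def rank_features(scores):
    """Return one rank per feature, where rank 0 is the highest score."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    ranks = [0] * len(scores)
    for rank, j in enumerate(order):
        ranks[j] = rank
    return ranks


def ensemble_select(X, y, top_k, n_bootstraps=20,
                    scorers=(mean_diff_score, welch_style_score), seed=0):
    """Select top_k features by summing ranks over resamples and scorers."""
    rng = random.Random(seed)
    tumor_idx = [i for i, label in enumerate(y) if label == 1]
    normal_idx = [i for i, label in enumerate(y) if label == 0]
    n_features = len(X[0])
    rank_sums = [0] * n_features
    for _ in range(n_bootstraps):
        # Instance perturbation: resample each class with replacement.
        boot_t = rng.choices(tumor_idx, k=len(tumor_idx))
        boot_n = rng.choices(normal_idx, k=len(normal_idx))
        for scorer in scorers:
            # Several scoring functions: each one ranks all features.
            scores = [
                scorer([X[i][j] for i in boot_t], [X[i][j] for i in boot_n])
                for j in range(n_features)
            ]
            for j, r in enumerate(rank_features(scores)):
                rank_sums[j] += r
    # A low rank sum means the feature was ranked near the top consistently.
    return sorted(range(n_features), key=lambda j: rank_sums[j])[:top_k]


def make_toy_data(n_per_class=15, n_features=50, informative=(3, 7), seed=1):
    """Toy beta values: features in `informative` are hypermethylated in tumors."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.uniform(0.4, 0.6) for _ in range(n_features)]
            if label == 1:
                for j in informative:
                    row[j] = rng.uniform(0.8, 0.95)
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
print(sorted(ensemble_select(X, y, top_k=2)))
```

Summing ranks rather than raw scores keeps scorers with different scales comparable, and a locus that ranks highly across many resamples and scorers is, in this sketch, what "stable" would mean. A real analysis on 450K data would need far more loci than samples, a vectorized implementation, and the scoring models actually used in the paper.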

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9166/6739624/113cef7fc5ff/fgene-10-00774-g001.jpg
