Min-redundancy and max-relevance multi-view feature selection for predicting ovarian cancer survival using multi-omics data.

Author affiliations

Artificial Intelligence Research Laboratory, College of Information Sciences and Technology, Pennsylvania State University, University Park, PA, 16802, USA.

The Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA, 16802, USA.

Publication information

BMC Med Genomics. 2018 Sep 14;11(Suppl 3):71. doi: 10.1186/s12920-018-0388-0.

Abstract

BACKGROUND

Large-scale collaborative precision medicine initiatives (e.g., The Cancer Genome Atlas (TCGA)) are yielding rich multi-omics data. Integrative analyses of the resulting multi-omics data, such as somatic mutation, copy number alteration (CNA), DNA methylation, miRNA, gene expression, and protein expression, offer tantalizing possibilities for realizing the promise and potential of precision medicine in cancer prevention, diagnosis, and treatment by substantially improving our understanding of underlying mechanisms as well as the discovery of novel biomarkers for different types of cancers. However, such analyses present a number of challenges, including the heterogeneity and high dimensionality of omics data.

METHODS

We propose a novel framework for multi-omics data integration using multi-view feature selection. We introduce a novel multi-view feature selection algorithm, MRMR-mv, an adaptation of the well-known Min-Redundancy and Maximum-Relevance (MRMR) single-view feature selection algorithm to the multi-view setting.
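
For readers unfamiliar with the single-view algorithm that MRMR-mv adapts, the following is a minimal sketch of classic greedy MRMR using the mutual-information difference criterion (relevance to the label minus mean redundancy with already-selected features). It is not the authors' MRMR-mv, whose details the abstract does not give. The function names `mutual_information` and `mrmr_select` are hypothetical, and the sketch assumes all features and the label are discrete.

```python
import numpy as np


def mutual_information(x, y):
    """Mutual information (in nats) between two discrete 1-D arrays."""
    _, x_idx = np.unique(np.asarray(x).ravel(), return_inverse=True)
    _, y_idx = np.unique(np.asarray(y).ravel(), return_inverse=True)
    # Build the empirical joint distribution from co-occurrence counts.
    joint = np.zeros((x_idx.max() + 1, y_idx.max() + 1))
    np.add.at(joint, (x_idx, y_idx), 1)
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)   # marginal of x, shape (a, 1)
    py = joint.sum(axis=0, keepdims=True)   # marginal of y, shape (1, b)
    nz = joint > 0                          # skip empty cells (0 * log 0 = 0)
    return float(np.sum(joint[nz] * np.log(joint[nz] / (px @ py)[nz])))


def mrmr_select(X, y, k):
    """Greedily pick up to k column indices of X by the MRMR difference criterion.

    Assumption: X is a 2-D array of discrete features, y a discrete label.
    """
    n_features = X.shape[1]
    relevance = np.array(
        [mutual_information(X[:, j], y) for j in range(n_features)]
    )
    # Start with the single most relevant feature.
    selected = [int(np.argmax(relevance))]
    redundancy_sum = np.zeros(n_features)
    while len(selected) < min(k, n_features):
        last = selected[-1]
        # Accumulate each feature's redundancy with the newest selection.
        redundancy_sum += np.array(
            [mutual_information(X[:, j], X[:, last]) for j in range(n_features)]
        )
        score = relevance - redundancy_sum / len(selected)
        score[selected] = -np.inf           # never re-select a feature
        selected.append(int(np.argmax(score)))
    return selected


# Toy data: column 1 duplicates column 0; column 2 is independent of column 0.
f0 = np.array([0, 0, 1, 1])
f2 = np.array([0, 1, 0, 1])
X_toy = np.column_stack([f0, f0, f2])
y_toy = f0 & f2
selected_toy = mrmr_select(X_toy, y_toy, k=2)
```

On this toy data all three columns are equally relevant to the label, so the redundancy term decides the second pick: the duplicate column is penalized and the independent column is chosen instead. Real omics features are mostly continuous, so a practical version would need discretization or a continuous mutual-information estimator.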

RESULTS

We report results of experiments using an ovarian cancer multi-omics dataset derived from the TCGA database on the task of predicting ovarian cancer survival. Our results suggest that multi-view models outperform both view-specific models (i.e., models trained and tested using a single type of omics data) and models based on two baseline data fusion methods.

CONCLUSIONS

Our results demonstrate the potential of multi-view feature selection in integrative analyses and predictive modeling from multi-omics data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a70e/6157248/2016a52533a0/12920_2018_388_Fig1_HTML.jpg
