PatternMarkers & GWCoGAPS for novel data-driven biomarkers via whole transcriptome NMF.

Author information

Stein-O'Brien Genevieve L, Carey Jacob L, Lee Wai Shing, Considine Michael, Favorov Alexander V, Flam Emily, Guo Theresa, Li Sijia, Marchionni Luigi, Sherman Thomas, Sivy Shawn, Gaykalova Daria A, McKay Ronald D, Ochs Michael F, Colantuoni Carlo, Fertig Elana J

Affiliations

McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins School of Medicine, Baltimore, MD, USA.

Lieber Institute for Brain Development, Baltimore, MD, USA.

Publication information

Bioinformatics. 2017 Jun 15;33(12):1892-1894. doi: 10.1093/bioinformatics/btx058.

Abstract

SUMMARY

Non-negative Matrix Factorization (NMF) algorithms associate gene expression with biological processes (e.g. time-course dynamics or disease subtypes). Compared with univariate associations, the relative weights of NMF solutions can obscure biomarkers. Therefore, we developed a novel patternMarkers statistic to extract genes for biological validation and enhanced visualization of NMF results. Finding novel and unbiased gene markers with patternMarkers requires whole-genome data. Therefore, we also developed Genome-Wide CoGAPS Analysis in Parallel Sets (GWCoGAPS), the first robust whole-genome Bayesian NMF, which uses the sparse MCMC algorithm CoGAPS. Additionally, a manual version of the GWCoGAPS algorithm contains analytic and visualization tools, including patternMatcher, a Shiny web application. The decomposition in the manual pipeline can be replaced with any NMF algorithm, further generalizing the software. Using these tools, we find granular brain-region and cell-type specific signatures with corresponding biomarkers in GTEx data, illustrating how GWCoGAPS and patternMarkers ascertain data-driven biomarkers from whole-genome data.
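The abstract does not spell out how the patternMarkers statistic is computed, so the following is only an illustrative sketch of one way a marker statistic of this kind could work. It assumes a non-negative genes-by-patterns weight matrix from any NMF, scales each gene's weights by its largest weight, and assigns the gene to the pattern whose unit vector is closest in Euclidean distance. The function name `pattern_markers`, the scaling rule, and the toy matrix are assumptions made for illustration and are not the CoGAPS package API.

```python
import numpy as np


def pattern_markers(A):
    """Assign each gene (row of A) to the pattern it most exclusively loads on.

    A: non-negative (genes x patterns) weight matrix from any NMF.
    Returns (assignment, distance): for each gene, the index of the nearest
    pattern and the Euclidean distance to that pattern's unit vector.
    NOTE: hypothetical sketch, not the CoGAPS implementation.
    """
    A = np.asarray(A, dtype=float)
    # Scale each row by its maximum so genes with different overall
    # expression levels become comparable; all-zero rows stay zero.
    row_max = A.max(axis=1, keepdims=True)
    scaled = np.divide(A, row_max, out=np.zeros_like(A), where=row_max > 0)
    # Distance from every scaled row to every unit vector e_k
    # (a gene active in pattern k only has distance 0 to e_k).
    unit_vectors = np.eye(A.shape[1])
    dists = np.linalg.norm(scaled[:, None, :] - unit_vectors[None, :, :], axis=2)
    return dists.argmin(axis=1), dists.min(axis=1)


if __name__ == "__main__":
    # Toy weights (assumed data): 3 genes, 3 patterns.
    toy = [[5.0, 0.0, 0.0],   # loads on pattern 0 only
           [1.0, 1.0, 1.0],   # loads equally on all patterns
           [0.0, 2.0, 0.2]]   # loads mostly on pattern 1
    assignment, distance = pattern_markers(toy)
    # Smaller distance means a more pattern-specific gene.
    for gene in np.argsort(distance):
        print(f"gene {gene}: pattern {assignment[gene]}, distance {distance[gene]:.3f}")
```

Ranking genes by this distance puts pattern-specific genes first and genes shared across patterns last, which is the kind of ordering a marker statistic needs when raw NMF weights alone are hard to compare across genes.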

AVAILABILITY AND IMPLEMENTATION

PatternMarkers & GWCoGAPS are in the CoGAPS Bioconductor package (3.5) under the GPL license.

CONTACT

gsteinobrien@jhmi.edu or ccolantu@jhmi.edu or ejfertig@jhmi.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
