Mathematical modelling of transcriptional heterogeneity identifies novel markers and subpopulations in complex tissues.

Author information

Wang Niya, Hoffman Eric P, Chen Lulu, Chen Li, Zhang Zhen, Liu Chunyu, Yu Guoqiang, Herrington David M, Clarke Robert, Wang Yue

Affiliations

Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.

Research Center for Genetic Medicine, Children's National Medical Center, Washington, DC 20007, USA.

Publication information

Sci Rep. 2016 Jan 7;6:18909. doi: 10.1038/srep18909.

Abstract

Tissue heterogeneity is both a major confounding factor and an underexploited information source. While a handful of reports have demonstrated the potential of supervised computational methods to deconvolute tissue heterogeneity, these approaches require a priori information on the marker genes or composition of known subpopulations. To address the critical problem of the absence of validated marker genes for many (including novel) subpopulations, we describe convex analysis of mixtures (CAM), a fully unsupervised in silico method, for identifying subpopulation marker genes directly from the original mixed gene expressions in scatter space that can improve molecular analyses in many biological contexts. Validated with predesigned mixtures, CAM on the gene expression data from peripheral leukocytes, brain tissue, and yeast cell cycle, revealed novel marker genes that were otherwise undetectable using existing methods. Importantly, CAM requires no a priori information on the number, identity, or composition of the subpopulations present in mixed samples, and does not require the presence of pure subpopulations in sample space. This advantage is significant in that CAM can achieve all of its goals using only a small number of heterogeneous samples, and is more powerful to distinguish between phenotypically similar subpopulations.
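The geometric idea behind finding markers "in scatter space" can be illustrated with a noise-free toy example. This is a minimal sketch, not the authors' CAM implementation: it assumes a linear mixing model (which the abstract does not spell out), and the mixing proportions, gene names, and helper functions below are all hypothetical. Under that assumption, each gene's mixed expression vector, scaled to sum to one across samples, falls inside a simplex, and genes expressed in only one subpopulation sit at its corners. Finding the convex hull vertices therefore recovers the marker genes without knowing the proportions.

```python
from fractions import Fraction

# Hypothetical mixing proportions (in tenths): rows are mixed samples,
# columns are subpopulations A, B, C. No sample is a pure subpopulation.
MIXING = [
    [6, 3, 1],
    [2, 5, 3],
    [1, 2, 7],
]

# Hypothetical pure expression levels per gene in subpopulations (A, B, C).
PURE = {
    "mA1": (10, 0, 0),  # expressed only in A
    "mA2": (4, 0, 0),   # expressed only in A, at a lower level
    "mB1": (0, 8, 0),   # expressed only in B
    "mC1": (0, 0, 5),   # expressed only in C
    "hk1": (5, 5, 5),   # expressed everywhere
    "hk2": (3, 6, 1),
    "ab1": (5, 5, 0),   # shared by A and B, so not a marker
}

def mix(pure):
    """Observed expression of one gene in each mixed sample (linear model)."""
    return [sum(a * s for a, s in zip(row, pure)) for row in MIXING]

def scatter_point(mixed):
    """Scale a gene to sum to one across samples; keep two coordinates.

    The third coordinate is redundant, so the 2-D projection preserves
    the convex hull. Fractions keep the arithmetic exact.
    """
    total = sum(mixed)
    return (Fraction(mixed[0], total), Fraction(mixed[1], total))

def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Andrew's monotone chain; collinear boundary points are dropped."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

points = {gene: scatter_point(mix(pure)) for gene, pure in PURE.items()}
vertices = hull_vertices(points.values())

# Genes sitting on the same hull vertex mark the same subpopulation.
marker_groups = {}
for gene, point in points.items():
    if point in vertices:
        marker_groups.setdefault(point, []).append(gene)

for genes in marker_groups.values():
    print(genes)
```

Real expression data are noisy and high-dimensional, so a practical method needs more than an exact hull computation; the sketch only shows why corner genes can be identified from mixed samples alone.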

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ef/4703969/f3a7615b19e0/srep18909-f1.jpg
