Mathematical modelling of transcriptional heterogeneity identifies novel markers and subpopulations in complex tissues.

Authors

Wang Niya, Hoffman Eric P, Chen Lulu, Chen Li, Zhang Zhen, Liu Chunyu, Yu Guoqiang, Herrington David M, Clarke Robert, Wang Yue

Affiliations

Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA.

Research Center for Genetic Medicine, Children's National Medical Center, Washington, DC 20007, USA.

Publication information

Sci Rep. 2016 Jan 7;6:18909. doi: 10.1038/srep18909.

DOI:10.1038/srep18909

PMID:26739359

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4703969/

Abstract

Tissue heterogeneity is both a major confounding factor and an underexploited information source. While a handful of reports have demonstrated the potential of supervised computational methods to deconvolute tissue heterogeneity, these approaches require a priori information on the marker genes or composition of known subpopulations. To address the critical problem of the absence of validated marker genes for many (including novel) subpopulations, we describe convex analysis of mixtures (CAM), a fully unsupervised in silico method, for identifying subpopulation marker genes directly from the original mixed gene expressions in scatter space that can improve molecular analyses in many biological contexts. Validated with predesigned mixtures, CAM on the gene expression data from peripheral leukocytes, brain tissue, and yeast cell cycle, revealed novel marker genes that were otherwise undetectable using existing methods. Importantly, CAM requires no a priori information on the number, identity, or composition of the subpopulations present in mixed samples, and does not require the presence of pure subpopulations in sample space. This advantage is significant in that CAM can achieve all of its goals using only a small number of heterogeneous samples, and is more powerful to distinguish between phenotypically similar subpopulations.
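The geometric idea of locating marker genes in scatter space can be illustrated with a small toy sketch. It is not the authors' CAM implementation. It assumes a noise-free linear mixing model (mixed expression = proportions × pure expression) and a known, tiny example with three subpopulations and three samples; the matrices `A` and `S` and the helpers `scatter_points` and `convex_hull` are hypothetical names and data invented for illustration. Under these assumptions, each gene is scaled so that its expression sums to one across samples, and genes expressed in only one subpopulation land on the vertices of the convex hull of all genes.

```python
from fractions import Fraction as F

# Hypothetical toy data (not from the paper).
# A[i][k]: proportion of subpopulation k in mixed sample i (rows sum to 1).
A = [[F(7, 10), F(2, 10), F(1, 10)],
     [F(2, 10), F(5, 10), F(3, 10)],
     [F(1, 10), F(3, 10), F(6, 10)]]

# S[k][j]: expression of gene j in pure subpopulation k.
# Genes 0-2 are markers (expressed in one subpopulation only);
# genes 3-6 are expressed in all three subpopulations.
S = [[9, 0, 0, 4, 2, 5, 1],
     [0, 6, 0, 3, 6, 1, 1],
     [0, 0, 8, 5, 2, 2, 7]]


def scatter_points(A, S):
    """Mix the sources (X = A S), then scale each gene to sum to 1 across samples.

    With three samples the scaled genes lie on a 2-D simplex, so the first
    two coordinates are enough to describe each point.
    """
    n_samples, n_sources, n_genes = len(A), len(S), len(S[0])
    points = []
    for j in range(n_genes):
        mixed = [sum(A[i][k] * S[k][j] for k in range(n_sources))
                 for i in range(n_samples)]
        total = sum(mixed)
        points.append((mixed[0] / total, mixed[1] / total))
    return points


def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices without collinear points."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


points = scatter_points(A, S)
hull = set(convex_hull(points))
# Genes sitting on a hull vertex are the candidate markers in this toy setting.
vertex_genes = [j for j, p in enumerate(points) if p in hull]
print(vertex_genes)
```

Exact rational arithmetic (`Fraction`) is used so that marker genes of the same subpopulation map to exactly the same point. Real expression data are noisy, so a practical method would need clustering or noise-tolerant vertex detection rather than this exact hull test.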

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/22ef/4703969/f3a7615b19e0/srep18909-f1.jpg

Similar articles

Mathematical modelling of transcriptional heterogeneity identifies novel markers and subpopulations in complex tissues.

Sci Rep. 2016 Jan 7;6:18909. doi: 10.1038/srep18909.

Mathematical Modeling and Deconvolution of Molecular Heterogeneity Identifies Novel Subpopulations in Complex Tissues.

Methods Mol Biol. 2018;1751:223-236. doi: 10.1007/978-1-4939-7710-9_16.

Semi-CAM: A semi-supervised deconvolution method for bulk transcriptomic data with partial marker gene information.

Sci Rep. 2020 Mar 25;10(1):5434. doi: 10.1038/s41598-020-62330-2.

Computational de novo discovery of distinguishing genes for biological processes and cell types in complex tissues.

PLoS One. 2018 Mar 1;13(3):e0193067. doi: 10.1371/journal.pone.0193067. eCollection 2018.

Single-cell gene expression analysis reveals diversity among human spermatogonia.

Mol Hum Reprod. 2017 Feb 10;23(2):79-90. doi: 10.1093/molehr/gaw079.

Development and molecular composition of the hepatic progenitor cell niche.

Dan Med J. 2013 May;60(5):B4640.

A computational strategy for predicting lineage specifiers in stem cell subpopulations.

Stem Cell Res. 2015 Sep;15(2):427-34. doi: 10.1016/j.scr.2015.08.006. Epub 2015 Sep 2.

Semi-supervised Nonnegative Matrix Factorization for gene expression deconvolution: a case study.

Infect Genet Evol. 2012 Jul;12(5):913-21. doi: 10.1016/j.meegid.2011.08.014. Epub 2011 Sep 10.

The beagle dog MicroRNA tissue atlas: identifying translatable biomarkers of organ toxicity.

BMC Genomics. 2016 Aug 17;17:649. doi: 10.1186/s12864-016-2958-x.

Vitiligo blood transcriptomics provides new insights into disease mechanisms and identifies potential novel therapeutic targets.

BMC Genomics. 2017 Jan 28;18(1):109. doi: 10.1186/s12864-017-3510-3.

Cited by

An improved reference library and method for accurate cell-type deconvolution of bulk-tissue miRNA data.

Nat Commun. 2025 Jul 1;16(1):5508. doi: 10.1038/s41467-025-60521-x.

ARTdeConv: adaptive regularized tri-factor non-negative matrix factorization for cell type deconvolution.

NAR Genom Bioinform. 2025 Apr 26;7(2):lqaf046. doi: 10.1093/nargab/lqaf046. eCollection 2025 Jun.

Assessing transcriptomic heterogeneity of single-cell RNASeq data by bulk-level gene expression data.

BMC Bioinformatics. 2024 Jun 12;25(1):209. doi: 10.1186/s12859-024-05825-3.

CAM3.0: determining cell type composition and expression from bulk tissues with fully unsupervised deconvolution.

Bioinformatics. 2024 Mar 4;40(3). doi: 10.1093/bioinformatics/btae107.

CLINICAL HETEROGENEITY IN THE AGE OF BIG DATA, ADVANCED ANALYTICS, AND COMPLEXITY THEORY.

Trans Am Clin Climatol Assoc. 2023;133:56-68.

MRI Radiogenomics in Precision Oncology: New Diagnosis and Treatment Method.

Comput Intell Neurosci. 2022 Jul 7;2022:2703350. doi: 10.1155/2022/2703350. eCollection 2022.

COT: an efficient and accurate method for detecting marker genes among many subtypes.

Bioinform Adv. 2022 May 27;2(1):vbac037. doi: 10.1093/bioadv/vbac037. eCollection 2022.

Proteomic analysis of descending thoracic aorta identifies unique and universal signatures of aneurysm and dissection.

JVS Vasc Sci. 2022 Jan 22;3:85-181. doi: 10.1016/j.jvssci.2022.01.001. eCollection 2022.

Comparative assessment and novel strategy on methods for imputing proteomics data.

Sci Rep. 2022 Jan 20;12(1):1067. doi: 10.1038/s41598-022-04938-0.

FSCAM: CAM-Based Feature Selection for Clustering scRNA-seq.

Interdiscip Sci. 2022 Jun;14(2):394-408. doi: 10.1007/s12539-021-00495-8. Epub 2022 Jan 14.

References

Convex Analysis of Mixtures for Separating Non-negative Well-grounded Sources.

Sci Rep. 2016 Dec 6;6:38350. doi: 10.1038/srep38350.

Prognostic Imaging Biomarkers in Glioblastoma: Development and Independent Validation on the Basis of Multiregion and Quantitative Analysis of MR Images.

Radiology. 2016 Feb;278(2):546-53. doi: 10.1148/radiol.2015150358. Epub 2015 Sep 4.

Inferring biological tasks using Pareto analysis of high-dimensional data.

Nat Methods. 2015 Mar;12(3):233-5, 3 p following 235. doi: 10.1038/nmeth.3254. Epub 2015 Jan 26.

Computational analysis of cell-to-cell heterogeneity in single-cell RNA-sequencing data reveals hidden subpopulations of cells.

Nat Biotechnol. 2015 Feb;33(2):155-60. doi: 10.1038/nbt.3102. Epub 2015 Jan 19.

Unsupervised deconvolution of dynamic imaging reveals intratumor vascular heterogeneity and repopulation dynamics.

PLoS One. 2014 Nov 7;9(11):e112143. doi: 10.1371/journal.pone.0112143. eCollection 2014.

UNDO: a Bioconductor R package for unsupervised deconvolution of mixed gene expressions in tumor samples.

Bioinformatics. 2015 Jan 1;31(1):137-9. doi: 10.1093/bioinformatics/btu607. Epub 2014 Sep 10.

Influence of tumour micro-environment heterogeneity on therapeutic response.

Nature. 2013 Sep 19;501(7467):346-54. doi: 10.1038/nature12626.

Measuring cell-type specific differential methylation in human brain tissue.

Genome Biol. 2013 Aug 30;14(8):R94. doi: 10.1186/gb-2013-14-8-r94.

A self-directed method for cell-type identification and separation of gene expression microarrays.

PLoS Comput Biol. 2013;9(8):e1003189. doi: 10.1371/journal.pcbi.1003189. Epub 2013 Aug 22.

Single-cell sequencing-based technologies will revolutionize whole-organism science.

Nat Rev Genet. 2013 Sep;14(9):618-30. doi: 10.1038/nrg3542. Epub 2013 Jul 30.
