Digitizing omics profiles by divergence from a baseline.

Affiliations

Department of Oncology, Johns Hopkins University School of Medicine, Baltimore, MD 21205.

Department of Applied Mathematics and Statistics, Johns Hopkins University, Baltimore, MD 21218.

Publication information

Proc Natl Acad Sci U S A. 2018 May 1;115(18):4545-4552. doi: 10.1073/pnas.1721628115. Epub 2018 Apr 16.

DOI: 10.1073/pnas.1721628115

PMID: 29666255

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5939095/

Abstract

Data collected from omics technologies have revealed pervasive heterogeneity and stochasticity of molecular states within and between phenotypes. A prominent example of such heterogeneity occurs between genome-wide mRNA, microRNA, and methylation profiles from one individual tumor to another, even within a cancer subtype. However, current methods in bioinformatics, such as detecting differentially expressed genes or CpG sites, are population-based and therefore do not effectively model intersample diversity. Here we introduce a unified theory to quantify sample-level heterogeneity that is applicable to a single omics profile. Specifically, we simplify an omics profile to a digital representation based on the omics profiles from a set of samples from a reference or baseline population (e.g., normal tissues). The state of any subprofile (e.g., expression vector for a subset of genes) is said to be "divergent" if it lies outside the estimated support of the baseline distribution and is consequently interpreted as "dysregulated" relative to that baseline. We focus on two cases: single features (e.g., individual genes) and distinguished subsets (e.g., regulatory pathways). Notably, since the divergence analysis is at the individual sample level, dysregulation can be analyzed probabilistically; for example, one can estimate the probability that a gene or pathway is divergent in some population. Finally, the reduction in complexity facilitates a more "personalized" and biologically interpretable analysis of variation, as illustrated by experiments involving tissue characterization, disease detection and progression, and disease-pathway associations.
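The single-feature case described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the support of the baseline distribution is estimated, so the code below assumes, purely for illustration, a per-feature percentile interval computed from the baseline samples. The function names (`baseline_support`, `digitize`, `divergence_probability`), the 5th/95th percentile cutoffs, and the toy data are hypothetical and are not taken from the paper or its software. The multivariate case (distinguished subsets such as pathways) is not covered.

```python
from statistics import quantiles


def baseline_support(baseline, lower_pct=5, upper_pct=95):
    """Estimate a support interval for each feature from baseline samples.

    baseline: list of samples, each a list of feature values.
    Assumption (not from the paper): the support of each feature is
    approximated by the [lower_pct, upper_pct] percentile interval.
    """
    n_features = len(baseline[0])
    support = []
    for j in range(n_features):
        values = [sample[j] for sample in baseline]
        # quantiles(..., n=100) returns the 99 percentile cut points.
        cuts = quantiles(values, n=100, method="inclusive")
        support.append((cuts[lower_pct - 1], cuts[upper_pct - 1]))
    return support


def digitize(profile, support):
    """Reduce one profile to ternary codes, one per feature:
    -1 below the baseline support, +1 above it, 0 inside (not divergent)."""
    codes = []
    for value, (low, high) in zip(profile, support):
        if value < low:
            codes.append(-1)
        elif value > high:
            codes.append(1)
        else:
            codes.append(0)
    return codes


def divergence_probability(profiles, support):
    """Estimate, per feature, the probability of divergence in a population
    as the fraction of its samples whose code is non-zero."""
    coded = [digitize(profile, support) for profile in profiles]
    n_samples = len(coded)
    return [
        sum(1 for row in coded if row[j] != 0) / n_samples
        for j in range(len(support))
    ]


# Toy data: 5 baseline (e.g., normal tissue) samples with 2 features each.
baseline = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0], [5.0, 14.0]]
# Toy test population (e.g., tumors), digitized one sample at a time.
tumors = [[0.5, 12.0], [3.0, 20.0], [6.0, 12.5], [2.0, 11.0]]

support = baseline_support(baseline)
codes = [digitize(profile, support) for profile in tumors]
probabilities = divergence_probability(tumors, support)
```

Because each profile is digitized on its own against a fixed baseline, the per-sample codes can be inspected individually, and the population-level probability is simply an average over those codes.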

Figure 1 (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a9eb/5939095/e8a4ef87a672/pnas.1721628115fig01.jpg
