TiMEG: an integrative statistical method for partially missing multi-omics data.

Affiliations

Human Genetics Unit, Indian Statistical Institute, Kolkata, 700108, India.

Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, 38105, USA.

Publication information

Sci Rep. 2021 Dec 15;11(1):24077. doi: 10.1038/s41598-021-03034-z.

DOI:10.1038/s41598-021-03034-z

PMID:34911979

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC8674330/

Abstract

Multi-omics data integration is widely used to understand the genetic architecture of disease. In multi-omics association analysis, data collected on multiple omics for the same set of individuals are immensely important for biomarker identification. But when the sample size of such data is limited, partially missing individual-level observations pose a major challenge for data integration. Often, genotype data are available for all individuals under study, but gene expression and/or methylation information is missing for different subsets of those individuals. Here, we develop a statistical model, TiMEG, for identifying disease-associated biomarkers in a case-control paradigm by integrating the above-mentioned data types, especially in the presence of missing omics data. Based on a likelihood approach, TiMEG exploits the inter-relationship among multiple omics data to capture weaker signals that remain unidentified in single-omic analysis or common imputation-based methods. Its application to a real tuberous sclerosis dataset identified functionally relevant genes in the disease pathway.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d7a/8674330/16c67ee2f1eb/41598_2021_3034_Fig1_HTML.jpg
