An integrative imputation method based on multi-omics datasets.

Author information

Lin Dongdong, Zhang Jigang, Li Jingyao, Xu Chao, Deng Hong-Wen, Wang Yu-Ping

Affiliations

Department of Biomedical Engineering, Tulane University, New Orleans, LA, 70118, USA.

Center for Bioinformatics and Genomics, Tulane University, New Orleans, LA, 70112, USA.

Publication information

BMC Bioinformatics. 2016 Jun 21;17:247. doi: 10.1186/s12859-016-1122-6.

DOI:10.1186/s12859-016-1122-6

PMID:27329642

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC4915152/

Abstract

BACKGROUND

Integrative analysis of multi-omics data is becoming increasingly important for unraveling the functional mechanisms of complex diseases. However, currently available multi-omics datasets inevitably suffer from missing values due to technical limitations and various experimental constraints. These missing values severely hinder integrative analysis of multi-omics data. Current imputation methods mainly use single-omics data and ignore the biological interconnections and information embedded in multi-omics datasets.

RESULTS

In this study, we propose a novel multi-omics imputation method that integrates multiple correlated omics datasets to improve imputation accuracy. Our method was designed to: 1) combine estimates of missing values from the individual omics dataset itself with estimates from the other omics datasets, and 2) simultaneously impute multiple omics datasets containing missing values by an iterative algorithm. We compared our method with five imputation methods that use single-omics data, at different noise levels, sample sizes and missing rates. The results consistently demonstrated the advantage and efficiency of our method in terms of both imputation error and recovery of the mRNA-miRNA network structure.
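The two design goals above (combining a within-omics estimate with a cross-omics estimate, and imputing several datasets together by iteration) can be illustrated with a toy sketch. This is not the authors' algorithm: the abstract does not specify the estimators, so the sketch assumes a single best-correlated-feature linear regression for each estimate, a fixed mixing weight, and mean initialization. All function names (`fit_line`, `predict`, `impute_pair`, `rmse`) and the demo matrices are hypothetical. Missing entries are marked with `None`, and each feature is assumed to have at least two observed values.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares fit of y on x; returns (slope, intercept, r_squared)."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0, my, 0.0
    slope = sxy / sxx
    return slope, my - slope * mx, (sxy * sxy) / (sxx * syy)

def predict(target, predictors, obs, miss):
    """Predict target at rows `miss` from the predictor column that fits
    target best on rows `obs`; fall back to the observed mean."""
    fallback = mean([target[i] for i in obs])
    slope, intercept, best_r2, best = 0.0, fallback, 0.0, None
    for p in predictors:
        s, b, r2 = fit_line([p[i] for i in obs], [target[i] for i in obs])
        if r2 > best_r2:
            slope, intercept, best_r2, best = s, b, r2, p
    if best is None:
        return {i: fallback for i in miss}
    return {i: intercept + slope * best[i] for i in miss}

def impute_pair(X, Y, weight=0.5, n_iter=5):
    """Iteratively impute two sample-by-feature matrices that share samples.
    `weight` (an assumed tuning parameter) mixes the within-omics estimate
    with the cross-omics estimate. Inputs are not modified."""
    data = {"X": [row[:] for row in X], "Y": [row[:] for row in Y]}
    masks = {k: [[v is None for v in row] for row in M] for k, M in data.items()}
    for M in data.values():  # initialize missing entries with feature means
        for j in range(len(M[0])):
            m = mean([row[j] for row in M if row[j] is not None])
            for row in M:
                if row[j] is None:
                    row[j] = m
    for _ in range(n_iter):
        for k, other in (("X", "Y"), ("Y", "X")):
            M, O, mask = data[k], data[other], masks[k]
            for j in range(len(M[0])):
                miss = [i for i in range(len(M)) if mask[i][j]]
                if not miss:
                    continue
                obs = [i for i in range(len(M)) if not mask[i][j]]
                target = [row[j] for row in M]
                own = [[row[c] for row in M] for c in range(len(M[0])) if c != j]
                cross = [[row[c] for row in O] for c in range(len(O[0]))]
                p_self = predict(target, own, obs, miss)
                p_cross = predict(target, cross, obs, miss)
                for i in miss:
                    M[i][j] = weight * p_self[i] + (1 - weight) * p_cross[i]
    return data["X"], data["Y"]

def rmse(truth, imputed, mask):
    """Root-mean-square error over the entries flagged as missing in `mask`."""
    errs = [(t - v) ** 2
            for tr, vr, mr in zip(truth, imputed, mask)
            for t, v, m in zip(tr, vr, mr) if m]
    return (sum(errs) / len(errs)) ** 0.5

# Toy demo: five samples, two "mRNA" features (X) and two "miRNA" features (Y).
demo_X = [[1, 3], [2, 5], [3, None], [4, 9], [5, 11]]
demo_Y = [[2, 3], [1, 6], [0, 9], [-1, 12], [None, 15]]
demo_X_imputed, demo_Y_imputed = impute_pair(demo_X, demo_Y)
```

In a realistic setting the mixing weight and the per-estimate models would need to be chosen from the data (for example by masking known entries and comparing `rmse`); the fixed value of 0.5 here is only a placeholder.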

CONCLUSIONS

We conclude that the proposed imputation method can exploit more biological information to minimize imputation error, and can thereby improve the performance of downstream analyses such as genetic regulatory network construction.
