A hybrid imputation approach for microarray missing value estimation.

Author information

Li Huihui, Zhao Changbo, Shao Fengfeng, Li Guo-Zheng, Wang Xiao

Publication information

BMC Genomics. 2015;16 Suppl 9(Suppl 9):S1. doi: 10.1186/1471-2164-16-S9-S1. Epub 2015 Aug 17.

DOI: 10.1186/1471-2164-16-S9-S1
PMID: 26330180
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4547405/
Abstract

BACKGROUND

Missing data are inevitable in gene expression microarray experiments because of instrument failure or human error, and they degrade the performance of downstream analysis. Most existing approaches are affected by this prevalent problem. Imputation is one of the most frequently used ways of handling missing data, and research on missing value estimation has advanced considerably. The remaining challenge is to improve imputation accuracy for data with a large missing rate.

METHODS

In this paper, inspired by the idea of collaborative training, we propose a novel hybrid imputation method called Recursive Mutual Imputation (RMI). RMI exploits both the global correlation information and the local structure in the data, captured by two popular methods, Bayesian Principal Component Analysis (BPCA) and Local Least Squares (LLS), respectively. The mutual strategy is implemented by sharing the estimated data sequences at each recursive step. In addition, the imputation order is determined by the number of missing entries in the target gene. Finally, a weight-based integration method is used in the final assembling step.
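The ideas in this paragraph (a global and a local imputer that exchange estimates, a gene ordering by missing count, and a weighted final combination) can be illustrated with a minimal sketch. This is not the authors' RMI implementation: a rank-truncated SVD stands in for BPCA, a k-nearest-gene least-squares fit stands in for LLS, and the names `global_estimate`, `local_estimate`, `mutual_impute` and the parameters `rank`, `k`, `n_rounds`, `w_global` are all hypothetical. It assumes rows are genes, missing entries are `NaN`, and no gene is missing in every sample.

```python
import numpy as np

def global_estimate(filled, rank=2):
    """Stand-in for BPCA: low-rank reconstruction of an already filled matrix."""
    U, s, Vt = np.linalg.svd(filled, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]

def local_estimate(filled, mask, k=5):
    """Stand-in for LLS: regress each target gene on its k nearest genes."""
    est = filled.copy()
    # Genes with fewer missing entries are imputed first (assumed ordering rule).
    order = np.argsort(mask.sum(axis=1), kind="stable")
    for i in order:
        miss = mask[i]
        obs = ~miss
        if not miss.any() or not obs.any():
            continue
        # Distance to every other gene, measured on gene i's observed samples.
        dist = np.linalg.norm(est[:, obs] - est[i, obs], axis=1)
        dist[i] = np.inf
        nbrs = np.argsort(dist)[:k]
        A = est[nbrs][:, obs].T    # observed samples x neighbours
        B = est[nbrs][:, miss].T   # missing samples x neighbours
        coef, *_ = np.linalg.lstsq(A, est[i, obs], rcond=None)
        est[i, miss] = B @ coef    # later genes see this updated estimate
    return est

def mutual_impute(X, n_rounds=5, w_global=0.5, rank=2, k=5):
    """Alternate the two imputers, each starting from the other's estimates."""
    mask = np.isnan(X)
    row_means = np.nanmean(X, axis=1, keepdims=True)
    g = np.where(mask, row_means, X)   # initial fill: per-gene mean
    l = g.copy()
    for _ in range(n_rounds):
        g_new = np.where(mask, global_estimate(l, rank), X)
        l_new = np.where(mask, local_estimate(g, mask, k), X)
        g, l = g_new, l_new
    # Weighted combination of the two final estimates (fixed weight assumed).
    return np.where(mask, w_global * g + (1.0 - w_global) * l, X)
```

Calling `mutual_impute(X)` on a genes-by-samples array returns a copy in which only the `NaN` positions are changed. The fixed weight `w_global` is a simplification; the abstract does not say how the weights in the assembling step are chosen.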

RESULTS

We compare RMI with three state-of-the-art algorithms, BPCA, LLS, and Iterated Local Least Squares imputation (ItrLLS), on four publicly available microarray datasets. Experimental results show that RMI significantly outperforms the compared methods in terms of Normalized Root Mean Square Error (NRMSE), especially on datasets with large missing rates and fewer complete genes.
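The abstract does not state the NRMSE formula. A definition commonly used in the microarray imputation literature divides the mean squared error over the artificially removed entries by the variance of their true values and takes the square root; the sketch below assumes that definition, and the function name `nrmse` and its arguments are hypothetical.

```python
import numpy as np

def nrmse(imputed, truth, mask):
    """NRMSE over masked entries, assuming the variance-normalised definition."""
    diff = imputed[mask] - truth[mask]
    return float(np.sqrt(np.mean(diff ** 2) / np.var(truth[mask])))
```

Under this definition a perfect imputation scores 0, and filling every masked entry with the mean of the true masked values scores 1, so lower values mean better imputation.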

CONCLUSIONS

Our proposed hybrid imputation approach incorporates both global and local information of microarray genes and achieves lower NRMSE values than either single approach alone. This study also highlights the need for imputation methods to consider the order in which missing entries are imputed.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/e6acea674737/1471-2164-16-S9-S1-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/5dcfc83b7e1f/1471-2164-16-S9-S1-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/95b6b7851607/1471-2164-16-S9-S1-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/08d8fd253a85/1471-2164-16-S9-S1-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/628ce7092c22/1471-2164-16-S9-S1-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/014524e10b4a/1471-2164-16-S9-S1-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/95a7c4b3d2ac/1471-2164-16-S9-S1-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/01c3/4547405/489df63558f1/1471-2164-16-S9-S1-8.jpg
