Integrative missing value estimation for microarray data.

Authors

Hu Jianjun, Li Haifeng, Waterman Michael S, Zhou Xianghong Jasmine

Affiliation

Molecular and Computational Biology Section, Department of Biological Sciences, University of Southern California, Los Angeles, CA 90089, USA.

Publication information

BMC Bioinformatics. 2006 Oct 12;7:449. doi: 10.1186/1471-2105-7-449.

DOI:10.1186/1471-2105-7-449

PMID:17038176

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1622759/

Abstract

BACKGROUND

Missing value estimation is an important preprocessing step in microarray analysis. Although several methods have been developed to solve this problem, they perform poorly on datasets with high rates of missing data, high measurement noise, or few samples. In fact, more than 80% of the time-series datasets in the Stanford Microarray Database contain fewer than eight samples.

RESULTS

We present the integrative Missing Value Estimation method (iMISS), which incorporates information from multiple reference microarray datasets to improve missing value estimation. For each gene with missing data, we derive a consistent neighbor-gene list by taking the reference datasets into consideration. To determine whether the given reference datasets are sufficiently informative for integration, we use a submatrix imputation approach. Our experiments showed that iMISS significantly and consistently improves the accuracy of the state-of-the-art Local Least Squares (LLS) imputation algorithm, by up to 15% in our benchmark tests.
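The abstract does not spell out how the consistent neighbor-gene list is built, so the following is only a minimal sketch of the general idea under stated assumptions. It ranks candidate genes by Pearson correlation with the target gene in each dataset, combines the per-dataset ranks by their mean rank ratio (an assumed aggregation rule, not necessarily the order statistic used by iMISS), and then fills the missing entry with a correlation-weighted average of the selected neighbors, which is a simple KNN-style stand-in for the LLS regression step. All function names and the toy data are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation over positions observed (not None) in both profiles."""
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < 3:
        return 0.0  # too few shared observations to say anything
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def rank_ratios(dataset, target):
    """Map each other gene to rank / n_candidates (small = more similar to target)."""
    scores = {g: pearson(dataset[target], row)
              for g, row in dataset.items() if g != target}
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {g: (i + 1) / len(ordered) for i, g in enumerate(ordered)}


def consensus_neighbors(target, primary, references, k):
    """Return the k genes with the smallest mean rank ratio across all datasets.

    Assumes the target gene is present in the primary dataset.
    """
    tables = [rank_ratios(d, target) for d in [primary, *references] if target in d]
    combined = {}
    for g in tables[0]:  # candidates must exist in the primary dataset
        ratios = [t[g] for t in tables if g in t]
        combined[g] = sum(ratios) / len(ratios)
    return sorted(combined, key=combined.get)[:k]


def impute(target, col, primary, neighbors):
    """Correlation-weighted average of the neighbors' values in column `col`.

    This is a KNN-style simplification, not the LLS regression itself.
    """
    num = den = 0.0
    for g in neighbors:
        value = primary[g][col]
        weight = max(pearson(primary[target], primary[g]), 0.0)
        if value is not None and weight > 0:
            num += weight * value
            den += weight
    return num / den if den else None


# Hypothetical toy data: rows are genes, columns are samples, None marks a missing value.
primary = {
    "g1": [1.0, 2.0, None, 4.0],
    "g2": [1.2, 1.9, 3.0, 4.3],
    "g3": [4.0, 3.0, 2.0, 1.0],
    "g4": [2.0, 3.0, 9.0, 5.0],
}
reference = {
    "g1": [0.5, 1.0, 1.5, 2.0, 2.5],
    "g2": [0.6, 1.1, 1.4, 2.1, 2.4],
    "g3": [1.0, 0.2, 1.9, 0.4, 1.2],
    "g4": [2.5, 2.0, 1.5, 1.0, 0.5],
}

print(consensus_neighbors("g1", primary, [], 1))           # primary dataset only
print(consensus_neighbors("g1", primary, [reference], 1))  # with one reference dataset
print(impute("g1", 2, primary, consensus_neighbors("g1", primary, [reference], 1)))
```

In this toy data the primary dataset has only three shared observations, so g4 looks like the best neighbor by chance; adding the reference dataset moves g2 to the top of the list. This mirrors the motivation in the abstract that few samples make neighbor selection unreliable.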

CONCLUSION

We demonstrated that order-statistics-based integrative imputation algorithms can achieve significant improvements over state-of-the-art missing value estimation approaches such as LLS, and that they are especially effective for microarray datasets with a limited number of samples, high rates of missing data, or very noisy measurements. With the rapid accumulation of microarray datasets, the performance of our approach can be further improved by incorporating larger and more appropriate reference datasets.

Similar articles

Integrative missing value estimation for microarray data.

BMC Bioinformatics. 2006 Oct 12;7:449. doi: 10.1186/1471-2105-7-449.

Improving missing value estimation in microarray data with gene ontology.

Bioinformatics. 2006 Mar 1;22(5):566-72. doi: 10.1093/bioinformatics/btk019. Epub 2005 Dec 23.

Iterated local least squares microarray missing value imputation.

J Bioinform Comput Biol. 2006 Oct;4(5):935-57. doi: 10.1142/s0219720006002302.

Robust imputation method for missing values in microarray data.

BMC Bioinformatics. 2007 May 3;8 Suppl 2(Suppl 2):S6. doi: 10.1186/1471-2105-8-S2-S6.

Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data.

Bioinformatics. 2005 May 15;21(10):2417-23. doi: 10.1093/bioinformatics/bti345. Epub 2005 Feb 24.

A hybrid imputation approach for microarray missing value estimation.

BMC Genomics. 2015;16 Suppl 9(Suppl 9):S1. doi: 10.1186/1471-2164-16-S9-S1. Epub 2015 Aug 17.

A meta-data based method for DNA microarray imputation.

BMC Bioinformatics. 2007 Mar 29;8:109. doi: 10.1186/1471-2105-8-109.

Missing value imputation for microarray gene expression data using histone acetylation information.

BMC Bioinformatics. 2008 May 29;9:252. doi: 10.1186/1471-2105-9-252.

Which missing value imputation method to use in expression profiles: a comparative study and two selection schemes.

BMC Bioinformatics. 2008 Jan 10;9:12. doi: 10.1186/1471-2105-9-12.

Missing value imputation improves clustering and interpretation of gene expression microarray data.

BMC Bioinformatics. 2008 Apr 18;9:202. doi: 10.1186/1471-2105-9-202.

Cited by

A comprehensive survey on computational learning methods for analysis of gene expression data.

Front Mol Biosci. 2022 Nov 7;9:907150. doi: 10.3389/fmolb.2022.907150. eCollection 2022.

An efficient ensemble method for missing value imputation in microarray gene expression data.

BMC Bioinformatics. 2021 Apr 13;22(1):188. doi: 10.1186/s12859-021-04109-4.

A flexible, interpretable, and accurate approach for imputing the expression of unmeasured genes.

Nucleic Acids Res. 2020 Dec 2;48(21):e125. doi: 10.1093/nar/gkaa881.

Imputation of Gene Expression Data in Blood Cancer and Its Significance in Inferring Biological Pathways.

Front Oncol. 2020 Jan 8;9:1442. doi: 10.3389/fonc.2019.01442. eCollection 2019.

Kalman Filtering for Genetic Regulatory Networks with Missing Values.

Comput Math Methods Med. 2017;2017:7837109. doi: 10.1155/2017/7837109. Epub 2017 Jul 26.

An integrative imputation method based on multi-omics datasets.

BMC Bioinformatics. 2016 Jun 21;17:247. doi: 10.1186/s12859-016-1122-6.

Missing value imputation for microarray data: a comprehensive comparison study and a web tool.

BMC Syst Biol. 2013;7 Suppl 6(Suppl 6):S12. doi: 10.1186/1752-0509-7-S6-S12. Epub 2013 Dec 13.

Combined statistical analyses of peptide intensities and peptide occurrences improves identification of significant peptides from MS-based proteomics data.

J Proteome Res. 2010 Nov 5;9(11):5748-56. doi: 10.1021/pr1005247. Epub 2010 Oct 8.

Impact of missing value imputation on classification for DNA microarray gene expression data--a model-based study.

EURASIP J Bioinform Syst Biol. 2009;2009(1):504069. doi: 10.1155/2009/504069. Epub 2010 Mar 2.

Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments.

BMC Genomics. 2010 Jan 7;11:15. doi: 10.1186/1471-2164-11-15.

References

Microarray technology: beyond transcript profiling and genotype analysis.

Nat Rev Genet. 2006 Mar;7(3):200-10. doi: 10.1038/nrg1809.

Missing value estimation for DNA microarray gene expression data by Support Vector Regression imputation and orthogonal coding scheme.

BMC Bioinformatics. 2006 Jan 22;7:32. doi: 10.1186/1471-2105-7-32.

Improving missing value estimation in microarray data with gene ontology.

Bioinformatics. 2006 Mar 1;22(5):566-72. doi: 10.1093/bioinformatics/btk019. Epub 2005 Dec 23.

DNA microarray data imputation and significance analysis of differential expression.

Bioinformatics. 2005 Nov 15;21(22):4155-61. doi: 10.1093/bioinformatics/bti638. Epub 2005 Aug 23.

Non-linear PCA: a missing data approach.

Bioinformatics. 2005 Oct 15;21(20):3887-95. doi: 10.1093/bioinformatics/bti634. Epub 2005 Aug 18.

Mining for regulatory programs in the cancer transcriptome.

Nat Genet. 2005 Jun;37(6):579-83. doi: 10.1038/ng1578.

Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data.

Bioinformatics. 2005 May 15;21(10):2417-23. doi: 10.1093/bioinformatics/bti345. Epub 2005 Feb 24.

Functional annotation and network reconstruction through cross-platform integration of microarray data.

Nat Biotechnol. 2005 Feb;23(2):238-43. doi: 10.1038/nbt1058. Epub 2005 Jan 16.

Missing value estimation for DNA microarray gene expression data: local least squares imputation.

Bioinformatics. 2005 Jan 15;21(2):187-98. doi: 10.1093/bioinformatics/bth499. Epub 2004 Aug 27.

Influence of microarrays experiments missing values on the stability of gene groups by hierarchical clustering.

BMC Bioinformatics. 2004 Aug 23;5:114. doi: 10.1186/1471-2105-5-114.
