Improving cluster-based missing value estimation of DNA microarray data.

Author information

Brás Lígia P, Menezes José C

Affiliation

Centre for Chemical & Biological Engineering, Department of Chemical and Biological Engineering, IST, Technical University of Lisbon, Av. Rovisco Pais, P-1049-001 Lisbon, Portugal.

Publication information

Biomol Eng. 2007 Jun;24(2):273-82. doi: 10.1016/j.bioeng.2007.04.003. Epub 2007 Apr 19.

DOI:10.1016/j.bioeng.2007.04.003

PMID:17493870

Abstract

We present a modification of the weighted K-nearest neighbours imputation method (KNNimpute) for missing values (MVs) estimation in microarray data based on the reuse of estimated data. The method was called iterative KNN imputation (IKNNimpute) as the estimation is performed iteratively using the recently estimated values. The estimation efficiency of IKNNimpute was assessed under different conditions (data type, fraction and structure of missing data) by the normalized root mean squared error (NRMSE) and the correlation coefficients between estimated and true values, and compared with that of other cluster-based estimation methods (KNNimpute and sequential KNN). We further investigated the influence of imputation on the detection of differentially expressed genes using SAM by examining the differentially expressed genes that are lost after MV estimation. The performance measures give consistent results, indicating that the iterative procedure of IKNNimpute can enhance the prediction ability of cluster-based methods in the presence of high missing rates, in non-time series experiments and in data sets comprising both time series and non-time series data, because the information of the genes having MVs is used more efficiently and the iterative procedure allows refining the MV estimates. More importantly, IKNN has a smaller detrimental effect on the detection of differentially expressed genes.
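
The abstract describes the procedure only in outline, so the following is a minimal sketch of an iterative weighted KNN imputation and of an NRMSE score, not the authors' implementation. Several details are assumptions made for illustration: missing values are first filled with row (gene) means, neighbours are ranked by Euclidean distance over the target gene's observed columns, weights are inverse distances, the number of iterations is fixed, and NRMSE is normalized by the variance of the true values. The function names `knn_impute_pass`, `iterative_knn_impute` and `nrmse` are hypothetical.

```python
import numpy as np


def knn_impute_pass(mask, filled, k):
    """Re-estimate every originally missing entry from the k nearest genes.

    Neighbours are taken from `filled`, so genes that themselves had
    missing values can contribute through their current estimates.
    """
    out = filled.copy()
    for i in np.where(mask.any(axis=1))[0]:
        obs = ~mask[i]                      # columns observed for gene i
        if not obs.any():
            continue                        # nothing to compare on; keep the initial fill
        # Distance over gene i's observed columns (assumed metric).
        d = np.sqrt(((filled[:, obs] - filled[i, obs]) ** 2).mean(axis=1))
        d[i] = np.inf                       # a gene is not its own neighbour
        nn = np.argsort(d)[:k]
        w = 1.0 / (d[nn] + 1e-12)           # inverse-distance weights (assumed)
        miss = mask[i]
        out[i, miss] = (w[:, None] * filled[nn][:, miss]).sum(axis=0) / w.sum()
    return out


def iterative_knn_impute(X, k=10, n_iter=5):
    """Fill NaNs in a genes-by-samples matrix, refining estimates over n_iter passes."""
    X = np.asarray(X, dtype=float)
    mask = np.isnan(X)
    # Initial fill with row means (assumed), falling back to the global mean.
    counts = (~mask).sum(axis=1)
    sums = np.where(mask, 0.0, X).sum(axis=1)
    global_mean = sums.sum() / max(counts.sum(), 1)
    row_means = np.where(counts > 0, sums / np.maximum(counts, 1), global_mean)
    filled = np.where(mask, row_means[:, None], X)
    for _ in range(n_iter):
        # Each pass reuses the estimates produced by the previous pass.
        filled = knn_impute_pass(mask, filled, k)
    return filled


def nrmse(estimated, true):
    """Root mean squared error normalized by the variance of the true values."""
    estimated, true = np.asarray(estimated), np.asarray(true)
    return np.sqrt(np.mean((estimated - true) ** 2) / np.var(true))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    true = rng.normal(size=(30, 1)) @ rng.normal(size=(1, 8))
    X = true.copy()
    miss = rng.random(X.shape) < 0.1        # hide about 10% of the entries
    X[miss] = np.nan
    est = iterative_knn_impute(X, k=5, n_iter=3)
    print("NRMSE on hidden entries:", nrmse(est[miss], true[miss]))
```

Setting `n_iter=1` in this sketch gives a single KNN pass over the mean-filled matrix, which offers a simple baseline against which the effect of further iterations can be checked on data with artificially hidden entries.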

Similar articles

The influence of missing value imputation on detection of differentially expressed genes from microarray data.

Bioinformatics. 2005 Dec 1;21(23):4272-9. doi: 10.1093/bioinformatics/bti708. Epub 2005 Oct 10.

Robust imputation method for missing values in microarray data.

BMC Bioinformatics. 2007 May 3;8 Suppl 2(Suppl 2):S6. doi: 10.1186/1471-2105-8-S2-S6.

Collateral missing value imputation: a new robust missing value estimation algorithm for microarray data.

Bioinformatics. 2005 May 15;21(10):2417-23. doi: 10.1093/bioinformatics/bti345. Epub 2005 Feb 24.

DNA microarray data imputation and significance analysis of differential expression.

Bioinformatics. 2005 Nov 15;21(22):4155-61. doi: 10.1093/bioinformatics/bti638. Epub 2005 Aug 23.

Towards clustering of incomplete microarray data without the use of imputation.

Bioinformatics. 2007 Jan 1;23(1):107-13. doi: 10.1093/bioinformatics/btl555. Epub 2006 Oct 31.

Ameliorative missing value imputation for robust biological knowledge inference.

J Biomed Inform. 2008 Aug;41(4):499-514. doi: 10.1016/j.jbi.2007.10.005. Epub 2007 Dec 31.

Improving missing value imputation of microarray data by using spot quality weights.

BMC Bioinformatics. 2006 Jun 16;7:306. doi: 10.1186/1471-2105-7-306.

Autoregressive-model-based missing value estimation for DNA microarray time series data.

IEEE Trans Inf Technol Biomed. 2009 Jan;13(1):131-7. doi: 10.1109/TITB.2008.2007421.

A meta-data based method for DNA microarray imputation.

BMC Bioinformatics. 2007 Mar 29;8:109. doi: 10.1186/1471-2105-8-109.

Cited by

Efficient technique of microarray missing data imputation using clustering and weighted nearest neighbour.

Sci Rep. 2021 Dec 21;11(1):24297. doi: 10.1038/s41598-021-03438-x.

An efficient ensemble method for missing value imputation in microarray gene expression data.

BMC Bioinformatics. 2021 Apr 13;22(1):188. doi: 10.1186/s12859-021-04109-4.

Identification of Key Genes and Pathways Associated With Irradiation in Breast Cancer Tissue and Breast Cancer Cell Lines.

Dose Response. 2020 Jun 18;18(2):1559325820931252. doi: 10.1177/1559325820931252. eCollection 2020 Apr-Jun.

Imputation of Gene Expression Data in Blood Cancer and Its Significance in Inferring Biological Pathways.

Front Oncol. 2020 Jan 8;9:1442. doi: 10.3389/fonc.2019.01442. eCollection 2019.

MVIAeval: a web tool for comprehensively evaluating the performance of a new missing value imputation algorithm.

BMC Bioinformatics. 2017 Jan 13;18(1):31. doi: 10.1186/s12859-016-1429-3.

An integrative imputation method based on multi-omics datasets.

BMC Bioinformatics. 2016 Jun 21;17:247. doi: 10.1186/s12859-016-1122-6.

Missing value imputation for microarray data: a comprehensive comparison study and a web tool.

BMC Syst Biol. 2013;7 Suppl 6(Suppl 6):S12. doi: 10.1186/1752-0509-7-S6-S12. Epub 2013 Dec 13.

Transcriptomic analyses during the transition from biomass production to lipid accumulation in the oleaginous yeast Yarrowia lipolytica.

PLoS One. 2011;6(11):e27966. doi: 10.1371/journal.pone.0027966. Epub 2011 Nov 22.

Comparative analysis of missing value imputation methods to improve clustering and interpretation of microarray experiments.

BMC Genomics. 2010 Jan 7;11:15. doi: 10.1186/1471-2164-11-15.

How to improve postgenomic knowledge discovery using imputation.

EURASIP J Bioinform Syst Biol. 2009;2009(1):717136. doi: 10.1155/2009/717136. Epub 2009 Feb 8.
