Research Group Systems Biology/Bioinformatics, Hans-Knöll-Institute, Jena, Germany.
Proteomics. 2010 Mar;10(6):1202-11. doi: 10.1002/pmic.200800576.
Gel-based proteomics is a widely applied technique for measuring protein abundances in various biological systems. Comparing two or more biological groups involves matching 2-D gels. Depending on the software, this can leave spots with missing values on several gels. Most studies ignore this fact or replace all missing data with zero. Over the past few years, scientists have realized that this is not the optimal way of analyzing their data, and several studies have been published presenting methods for imputing missing proteomics data. Most of these methods had already been applied to microarray data, a field in which the phenomenon of missing data is also well known. With this review, we intend to further raise awareness of the problem of missing values in gel-based proteomics. We summarize reasons for missing values and explore their distribution in data sets. We also provide a comparison and evaluation of the imputation methods proposed so far for gel-based proteomics data.
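To illustrate why replacing missing values with zero is problematic, the following minimal sketch compares zero substitution with a simple row-mean imputation on a small spot-by-gel matrix. The data, the function names `impute_zero` and `impute_row_mean`, and the choice of row-mean imputation are assumptions made purely for illustration; they are not taken from the review and do not represent any specific method it evaluates.

```python
import math
from statistics import mean

# Hypothetical spot volumes: one row per spot, one column per gel.
# math.nan marks a spot that could not be matched on that gel.
volumes = [
    [10.0, 12.0, math.nan, 11.0],
    [5.0, math.nan, math.nan, 7.0],
    [20.0, 21.0, 19.0, 20.0],
]


def impute_zero(row):
    """Replace every missing value in a row with zero."""
    return [0.0 if math.isnan(v) else v for v in row]


def impute_row_mean(row):
    """Replace every missing value with the mean of the observed values in that row."""
    observed = [v for v in row if not math.isnan(v)]
    if not observed:
        # No observed value to estimate from; leave the row unchanged.
        return list(row)
    fill = mean(observed)
    return [fill if math.isnan(v) else v for v in row]


zero_filled = [impute_zero(row) for row in volumes]
mean_filled = [impute_row_mean(row) for row in volumes]

# Per-spot mean abundance under each strategy.
zero_means = [mean(row) for row in zero_filled]
row_means = [mean(row) for row in mean_filled]

for i, (z, m) in enumerate(zip(zero_means, row_means)):
    print(f"spot {i}: zero-substituted mean = {z:.2f}, row-mean imputed mean = {m:.2f}")
```

In this toy example, zero substitution pulls the mean abundance of every spot with missing values downward, while the fully observed spot is unaffected. Row-mean imputation preserves the mean of the observed values but artificially reduces the variance, which is one reason more elaborate imputation methods exist.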