Two-group comparisons of zero-inflated intensity values: the choice of test statistic matters.

Authors

Gleiss Andreas, Dakna Mohammed, Mischak Harald, Heinze Georg

Affiliations

Center for Medical Statistics, Informatics and Intelligent Systems, Medical University Vienna, Vienna, Austria.

Mosaiques Diagnostics and Therapeutics AG, Hannover, Germany.

Publication information

Bioinformatics. 2015 Jul 15;31(14):2310-7. doi: 10.1093/bioinformatics/btv154. Epub 2015 Mar 18.

DOI:10.1093/bioinformatics/btv154

PMID:25788623

Abstract

MOTIVATION

A special characteristic of data from molecular biology is the frequent occurrence of zero intensity values which can arise either by true absence of a compound or by a signal that is below a technical limit of detection.

RESULTS

While so-called two-part tests compare mixture distributions between groups, one-part tests treat the zero-inflated distributions as left-censored. The left-inflated mixture model combines these two approaches. Both types of distributional assumptions and combinations of both are considered in a simulation study to compare power and estimation of log fold change. We discuss issues of application using an example from peptidomics. The considered tests generally perform best in scenarios satisfying their respective distributional assumptions. In the absence of distributional assumptions, the two-part Wilcoxon test or the empirical likelihood ratio test is recommended. Assuming a log-normal subdistribution, the left-inflated mixture model provides estimates for the proportions of the two considered types of zero intensities.

AVAILABILITY

R code is available at http://cemsiis.meduniwien.ac.at/en/kb/science-research/software/
