Gleiss Andreas, Dakna Mohammed, Mischak Harald, Heinze Georg
Center for Medical Statistics, Informatics and Intelligent Systems, Medical University of Vienna, Vienna, Austria.
Mosaiques Diagnostics and Therapeutics AG, Hannover, Germany.
Bioinformatics. 2015 Jul 15;31(14):2310-7. doi: 10.1093/bioinformatics/btv154. Epub 2015 Mar 18.
A special characteristic of data from molecular biology is the frequent occurrence of zero intensity values, which can arise either from the true absence of a compound or from a signal that is below a technical limit of detection.
While so-called two-part tests compare mixture distributions between groups, one-part tests treat the zero-inflated distributions as left-censored. The left-inflated mixture model combines these two approaches. Both types of distributional assumptions and combinations of both are considered in a simulation study to compare power and estimation of log fold change. We discuss issues of application using an example from peptidomics.
The considered tests generally perform best in scenarios satisfying their respective distributional assumptions. In the absence of distributional assumptions, the two-part Wilcoxon test or the empirical likelihood ratio test is recommended. Assuming a log-normal subdistribution, the left-inflated mixture model provides estimates for the proportions of the two considered types of zero intensities.
R code is available at http://cemsiis.meduniwien.ac.at/en/kb/science-research/software/
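As an illustration of the two-part idea mentioned above, the following is a minimal Python sketch of a two-part Wilcoxon statistic. It is not the authors' R code. It assumes the common construction in which a squared z-statistic for the difference in proportions of non-zero values is added to a squared Wilcoxon rank-sum z-statistic (normal approximation with tie correction, no continuity correction) computed on the non-zero values only, with the sum referred to a chi-square distribution. The function names `midranks` and `two_part_wilcoxon` are hypothetical, and both groups are assumed to be non-empty.

```python
import math
from collections import Counter


def midranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def two_part_wilcoxon(x, y):
    """Sketch of a two-part test for zero-inflated intensities.

    Returns (statistic, degrees of freedom, p-value). A part that carries
    no information (e.g. no zeros at all) is dropped and df is reduced.
    """
    n1, n2 = len(x), len(y)
    x_pos = [v for v in x if v > 0]
    y_pos = [v for v in y if v > 0]
    m1, m2 = len(x_pos), len(y_pos)
    stat, df = 0.0, 0

    # Part 1: pooled z-test on the proportions of non-zero values.
    p_pool = (m1 + m2) / (n1 + n2)
    var_b = p_pool * (1 - p_pool) * (1 / n1 + 1 / n2)
    if var_b > 0:
        z_b = (m1 / n1 - m2 / n2) / math.sqrt(var_b)
        stat += z_b ** 2
        df += 1

    # Part 2: Wilcoxon rank-sum z on the non-zero values only.
    if m1 > 0 and m2 > 0:
        pooled = x_pos + y_pos
        m = m1 + m2
        w = sum(midranks(pooled)[:m1])  # rank sum of group x
        ties = sum(t ** 3 - t for t in Counter(pooled).values())
        var_w = m1 * m2 / 12 * ((m + 1) - ties / (m * (m - 1)))
        if var_w > 0:
            z_w = (w - m1 * (m + 1) / 2) / math.sqrt(var_w)
            stat += z_w ** 2
            df += 1

    # Chi-square upper tail in closed form for df = 2 and df = 1.
    if df == 2:
        p = math.exp(-stat / 2)
    elif df == 1:
        p = math.erfc(math.sqrt(stat / 2))
    else:
        p = 1.0
    return stat, df, p
```

A call such as `two_part_wilcoxon([0, 0, 0, 1.2, 2.3], [0, 3.1, 4.2, 5.0, 6.1])` returns the combined statistic, the degrees of freedom used, and the asymptotic p-value. With samples this small the chi-square approximation is rough, so a permutation version may be preferable in practice.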