Zhang Fang, Shan Ang, Luan Yihui
School of Mathematics, Shandong University, Jinan, 250100, P.R. China.
Stat Appl Genet Mol Biol. 2018 Nov 17;17(6):/j/sagmb.2018.17.issue-6/sagmb-2018-0019/sagmb-2018-0019.xml. doi: 10.1515/sagmb-2018-0019.
In recent years, a large amount of time series microbial community data has been produced in molecular biological studies, especially in metagenomics. Among the statistical methods for time series, local similarity analysis is used in a wide range of environments to capture potential local and time-shifted associations that cannot be distinguished by traditional correlation analysis. Initially, the permutation test was widely applied to assess the statistical significance of local similarity analysis. More recently, a theoretical method has also been developed for this purpose. However, all these methods require the assumption that the time series are independent and identically distributed. In this paper, we propose a new approach based on the moving block bootstrap to approximate the statistical significance of local similarity scores for dependent time series. Simulations show that our method controls the type I error rate reasonably well, while the theoretical approximation and the permutation test perform less well. Finally, our method is applied to human and marine microbial community datasets, indicating that it can identify potential relationships among operational taxonomic units (OTUs) and significantly decrease the rate of false positives.
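The idea of using a moving block bootstrap to assess a local similarity score can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes one common formulation of the local similarity score (the largest positive or negative running sum of products along each time-shifted alignment, divided by the series length), plain z-scoring in place of any rank-based normalization, and a null distribution built by resampling overlapping blocks of one series while holding the other fixed. The function names, `block_len`, `max_delay`, and `n_boot` are all hypothetical choices made for illustration.

```python
import numpy as np


def zscore(x):
    """Center and scale a series (a simplification; an assumption of this sketch)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std()


def local_similarity(x, y, max_delay):
    """Largest local alignment score over all shifts |d| <= max_delay, divided by n.

    For each shift, a running sum of products is reset to zero whenever it
    turns negative; positive and negative associations are both scanned.
    """
    n = len(x)
    best = 0.0
    for sign in (1.0, -1.0):
        for d in range(-max_delay, max_delay + 1):
            # Pair x[i] with y[i + d] wherever both indices are valid.
            prod = x[: n - d] * y[d:] if d >= 0 else x[-d:] * y[: n + d]
            run = 0.0
            for p in sign * prod:
                run = max(0.0, run + p)
                best = max(best, run)
    return best / n


def moving_block_resample(x, block_len, rng):
    """Concatenate randomly chosen overlapping blocks, then trim to len(x)."""
    n = len(x)
    n_blocks = -(-n // block_len)  # ceiling division
    starts = rng.integers(0, n - block_len + 1, size=n_blocks)
    idx = (starts[:, None] + np.arange(block_len)).ravel()[:n]
    return x[idx]


def mbb_pvalue(x, y, max_delay=3, block_len=5, n_boot=500, seed=0):
    """Approximate p-value for the observed score under a block-resampled null.

    Resampling blocks of x keeps its short-range autocorrelation but breaks
    its alignment with y (an assumed null scheme, not taken from the paper).
    """
    rng = np.random.default_rng(seed)
    x, y = zscore(x), zscore(y)
    observed = local_similarity(x, y, max_delay)
    null = np.array([
        local_similarity(moving_block_resample(x, block_len, rng), y, max_delay)
        for _ in range(n_boot)
    ])
    p_value = (1 + np.sum(null >= observed)) / (n_boot + 1)
    return observed, p_value
```

Calling `mbb_pvalue(x, y)` on two equal-length abundance series returns the observed score and its bootstrap p-value. In practice the block length would need to be matched to the strength of the autocorrelation; the default of 5 here is arbitrary.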