Brief Bioinform. 2014 Nov;15(6):919-28. doi: 10.1093/bib/bbt053. Epub 2013 Aug 16.
Integrative analyses of genomic, epigenomic and transcriptomic features in humans and various model organisms have revealed that many such features are nonrandomly distributed in the genome. Significant enrichment (or depletion) of genomic features is expected to be biologically important. Detecting genomic regions enriched for certain features, and estimating the corresponding statistical significance, relies on an expected null distribution generated by a permutation model. We discuss different genome-wide permutation approaches, present examples where the permutation strategy affects the null model, and show that confidence in the estimated statistical significance of genome-wide enrichment can depend on the choice of permutation approach. In cases where the biologically relevant constraints are unclear, it is preferable to examine whether key conclusions remain consistent irrespective of the randomization strategy.
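To make the dependence on the permutation strategy concrete, the following is a minimal sketch, not the authors' method. It scores the same toy data set against two null models: one that redistributes features across the whole genome, and one that keeps each feature on its original chromosome. The genome, regions, features and all function names (`CHROM_SIZES`, `REGIONS`, `FEATURES`, `shuffle_genome_wide`, `shuffle_within_chrom`, `empirical_p`) are hypothetical and chosen only for illustration.

```python
import random

# Hypothetical toy genome: chromosome name -> length in bp (assumption).
CHROM_SIZES = {"chrA": 1_000_000, "chrB": 1_000_000}

# Hypothetical target regions as half-open (start, end) intervals.
# They cover 30% of chrA but only 5% of chrB.
REGIONS = {
    "chrA": [(100_000, 200_000), (400_000, 500_000), (700_000, 800_000)],
    "chrB": [(450_000, 500_000)],
}

# Hypothetical point features, all on chrA: 14 inside a region, 26 outside.
FEATURES = [("chrA", 100_000 + 7_000 * i) for i in range(14)] + [
    ("chrA", 200_000 + 7_000 * i) for i in range(26)
]


def count_hits(points, regions):
    """Count point features that fall inside any target region."""
    return sum(
        any(start <= pos < end for start, end in regions.get(chrom, []))
        for chrom, pos in points
    )


def shuffle_genome_wide(points, rng):
    """Null model 1: each feature may land anywhere in the genome,
    with chromosomes chosen in proportion to their length."""
    chroms = list(CHROM_SIZES)
    weights = [CHROM_SIZES[c] for c in chroms]
    picked = rng.choices(chroms, weights=weights, k=len(points))
    return [(c, rng.randrange(CHROM_SIZES[c])) for c in picked]


def shuffle_within_chrom(points, rng):
    """Null model 2: each feature stays on its chromosome;
    only its position on that chromosome is randomized."""
    return [(c, rng.randrange(CHROM_SIZES[c])) for c, _ in points]


def empirical_p(points, regions, shuffler, n_perm=2000, seed=0):
    """One-sided empirical p-value for enrichment under a given null model.

    Uses the (k + 1) / (n + 1) estimator so the p-value is never zero.
    """
    rng = random.Random(seed)
    observed = count_hits(points, regions)
    as_extreme = sum(
        count_hits(shuffler(points, rng), regions) >= observed
        for _ in range(n_perm)
    )
    return observed, (as_extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    for name, shuffler in [
        ("genome-wide", shuffle_genome_wide),
        ("within-chromosome", shuffle_within_chrom),
    ]:
        observed, p = empirical_p(FEATURES, REGIONS, shuffler)
        print(f"{name}: observed hits = {observed}, empirical p = {p:.4f}")
```

In this toy setup the observed overlap is the same under both strategies (14 of 40 features), but the genome-wide null expects far fewer hits than the within-chromosome null, because it ignores that every feature sits on the region-rich chromosome. The same observation can therefore look strongly enriched under one null model and unremarkable under the other, which is the kind of sensitivity that argues for checking conclusions against more than one randomization strategy.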