Graduate Program in Quantitative and Computational Biosciences, Baylor College of Medicine, Houston, TX, 77030, USA.
Jan and Dan Duncan Neurological Research Institute at Texas Children's Hospital, Houston, TX, 77030, USA.
Nat Commun. 2018 Aug 13;9(1):3225. doi: 10.1038/s41467-018-05627-1.
Recent studies have suggested that genes longer than 100 kb are more likely to be misregulated in neurological diseases associated with synaptic dysfunction, such as autism and Rett syndrome. These length-dependent transcriptional changes are modest in MeCP2-mutant samples, but, given the low sensitivity of high-throughput transcriptome profiling technology, here we re-evaluate the statistical significance of these results. We find that the apparent length-dependent trends previously observed in MeCP2 microarray and RNA-sequencing datasets disappear after estimating baseline variability from randomized control samples. This is particularly true for genes with low fold changes. We find no bias with NanoString technology, so this long gene bias seems to be particular to polymerase chain reaction amplification-based platforms. In contrast, authentic long gene effects, such as those caused by topoisomerase inhibition, can be detected even after adjustment for baseline variability. We conclude that accurate characterization of length-dependent (or other) trends requires establishing a baseline from randomized control samples.
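The baseline idea described above can be sketched in code. The abstract does not give the procedure in detail, so the following is only an illustrative sketch under stated assumptions: control samples are repeatedly split at random into two pseudo-groups, a log2 fold change is computed between them, and the mean fold change is tracked in sliding windows of genes sorted by length. The function names, the window and step sizes, the number of random splits, and the percentile envelope are all hypothetical choices made for illustration, not parameters taken from the study.

```python
import numpy as np

def running_mean_lfc_by_length(lengths, lfc, window=200, step=40):
    """Mean log2 fold change in sliding windows of genes sorted by length.

    window and step are illustrative defaults, not values from the paper.
    """
    order = np.argsort(lengths)
    sorted_lfc = np.asarray(lfc)[order]
    starts = range(0, len(sorted_lfc) - window + 1, step)
    return np.array([sorted_lfc[s:s + window].mean() for s in starts])

def control_baseline(control_expr, lengths, n_perm=100, window=200, step=40, seed=0):
    """Running-mean curves obtained from random splits of control samples only.

    control_expr: genes x samples array of log2 expression (controls only).
    Returns an (n_perm, n_windows) array; each row is one control-vs-control curve.
    """
    rng = np.random.default_rng(seed)
    n_samples = control_expr.shape[1]
    half = n_samples // 2
    curves = []
    for _ in range(n_perm):
        perm = rng.permutation(n_samples)
        group_a, group_b = perm[:half], perm[half:2 * half]
        # Fold change between two random halves of the controls: any trend
        # with gene length seen here reflects baseline variability only.
        lfc = control_expr[:, group_a].mean(axis=1) - control_expr[:, group_b].mean(axis=1)
        curves.append(running_mean_lfc_by_length(lengths, lfc, window, step))
    return np.vstack(curves)

def exceeds_baseline(observed_curve, baseline_curves, lo=2.5, hi=97.5):
    """Boolean mask of windows where the observed curve leaves the baseline envelope."""
    lower, upper = np.percentile(baseline_curves, [lo, hi], axis=0)
    return (observed_curve < lower) | (observed_curve > upper)
```

In use, one would compute `running_mean_lfc_by_length` on the mutant-versus-control fold changes and pass it, together with the output of `control_baseline`, to `exceeds_baseline`. Under this sketch, a length-dependent trend would be treated as credible only in windows where the observed curve falls outside the envelope produced by control-versus-control comparisons.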