Department of Molecular Genetics, University of Toronto, Toronto, Canada.
PLoS Genet. 2012;8(3):e1002600. doi: 10.1371/journal.pgen.1002600. Epub 2012 Mar 29.
In contrast to existing estimates of approximately 200 murine imprinted genes, recent work based on transcriptome sequencing uncovered parent-of-origin allelic effects at more than 1,300 loci in the developing brain and two adult brain regions, including hundreds present only in males or only in females. Our independent replication of the embryonic brain stage, where the majority of novel imprinted genes were discovered and the majority of previously known imprinted genes confirmed, resulted in only 12.9% concordance among the novel imprinted loci. Further analysis and pyrosequencing-based validation revealed that the vast majority of the novel reported imprinted loci are false positives explained by technical and biological variation of the experimental approach. We show that allele-specific expression (ASE) measured with RNA-Seq is not accurately modeled with statistical methods that assume random independent sampling and that systematic error must be accounted for to enable accurate identification of imprinted expression. Application of a robust approach that accounts for these effects revealed 50 candidate genes where allelic bias was predicted to be parent-of-origin-dependent. However, 11 independent validation attempts across a range of allelic expression biases confirmed only 6 of these novel cases. The results emphasize the importance of independent validation and suggest that the number of imprinted genes is much closer to the initial estimates.
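The point about random independent sampling can be illustrated with a small simulation. The sketch below is not the authors' method; it is a minimal illustration under stated assumptions: every simulated gene has no true allelic bias, read depth is fixed at 200, and extra technical or biological variation is modeled (hypothetically) by drawing each gene's allelic fraction from a symmetric Beta(50, 50) distribution. The function names and all parameter values are invented for this example. An exact binomial test, which assumes independent sampling around a fraction of 0.5, is then applied to both the idealized and the overdispersed counts.

```python
import math
import random
from functools import lru_cache


@lru_cache(maxsize=None)
def binom_two_sided_p(k, n):
    """Exact two-sided binomial p-value for H0: allelic fraction = 0.5.

    The null distribution is symmetric, so the two-sided p-value is twice
    the smaller tail, capped at 1.
    """
    lo = min(k, n - k)
    tail = sum(math.comb(n, i) for i in range(lo + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def simulate_fpr(n_genes, depth, alpha, beta_shape, rng):
    """Fraction of unbiased genes called significant by the binomial test.

    beta_shape=None : reads are sampled independently with fraction 0.5.
    beta_shape=a    : each gene's fraction is drawn from Beta(a, a) first,
                      an assumed stand-in for systematic variation.
    """
    hits = 0
    for _ in range(n_genes):
        if beta_shape is None:
            p = 0.5
        else:
            p = rng.betavariate(beta_shape, beta_shape)
        # Number of reads assigned to one parental allele.
        k = sum(rng.random() < p for _ in range(depth))
        if binom_two_sided_p(k, depth) < alpha:
            hits += 1
    return hits / n_genes


rng = random.Random(0)  # fixed seed so the run is repeatable
fpr_independent = simulate_fpr(2000, 200, 0.05, None, rng)
fpr_overdispersed = simulate_fpr(2000, 200, 0.05, 50.0, rng)

print(f"false-positive rate, independent sampling: {fpr_independent:.3f}")
print(f"false-positive rate, overdispersed counts: {fpr_overdispersed:.3f}")
```

Under these assumptions the binomial test stays near its nominal 5% error rate only when reads really are sampled independently; once gene-level variation is added, many unbiased genes are flagged. A model with an explicit overdispersion term (for example a beta-binomial) or empirically estimated error from replicates would be one way to account for this, though the abstract does not specify which approach was used.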