Luedi Philippe P, Hartemink Alexander J, Jirtle Randy L
Center for Bioinformatics and Computational Biology, Duke University, Durham, North Carolina 27708, USA.
Genome Res. 2005 Jun;15(6):875-84. doi: 10.1101/gr.3303505.
Imprinted genes are epigenetically modified genes whose expression is determined according to their parent of origin. They are involved in embryonic development, and imprinting dysregulation is linked to cancer, obesity, diabetes, and behavioral disorders such as autism and bipolar disorder. Herein, we train a statistical model based on DNA sequence characteristics that not only identifies potentially imprinted genes, but also predicts the parental allele from which they are expressed. Of 23,788 annotated autosomal mouse genes, our model identifies 600 (2.5%) as potentially imprinted, 64% of which are predicted to exhibit maternal expression. These predictions allowed for the identification of putative candidate genes for complex conditions in which parent-of-origin effects are involved, including Alzheimer disease, autism, bipolar disorder, diabetes, male sexual orientation, obesity, and schizophrenia. We observe that the number, type, and relative orientation of repeated elements flanking a gene are particularly important in predicting whether a gene is imprinted.