Reymond A, Friedli M, Henrichsen CN, Chapot F, Deutsch S, Ucla C, Rossier C, Lyle R, Guipponi M, Antonarakis SE
Division of Medical Genetics, University of Geneva Medical School, Geneva, 1211, Switzerland.
Genomics. 2001 Nov;78(1-2):46-54. doi: 10.1006/geno.2001.6640.
A supernumerary copy of human chromosome 21 (HC21) causes Down syndrome. To understand the molecular pathogenesis of Down syndrome, it is necessary to identify all HC21 genes. The first annotation of the 21q sequence confirmed 127 genes and predicted an additional 98 previously unknown "anonymous" genes (predictions (PREDs) and open reading frames (C21orfs)), which were inferred from exon prediction programs and/or spliced expressed sequence tags. These putative gene models still need to be confirmed as bona fide transcripts. Here we report the characterization and expression pattern of the putative transcripts C21orf7, C21orf11, C21orf15, C21orf18, C21orf19, C21orf22, C21orf42, C21orf50, C21orf51, C21orf57, and C21orf58, the GC-rich sequence DNA-binding factor candidate GCFC (also known as C21orf66), PRED12, PRED31, PRED34, PRED44, PRED54, and PRED56. Our analysis showed that most of the C21orfs originally defined by matching spliced expressed sequence tags were correctly predicted, whereas many of the PREDs, defined solely by computer prediction, do not correspond to genuine genes. Four of the six PREDs were incorrectly predicted: PRED44 and C21orf11 are portions of the same transcript, PRED31 is a pseudogene, and PRED54 and PRED56 were wrongly predicted. In contrast, PRED12 (now called C21orf68) and PRED34 (now called C21orf63) are confirmed transcripts. We identified three new genes, C21orf67, C21orf69, and C21orf70, that had not previously been predicted by any program. This revision of the HC21 transcriptome has implications for the entire genome with respect to the quality of previous annotations and the total number of transcripts. It also provides new candidates for genes involved in Down syndrome and other genetic disorders that map to HC21.