Torrents David, Suyama Mikita, Zdobnov Evgeny, Bork Peer
EMBL, Heidelberg 69117, Germany.
Genome Res. 2003 Dec;13(12):2559-67. doi: 10.1101/gr.1455503.
We screened all intergenic regions in the human genome to identify pseudogenes with a combination of homology searches and a functionality test using the ratio of replacement (nonsynonymous) to silent (synonymous) nucleotide substitutions (KA/KS). We identified 19,724 regions, of which 95% ± 3% are estimated to evolve neutrally and thus are likely to be pseudogenes. Half of these have no detectable truncation in their pseudocoding regions and therefore cannot be identified by methods that require truncations as proof of nonfunctionality. A comparative analysis with the mouse genome showed that 70% of these pseudogenes have a retrotranspositional origin (processed), and the rest arose by segmental duplication (nonprocessed). Although the spread of both types of pseudogenes correlates with chromosome size, nonprocessed pseudogenes appear to be enriched in regions with high gene density. It is likely that the human pseudogenes identified here represent only a small fraction of the total, which probably exceeds the number of genes.
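The functionality test and the headline estimate can be illustrated with a minimal Python sketch. It assumes KA and KS have already been estimated for a candidate region against its closest functional homolog. The names `SubstitutionRates`, `classify`, and `estimated_neutral_range`, and the cutoff of 0.5, are hypothetical and chosen for illustration only; the abstract does not state the statistical procedure or threshold the authors used.

```python
from dataclasses import dataclass


@dataclass
class SubstitutionRates:
    ka: float  # replacement (nonsynonymous) substitutions per replacement site
    ks: float  # silent (synonymous) substitutions per silent site


def ka_ks_ratio(rates: SubstitutionRates) -> float:
    """Return KA/KS; a value near 1 is what neutral evolution predicts."""
    if rates.ks <= 0:
        raise ValueError("KS must be positive to form the ratio")
    return rates.ka / rates.ks


def classify(rates: SubstitutionRates, neutral_cutoff: float = 0.5) -> str:
    """Label a region by its KA/KS ratio.

    neutral_cutoff is an illustrative assumption, not a value from the paper.
    """
    if ka_ks_ratio(rates) >= neutral_cutoff:
        return "neutral"      # consistent with a pseudogene
    return "constrained"      # purifying selection, possibly functional


def estimated_neutral_range(total: int, fraction: float, margin: float):
    """Turn 'fraction +/- margin of total' into rounded (low, mid, high) counts."""
    return tuple(round(total * f) for f in (fraction - margin, fraction, fraction + margin))


if __name__ == "__main__":
    print(classify(SubstitutionRates(ka=0.09, ks=0.10)))
    print(classify(SubstitutionRates(ka=0.01, ks=0.10)))
    print(estimated_neutral_range(19724, 0.95, 0.03))
```

Applied to the figures in the abstract, the last call converts 95% ± 3% of 19,724 regions into a count of roughly 18,700 neutrally evolving regions, with the bounds given by the 92% and 98% fractions. A single fixed cutoff is used here only for brevity; a real analysis would more plausibly test whether KA/KS differs significantly from 1.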