Kaessmann Henrik, Zöllner Sebastian, Nekrutenko Anton, Li Wen-Hsiung
Department of Ecology and Evolution, University of Chicago, Chicago, Illinois 60637, USA.
Genome Res. 2002 Nov;12(11):1642-50. doi: 10.1101/gr.520702.
To elucidate the role of exon shuffling in shaping the complexity of the human genome/proteome, we have systematically analyzed intron phase distributions in the coding sequence of human protein domains. We found that introns at the boundaries of domains show a high excess of symmetrical phase combinations (i.e., 0-0, 1-1, and 2-2), whereas nonboundary introns show no excess symmetry. This suggests that exon shuffling has primarily involved rearrangement of structural and functional domains as a whole. Furthermore, we found that domains flanked by phase 1 introns have dramatically expanded in the human genome due to domain shuffling and that 1-1 symmetrical domains and domain families are nonrandomly distributed with respect to their age. The predominance and extracellular location of 1-1 symmetrical domains among domains specific to metazoans suggest that they are associated with the rise of multicellularity. On the other hand, 0-0 symmetrical domains tend to be over-represented among ancient protein domains that are shared between the eukaryotic and prokaryotic kingdoms, which is compatible with the suggestion of primordial domain shuffling in the progenote. To see whether the human data reflect general genomic patterns of metazoans, similar analyses were done for the nematode Caenorhabditis elegans. Although the C. elegans data generally concur with the human patterns, we identified fewer intron-bounded domains in this organism, consistent with the lower complexity of C. elegans genes. [The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: Z. Gu and R. Stevens.]
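The notion of a symmetrical phase combination can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes the standard definition of intron phase (the number of nucleotides of a codon that lie upstream of the intron, i.e. the coding-sequence offset modulo 3). The function names, domain names, and coordinates are hypothetical and chosen only for illustration.

```python
from collections import Counter

def intron_phase(cds_offset: int) -> int:
    """Phase of an intron that follows `cds_offset` coding nucleotides.

    0: intron lies between two codons
    1: intron lies after the first nucleotide of a codon
    2: intron lies after the second nucleotide of a codon
    """
    return cds_offset % 3

def phase_combination(upstream_offset: int, downstream_offset: int) -> tuple:
    """Phases of the introns flanking a domain (5' boundary, 3' boundary)."""
    return (intron_phase(upstream_offset), intron_phase(downstream_offset))

def is_symmetrical(combo: tuple) -> bool:
    """True for 0-0, 1-1, and 2-2 combinations."""
    return combo[0] == combo[1]

# Hypothetical domains: (CDS offset of 5' intron, CDS offset of 3' intron).
domains = {
    "domain_A": (301, 580),  # phases 1-1
    "domain_B": (300, 600),  # phases 0-0
    "domain_C": (302, 451),  # phases 2-1 (asymmetrical)
}

counts = Counter(phase_combination(*bounds) for bounds in domains.values())

for name, (start, end) in domains.items():
    combo = phase_combination(start, end)
    # A symmetrical domain spans a multiple of 3 nucleotides, so inserting it
    # into an intron of the same phase leaves the reading frame intact.
    frame_preserving = (end - start) % 3 == 0
    print(name, f"{combo[0]}-{combo[1]}", is_symmetrical(combo), frame_preserving)
```

In this sketch, symmetry and frame preservation always coincide: the segment between two introns of equal phase has a length divisible by 3. This is why shuffled domains are expected to be flanked by introns of the same phase.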