Nguyen Duc-Quang, Webber Caleb, Ponting Chris P
MRC Functional Genetics Unit, Department of Human Anatomy and Genetics, University of Oxford, Oxford, United Kingdom.
PLoS Genet. 2006 Feb;2(2):e20. doi: 10.1371/journal.pgen.0020020. Epub 2006 Feb 17.
Although large-scale copy-number variation is an important contributor to conspecific genomic diversity, whether these variants frequently contribute to human phenotype differences remains unknown. If they have few functional consequences, then copy-number variants (CNVs) might be expected both to be distributed uniformly throughout the human genome and to encode genes that are characteristic of the genome as a whole. We find that human CNVs are significantly overrepresented close to telomeres and centromeres and in simple tandem repeat sequences. Additionally, human CNVs were observed to be unusually enriched in those protein-coding genes that have experienced significantly elevated synonymous and nonsynonymous nucleotide substitution rates, estimated between single human and mouse orthologues. CNV genes encode disproportionately large numbers of secreted, olfactory, and immunity proteins, although they contain fewer than expected genes associated with Mendelian disease. Despite mouse CNVs also exhibiting a significant elevation in synonymous substitution rates, in most other respects they do not differ significantly from the genomic background. Nevertheless, they encode proteins that are depleted in olfactory function, and they exhibit significantly decreased amino acid sequence divergence. Natural selection appears to have acted discriminately among human CNV genes. The significant overabundance, within human CNVs, of genes associated with olfaction, immunity, protein secretion, and elevated coding sequence divergence, indicates that a subset may have been retained in the human population due to the adaptive benefit of increased gene dosage. By contrast, the functional characteristics of mouse CNVs either suggest that advantageous gene copies have been depleted during recent selective breeding of laboratory mouse strains or suggest that they were preferentially fixed as a consequence of the larger effective population size of wild mice. It thus appears that CNV differences among mouse strains do not provide an appropriate model for large-scale sequence variations in the human population.