Chiang Charleston W K, Derti Adnan, Schwartz Daniel, Chou Michael F, Hirschhorn Joel N, Wu C-Ting
Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, Massachusetts 02115, USA.
Genetics. 2008 Dec;180(4):2277-93. doi: 10.1534/genetics.108.096537. Epub 2008 Oct 28.
Ultraconserved elements (UCEs) are sequences that are identical between reference genomes of distantly related species. As they are under negative selection and enriched near or in specific classes of genes, one explanation for their ultraconservation may be their involvement in important functions. Indeed, many UCEs can drive tissue-specific gene expression. We have demonstrated that nonexonic UCEs are depleted among segmental duplications (SDs) and copy number variants (CNVs) and proposed that their ultraconservation may reflect a mechanism of copy counting via comparison. Here, we report that nonexonic UCEs are also depleted among 10 of 11 recent genomewide data sets of human CNVs, including 3 obtained with strategies permitting greater precision in determining the extents of CNVs. We further present observations suggesting that nonexonic UCEs per se may contribute to this depletion and that their apparent dosage sensitivity was in effect when they became fixed in the last common ancestor of mammals, birds, and reptiles, consistent with dosage sensitivity contributing to ultraconservation. Finally, in searching for the mechanism(s) underlying the function of nonexonic UCEs, we have found that they are enriched in TAATTA, which is also the recognition sequence for the homeodomain DNA-binding module, and bounded by a change in A + T frequency.
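The two sequence features named in the last sentence, TAATTA content and A + T frequency, can be illustrated with a minimal Python sketch. The abstract does not describe the authors' actual pipeline or statistics, so this is not their method: the function names, window size, and toy sequences are all hypothetical and chosen only for illustration.

```python
from __future__ import annotations


def at_fraction(seq: str) -> float:
    """Return the fraction of A or T bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq) / len(seq)


def count_motif(seq: str, motif: str = "TAATTA") -> int:
    """Count occurrences of motif in seq, including overlapping ones."""
    seq, motif = seq.upper(), motif.upper()
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        # Advance by one base so overlapping matches are also found.
        start = seq.find(motif, start + 1)
    return count


def sliding_at(seq: str, window: int = 10, step: int = 1) -> list[float]:
    """A + T fraction in each full window of the given size along seq."""
    return [
        at_fraction(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Toy sequence invented for this example; not real UCE data.
    toy = "GCGCGGCCTAATTAATTAGGCCGCGC"
    print(count_motif(toy))
    print(sliding_at(toy, window=6, step=4))
```

Overlapping matches are counted because TAATTA can overlap itself, as in TAATTAATTA. A real enrichment test would also need a matched background set of sequences and a significance test, which this sketch omits.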