Institute for Human Genetics, University of California, San Francisco, San Francisco, California 94143, USA.
Genome Res. 2010 Mar;20(3):311-9. doi: 10.1101/gr.094151.109. Epub 2009 Dec 23.
Noncoding DNA, particularly intronic DNA, harbors important functional elements that affect gene expression and RNA splicing. Yet it is unclear which specific noncoding sites are essential for gene function and regulation. To identify functional elements in noncoding DNA, we characterized genetic variation within introns using ethnically diverse human polymorphism data from three public databases: PMT, NIEHS, and SeattleSNPs. We demonstrate that positions within introns corresponding to known functional elements involved in pre-mRNA splicing, including the branch site, splice sites, and polypyrimidine tract, show reduced levels of genetic variation. Additionally, we observed regions of reduced genetic variation that are candidates for distance-dependent localization sites of functional elements, possibly intronic splicing enhancers (ISEs). Using several bioinformatics approaches, we provide additional evidence supporting our hypothesis that these regions correspond to ISEs. We conclude that studies of genetic variation can successfully discriminate and identify functional elements in noncoding regions. As more noncoding sequence data become available, the methods employed here can be used to identify additional functional elements in the human genome and provide possible explanations for phenotypic associations.
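The idea of profiling genetic variation by position within introns can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `variation_by_distance`, the input layout (an intron length plus 0-based variant offsets counted from the 5' end of the intron), and the `window` parameter are all assumptions made for illustration. For each distance upstream of the 3' splice site, the sketch reports the fraction of introns long enough to cover that distance that carry a variant there. Positions with a consistently low fraction would be candidates for constrained elements.

```python
from collections import Counter


def variation_by_distance(introns, window=50):
    """Per-position variant frequency relative to the 3' splice site.

    introns: iterable of (intron_length, variant_offsets), where each
             offset is 0-based from the 5' end of the intron
             (hypothetical input layout, assumed for this sketch).
    window:  number of intronic positions upstream of the 3' splice
             site to profile; distance 0 is the last intronic base.

    Returns a list of length `window`; entry d is the fraction of
    introns covering distance d that have a variant at that distance.
    """
    variant_counts = Counter()
    coverage = Counter()
    for length, offsets in introns:
        reach = min(length, window)
        # Every intron contributes coverage for the distances it spans.
        for d in range(reach):
            coverage[d] += 1
        # Count each variant position once per intron.
        for offset in set(offsets):
            d = length - 1 - offset
            if 0 <= d < reach:
                variant_counts[d] += 1
    return [
        variant_counts[d] / coverage[d] if coverage[d] else 0.0
        for d in range(window)
    ]


if __name__ == "__main__":
    # Toy data, invented purely to exercise the function.
    toy_introns = [(10, [9, 0]), (5, [4]), (8, [2])]
    profile = variation_by_distance(toy_introns, window=6)
    for distance, fraction in enumerate(profile):
        print(distance, round(fraction, 3))
```

A real analysis would also need to account for sequencing coverage, allele frequencies, and local mutation-rate differences, none of which this sketch models.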