Burn Research, Shriners Hospitals for Children Northern California and Department of Surgery, University of California-Davis, 2425 Stockton Blvd., Sacramento, CA 95817, USA.
Exp Mol Pathol. 2011 Jun;90(3):300-11. doi: 10.1016/j.yexmp.2011.02.007. Epub 2011 Mar 1.
Genes are reported to occupy approximately 2% of the human genome. Various forms of repetitive elements (REs), both characterized and uncharacterized, are presumed to make up the vast majority of the remainder of the genomes of humans and other species. Alongside a comprehensive annotation of genes, human genome databases provide information on other components of genome biology, such as gene polymorphisms, non-coding RNAs, and certain REs. However, the genome-wide profile of unique RE arrangements formed by different groups of REs has not yet been fully characterized. In this study, the entire human genome was subjected to an unbiased RE survey to establish a whole-genome profile of REs and their arrangements. Because of the query-size limit of the bl2seq alignment program (National Center for Biotechnology Information [NCBI]) used for the RE survey, the entire NCBI reference human genome was fragmented into 6206 units of 0.5 million nucleotides. RE arrangements of varying complexity and pattern were identified throughout the genome. Each chromosome had a unique profile of RE arrangements and density, and high RE density was measured near the centromere regions. Subsequently, 175 complex RE arrangements, selected from across the genome, were compared among five different human genome sequences. Interestingly, three of the five human genome databases shared exactly the same arrangement patterns and sequences for all 175 RE arrangement regions (a total of 12,765,625 nucleotides). The findings from this study demonstrate that a substantial fraction of REs in the human genome are clustered into various forms of ordered structures. Further investigations are needed to examine whether some of these ordered RE arrangements contribute to human pathobiology as functional genome units.
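The fragmentation step described in the abstract (splitting the reference genome into units of 0.5 million nucleotides so that each unit fits within the bl2seq query-size limit) can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes consecutive, non-overlapping windows and a shorter final unit at the end of a sequence, neither of which the abstract specifies, and the names `fragment_sequence` and `count_units` are hypothetical.

```python
from typing import Iterator, Tuple

# 0.5 million nucleotides per unit, the size stated in the abstract.
UNIT_SIZE = 500_000


def fragment_sequence(
    seq: str, unit_size: int = UNIT_SIZE
) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, fragment) for consecutive non-overlapping units.

    Assumption: windows do not overlap and the last unit may be shorter
    than unit_size. The abstract does not say how sequence ends were handled.
    """
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    for start in range(0, len(seq), unit_size):
        end = min(start + unit_size, len(seq))
        yield start, end, seq[start:end]


def count_units(length: int, unit_size: int = UNIT_SIZE) -> int:
    """Number of units needed to cover a sequence of the given length."""
    if unit_size <= 0:
        raise ValueError("unit_size must be positive")
    return -(-length // unit_size)  # ceiling division without floats


if __name__ == "__main__":
    # Toy example with a tiny unit size so the fragments are easy to read.
    for start, end, fragment in fragment_sequence("ACGTACGTAC", unit_size=4):
        print(start, end, fragment)
```

Each fragment keeps its start and end coordinates so that any RE arrangement found within a unit can be mapped back to its position on the original sequence.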