Department of Life Science (BK21 Program), Chung-Ang University, Seoul, Korea.
Bioinformatics. 2011 Jan 1;27(1):14-21. doi: 10.1093/bioinformatics/btq612. Epub 2010 Oct 29.
Many genes in the human genome produce a wide variety of transcript variants through alternative exon splicing, differential promoter usage, or altered polyadenylation site utilization, and these variants may function differently in human cells. Here, we present a bioinformatics method for the systematic identification of novel human-specific transcript variants that might have arisen after the human-chimpanzee divergence.
The procedure involved collecting genomic insertions that are unique to the human genome when compared with orthologous chimpanzee and rhesus macaque genomic regions, and that are expressed in the transcriptome as exons, as evidenced by mRNAs and/or expressed sequence tags (ESTs). Using this procedure, we identified 112 transcript variants that are specific to humans; 74 were associated with known genes, and the remaining 38 were located in unannotated genomic loci. The inserts originated mostly from transposable elements, including L1, Alu, SVA, and human endogenous retroviruses (HERVs). Interestingly, some non-repetitive genomic segments were also involved in the generation of novel transcript variants. Insert contributions to the transcripts included promoters, terminal exons, insertions within exons, splice donors and acceptors, and complete exon cassettes. Comparison of personal genomes revealed that at least seven loci were polymorphic in humans. The exaptation of human-specific genomic inserts as novel transcript variants may have increased human gene versatility or affected gene regulation.
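The core filtering step of the procedure, keeping only human-specific inserts that overlap an expressed exon, can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Interval` type, the function names, and all coordinates below are hypothetical, and the sketch assumes that the human-specific inserts and the mRNA/EST-supported exons have already been obtained as 0-based, half-open intervals.

```python
from typing import List, NamedTuple, Tuple


class Interval(NamedTuple):
    """A genomic interval (hypothetical type, 0-based half-open coordinates)."""
    chrom: str
    start: int  # inclusive
    end: int    # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two intervals share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def exonized_inserts(inserts: List[Interval],
                     exons: List[Interval]) -> List[Tuple[str, str]]:
    """Return (insert name, exon name) pairs where an insert overlaps an exon.

    A simple quadratic scan; a real analysis would likely use an interval
    index, but this keeps the illustration short.
    """
    hits = []
    for ins in inserts:
        for exon in exons:
            if overlaps(ins, exon):
                hits.append((ins.name, exon.name))
    return hits


# Toy data with made-up coordinates, for illustration only.
inserts = [
    Interval("chr1", 100, 400, "AluY_insert"),
    Interval("chr2", 50, 80, "L1_insert"),
]
exons = [
    Interval("chr1", 350, 500, "mRNA_A_exon2"),
    Interval("chr2", 200, 300, "EST_B_exon1"),
]

hits = exonized_inserts(inserts, exons)
print(hits)
```

In this toy input only the chr1 insert overlaps an exon, so it is the only candidate that would be carried forward for classification (promoter, terminal exon, splice site, or exon cassette).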