Brent Michael R
Center for Genome Sciences, Campus Box 8510, Washington University, 4444 Forest Park Blvd, Saint Louis, Missouri 63108, USA.
Nat Rev Genet. 2008 Jan;9(1):62-73. doi: 10.1038/nrg2220.
The sequencing of large, complex genomes has become routine, but understanding how sequences relate to biological function is less straightforward. Although much attention is focused on how to annotate genomic features such as developmental enhancers and non-coding RNAs, there is still no higher eukaryote for which we know the correct exon-intron structure of at least one ORF for each gene. Despite this uncomfortable truth, genome annotation has made remarkable progress since the first drafts of the human genome were analysed. By combining several computational and experimental methods, we are now closer to producing complete and accurate gene catalogues than ever before.