Program in Computational Biology and Bioinformatics, Yale University, New Haven, Connecticut 06520, USA.
Nat Rev Genet. 2010 Aug;11(8):559-71. doi: 10.1038/nrg2814. Epub 2010 Jul 13.
Most of the human genome consists of non-protein-coding DNA. Recently, progress has been made in annotating these non-coding regions through the interpretation of functional genomics experiments and comparative sequence analysis. One can conceptualize functional genomics analysis as involving a sequence of steps: turning the output of an experiment into a 'signal' at each base pair of the genome; smoothing this signal and segmenting it into small blocks of initial annotation; and then clustering these small blocks into larger derived annotations and networks. Finally, one can relate functional genomics annotations to conserved units and measures of conservation derived from comparative sequence analysis.
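The sequence of steps described in the abstract (per-base signal, smoothing, segmentation into small blocks, grouping into larger annotations) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the moving-average smoother, the fixed threshold, and the gap-based merging, which is only a simple stand-in for the clustering step. The toy signal is invented.

```python
from typing import List, Tuple

# Half-open interval [start, end) in base-pair coordinates (hypothetical type).
Block = Tuple[int, int]


def smooth(signal: List[float], window: int = 3) -> List[float]:
    """Centered moving average; the window is truncated at both ends."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo, hi = max(0, i - half), min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def segment(signal: List[float], threshold: float) -> List[Block]:
    """Return maximal runs of positions whose value is >= threshold."""
    blocks: List[Block] = []
    start = None
    for i, value in enumerate(signal):
        if value >= threshold and start is None:
            start = i                      # a new block opens here
        elif value < threshold and start is not None:
            blocks.append((start, i))      # the open block closes here
            start = None
    if start is not None:                  # block runs to the end of the signal
        blocks.append((start, len(signal)))
    return blocks


def merge_nearby(blocks: List[Block], max_gap: int) -> List[Block]:
    """Group sorted blocks separated by at most max_gap positions.

    Assumed stand-in for clustering small blocks into larger annotations.
    """
    merged: List[Block] = []
    for start, end in blocks:
        if merged and start - merged[-1][1] <= max_gap:
            merged[-1] = (merged[-1][0], end)
        else:
            merged.append((start, end))
    return merged


# Invented per-base signal, e.g. read depth at 20 consecutive positions.
raw = [0, 0, 5, 6, 5, 0, 0, 4, 5, 4, 0, 0, 0, 0, 0, 0, 6, 6, 6, 0]
small_blocks = segment(smooth(raw, window=3), threshold=2.5)
regions = merge_nearby(small_blocks, max_gap=2)
print(small_blocks)
print(regions)
```

The threshold and gap values are arbitrary choices for the toy data. In practice each step would be replaced by a method suited to the experiment type, and the final step of relating annotations to conservation measures is not covered by this sketch.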