Jiang Liang, Liu Qiong
College of Life Sciences and Oceanography, Shenzhen University, Nanhai Avenue 3688, Shenzhen, 518060, China.
Methods Mol Biol. 2018;1661:29-39. doi: 10.1007/978-1-4939-7258-6_3.
Computational methods for identifying selenoproteins have developed rapidly in recent years. However, it is still difficult to identify the open reading frame (ORF) of a eukaryotic selenoprotein gene, because the TGA codon that encodes the selenocysteine (Sec) residue in the active center of a selenoprotein normally serves as a termination signal for protein translation. This chapter presents SelGenAmic, a gene assembly algorithm constructed for identifying selenoprotein genes in eukaryotic genomes. A method based on this algorithm builds an optimal TGA-containing ORF for each TGA in a genome, then applies protein similarity analysis through conserved sequence alignments to screen out selenoprotein genes from these ORFs. Because every TGA in the genome is examined for its potential to be decoded as a Sec residue, the method improves the sensitivity of selenoprotein detection. The method based on the SelGenAmic algorithm is capable of identifying eukaryotic selenoprotein genes from their genomes.
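The idea of building a TGA-containing ORF for every TGA can be illustrated with a minimal sketch. This is not the SelGenAmic algorithm itself: it assumes a single strand, ignores exon/intron structure and any scoring of "optimal" ORFs, and skips the later similarity screening. The function name `tga_containing_orfs` and the tuple layout it returns are hypothetical, chosen only for this illustration.

```python
# Simplified illustration only: single strand, no splicing, no scoring.
STOPS = {"TAA", "TAG", "TGA"}

def tga_containing_orfs(seq):
    """For each TGA in seq, try to build an ORF that reads through it.

    Returns a list of (tga_pos, orf_start, orf_end, orf_seq) tuples,
    where the candidate TGA is treated as a Sec codon rather than a stop.
    """
    seq = seq.upper()
    results = []
    pos = seq.find("TGA")
    while pos != -1:
        # Walk upstream in the frame of this TGA until another stop codon
        # (including other in-frame TGAs); keep the most upstream ATG.
        start = None
        i = pos - 3
        while i >= 0:
            codon = seq[i:i + 3]
            if codon in STOPS:
                break
            if codon == "ATG":
                start = i
            i -= 3
        # Walk downstream in frame to the first stop codon after the TGA.
        end = None
        j = pos + 3
        while j + 3 <= len(seq):
            if seq[j:j + 3] in STOPS:
                end = j + 3
                break
            j += 3
        # Keep only candidates with both a start codon and a real stop.
        if start is not None and end is not None:
            results.append((pos, start, end, seq[start:end]))
        pos = seq.find("TGA", pos + 1)
    return results

if __name__ == "__main__":
    for hit in tga_containing_orfs("ATGGCCTGAGGCTAA"):
        print(hit)
```

In a full pipeline, each candidate ORF would then be translated with the in-frame TGA read as Sec and compared against known proteins, which is the similarity screening step the abstract describes.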