Beijing Institute of Genomics, Chinese Academy of Sciences (CAS), Beijing, 100029, China.
College of Life Science, Graduate University of Chinese Academy of Sciences, Beijing, 100049, China.
Sci China Life Sci. 2010 Jan;53(1):107-111. doi: 10.1007/s11427-010-0001-z. Epub 2010 Feb 12.
A 10-fold coverage BAC library for the giant panda was constructed, and nine BACs were selected to generate finished sequences. These BACs can serve as a validation resource for assessing the accuracy of the de novo assembly of giant panda whole-genome shotgun reads newly generated with Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a combined length of 878 kb. Homology search and de novo prediction methods were used to annotate genes and repeats. Twelve protein-coding genes were predicted, seven of which could be functionally annotated. These seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average of 6 exons per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed with the neighbor-joining algorithm across five species (giant panda, human, dog, cat and mouse), which reconfirms that, among these species, the dog is the most closely related to the giant panda. Our results provide detailed sequence and structure information for new genes and repeats of the giant panda, which will be helpful for further studies on this species.
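For readers unfamiliar with the neighbor-joining step mentioned in the abstract, the following is a minimal sketch of the algorithm in plain Python. It is not the authors' pipeline: the abstract does not state the software, the alignment, or the distance model used, and the pairwise distances below are made-up placeholder values chosen only so that the example runs. The function name `neighbor_joining` and the variable `toy_distances` are likewise hypothetical.

```python
from itertools import combinations


def neighbor_joining(names, dist):
    """Build an unrooted tree by neighbor joining; return a Newick string.

    names: list of at least three taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    """
    if len(names) < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    nodes = list(names)
    d = dict(dist)

    def get(a, b):
        return d[frozenset((a, b))]

    while len(nodes) > 3:
        n = len(nodes)
        # Sum of distances from each node to every other node.
        total = {a: sum(get(a, b) for b in nodes if b != a) for a in nodes}
        # Join the pair that minimizes the Q criterion
        # Q(a, b) = (n - 2) * d(a, b) - total(a) - total(b).
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * get(*p) - total[p[0]] - total[p[1]],
        )
        dab = get(a, b)
        # Branch lengths from the joined pair to their new parent node.
        len_a = dab / 2 + (total[a] - total[b]) / (2 * (n - 2))
        len_b = dab - len_a
        # The Newick text of the new clade doubles as its node label.
        new = f"({a}:{len_a:g},{b}:{len_b:g})"
        # Distance from the new node to each remaining node.
        for c in nodes:
            if c not in (a, b):
                d[frozenset((new, c))] = (get(a, c) + get(b, c) - dab) / 2
        nodes = [c for c in nodes if c not in (a, b)] + [new]

    # Three nodes remain: connect them at a single central node.
    x, y, z = nodes
    len_x = (get(x, y) + get(x, z) - get(y, z)) / 2
    len_y = (get(x, y) + get(y, z) - get(x, z)) / 2
    len_z = (get(x, z) + get(y, z) - get(x, y)) / 2
    return f"({x}:{len_x:g},{y}:{len_y:g},{z}:{len_z:g});"


# ASSUMPTION: placeholder distances for illustration only, not data from the paper.
toy_distances = {
    ("panda", "dog"): 5, ("panda", "cat"): 7, ("panda", "human"): 10,
    ("panda", "mouse"): 12, ("dog", "cat"): 8, ("dog", "human"): 11,
    ("dog", "mouse"): 13, ("cat", "human"): 11, ("cat", "mouse"): 13,
    ("human", "mouse"): 12,
}
dist = {frozenset(pair): value for pair, value in toy_distances.items()}
tree = neighbor_joining(["panda", "dog", "cat", "human", "mouse"], dist)
print(tree)
```

With these placeholder values the panda and dog end up as sister taxa in the resulting Newick string, but that outcome is built into the toy matrix and says nothing about the real data. In practice the distances would be estimated from aligned orthologous sequences, and an established phylogenetics package would normally be used instead of a hand-written routine.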