Bioinformatic analysis of ESTs collected by Sanger and pyrosequencing methods for a keystone forest tree species: oak.

Author information

INRA, UMR 1202 BIOGECO, 69 route d'Arcachon, F-33612 Cestas, France.

Publication information

BMC Genomics. 2010 Nov 23;11:650. doi: 10.1186/1471-2164-11-650.

Abstract

BACKGROUND

The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the genus Quercus. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economic importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, in which geneticists, ecophysiologists, molecular biologists and ecologists join efforts to understand, monitor and predict functional genetic diversity.

RESULTS

We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. In total, 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using the TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene Ontology annotation was then assigned to 29,303 unigene elements. A BLAST search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but a more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoid biosynthesis and cell wall formation. Our results suggest good coverage of genes involved in these traits. Comparative orthologous sequences (COS) with other plant gene models were identified, making it possible to unravel the oak paleo-history. A search for simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) identified 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser http://genotoul-contigbrowser.toulouse.inra.fr:9092/Quercus_robur/index.html.
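The assembly totals and percentages quoted above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the authors' pipeline: the variable names and the `percent` helper are illustrative assumptions, and only the counts themselves come from the abstract.

```python
# Counts copied from the RESULTS paragraph; all names are illustrative.
sanger_ests = 125_925          # Sanger ESTs retained after quality checks
pyro_reads_raw = 1_948_579     # raw pyrosequencing reads
pyro_reads_removed = 370_566   # pyrosequencing reads eliminated
pyro_reads_kept = 1_578_192    # pyrosequencing sequences retained
contigs = 69_154               # tentative contigs from TGICL
singletons = 153_517           # singletons from TGICL
swissprot_hits = 32_810        # unigene elements with SWISS-PROT homologs


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# ESTs entering the assembly: Sanger ESTs plus retained pyrosequencing reads.
total_ests = sanger_ests + pyro_reads_kept

# Non-redundant sequences: contigs plus singletons.
unigenes = contigs + singletons

# Share of pyrosequencing reads removed, and share of unigenes with a hit.
removed_pct = percent(pyro_reads_removed, pyro_reads_raw)
swissprot_pct = percent(swissprot_hits, unigenes)

print(total_ests, unigenes, removed_pct, swissprot_pct)
```

Under these assumptions the sums reproduce the 1,704,117 ESTs and 222,671 non-redundant sequences reported above, and the two ratios round to the stated 19.0% and 14.7%.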

CONCLUSIONS

This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations.
