The Babesia bovis gene and promoter model: an update from full-length EST analysis.

Authors

Yamagishi Junya, Wakaguri Hiroyuki, Yokoyama Naoaki, Yamashita Riu, Suzuki Yutaka, Xuan Xuenan, Igarashi Ikuo

Affiliation

National Research Center for Protozoan Diseases, Obihiro University of Agriculture and Veterinary Medicine, Inada-cho west 2-13, Obihiro, Hokkaido 080-8555, Japan.

Publication

BMC Genomics. 2014 Aug 13;15(1):678. doi: 10.1186/1471-2164-15-678.

Abstract

BACKGROUND

Babesia bovis is an apicomplexan parasite that causes babesiosis in infected cattle. Genomes of pathogens contain promising information that can facilitate the development of methods for controlling infections. Although the genome of B. bovis is publicly available, annotated gene models are not highly reliable prior to experimental validation. Therefore, we validated a previously proposed gene model of B. bovis and extended the associated annotations on the basis of experimentally obtained full-length expressed sequence tags (ESTs).

RESULTS

From in vitro cultured merozoites, 12,286 clones harboring full-length cDNAs were sequenced from both ends using the Sanger method, and 6,787 full-length cDNAs were assembled. These were then clustered, and a nonredundant reference data set of 2,115 full-length cDNA sequences was constructed. Comparison of the previously proposed gene model with our data set identified 310 identical genes, 342 almost identical genes, 1,054 genes with potential structural inconsistencies, and 409 novel genes. The median length of 5' untranslated regions (UTRs) was 152 nt. Subsequently, we identified 4,086 transcription start sites (TSSs) and 2,023 transcriptionally active regions (TARs) by examining 5' ESTs. We identified ATGGGG and CCCCAT sites as consensus motifs in TARs that were distributed around -50 bp from TSSs. In addition, we found ACACA, TGTGT, and TATAT sites, which were distributed periodically around TSSs in cycles of approximately 150 bp. Comparable periodic distributions were not observed in mammalian promoter regions.

CONCLUSIONS

The observations in this study indicate the utility of integrated bioinformatics and experimental data for improving genome annotations. In particular, full-length cDNAs with one-base resolution for TSSs enabled the identification of consensus motifs in promoter sequences and demonstrated clear distributions of identified motifs. These observations allowed the illustration of a model promoter composition, which supports the differences in transcriptional regulation frameworks between apicomplexan parasites and mammals.
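The motif-position analysis summarized in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the 300 bp window, the 10 bp bin size, and the toy sequence are all assumptions made for illustration, and only forward-strand TSSs are handled. The sketch records where a motif starts relative to each TSS and bins those offsets, which is the kind of tabulation that could reveal a peak near -50 bp or a periodic pattern.

```python
from collections import Counter

# Translation table for complementing DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def motif_offsets(genome, tss, motif, window=300):
    """Start positions of `motif` relative to a forward-strand TSS.

    `tss` is a 0-based index into `genome`; negative offsets are upstream.
    The window size is an illustrative assumption, not a value from the paper.
    """
    start = max(0, tss - window)
    end = min(len(genome), tss + window)
    offsets = []
    pos = genome.find(motif, start)
    while pos != -1 and pos + len(motif) <= end:
        offsets.append(pos - tss)
        pos = genome.find(motif, pos + 1)  # allow overlapping matches
    return offsets


def offset_histogram(genome, tss_list, motif, window=300, bin_size=10):
    """Pool motif offsets over many TSSs into bins of `bin_size` bp."""
    counts = Counter()
    for tss in tss_list:
        for off in motif_offsets(genome, tss, motif, window):
            counts[(off // bin_size) * bin_size] += 1
    return dict(sorted(counts.items()))


# Toy sequence (hypothetical): one ATGGGG placed 50 bp upstream of a TSS at index 150.
toy_genome = "T" * 100 + "ATGGGG" + "T" * 44 + "A" + "T" * 100
toy_hist = offset_histogram(toy_genome, [150], "ATGGGG")
```

Note that ATGGGG and CCCCAT are reverse complements of each other, as are ACACA and TGTGT, so `reverse_complement` lets one scan cover the same site on either strand. A real analysis would also need to flip coordinates for reverse-strand TSSs, which this sketch omits.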

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/bee7/4148916/160980434832/12864_2013_6376_Fig1_HTML.jpg
