Analysis of 14 BAC sequences from the Aedes aegypti genome: a benchmark for genome annotation and assembly.

Authors

Lobo Neil F, Campbell Kathy S, Thaner Daniel, Debruyn Becky, Koo Hean, Gelbart William M, Loftus Brendan J, Severson David W, Collins Frank H

Affiliation

Center for Global Health and Infectious Diseases, Department of Biological Sciences, University of Notre Dame, Notre Dame, IN 46556-0369, USA.

Publication

Genome Biol. 2007;8(5):R88. doi: 10.1186/gb-2007-8-5-r88.

Abstract

BACKGROUND

Aedes aegypti is the principal vector of yellow fever and dengue viruses throughout the tropical world. To provide a set of manually curated and annotated sequences from the Ae. aegypti genome, 14 mapped bacterial artificial chromosome (BAC) clones encompassing 1.57 Mb were sequenced, assembled and manually annotated using a combination of computational gene-finding, expressed sequence tag (EST) matches and comparative protein homology. PCR and sequencing were used to experimentally confirm expression and sequence of a subset of these transcripts.

RESULTS

Of the 51 manual annotations, 50 and 43 demonstrated a high level of similarity to Anopheles gambiae and Drosophila melanogaster genes, respectively. Ten of the 12 BAC sequences with more than one annotated gene exhibited synteny with the A. gambiae genome. Putative transcripts from eight BAC clones were found in multiple copies (two copies in most cases) in the Aedes genome assembly, which points to the probable presence of haplotype polymorphisms and/or misassemblies.

CONCLUSION

This study provides a benchmark set of manually annotated transcripts for this genome that can be used to assess the quality of the auto-annotation pipeline and the assembly. It also examines the effect of a high repeat content on the genome assembly and annotation pipeline.
