Heterochromatic sequences in a Drosophila whole-genome shotgun assembly.

Author information

Hoskins Roger A, Smith Christopher D, Carlson Joseph W, Carvalho A Bernardo, Halpern Aaron, Kaminker Joshua S, Kennedy Cameron, Mungall Chris J, Sullivan Beth A, Sutton Granger G, Yasuhara Jiro C, Wakimoto Barbara T, Myers Eugene W, Celniker Susan E, Rubin Gerald M, Karpen Gary H

Affiliations

Department of Genome Sciences, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA.

Publication information

Genome Biol. 2002;3(12):RESEARCH0085. doi: 10.1186/gb-2002-3-12-research0085. Epub 2002 Dec 31.

Abstract

BACKGROUND

Most eukaryotic genomes include a substantial repeat-rich fraction termed heterochromatin, which is concentrated in centric and telomeric regions. The repetitive nature of heterochromatic sequence makes it difficult to assemble and analyze. To better understand the heterochromatic component of the Drosophila melanogaster genome, we characterized and annotated portions of a whole-genome shotgun sequence assembly.

RESULTS

WGS3, an improved whole-genome shotgun assembly, includes 20.7 Mb of draft-quality sequence not represented in the Release 3 sequence spanning the euchromatin. We annotated this sequence using the methods employed in the re-annotation of the Release 3 euchromatic sequence. This analysis predicted 297 protein-coding genes and six non-protein-coding genes, including known heterochromatic genes, and regions of similarity to known transposable elements. Bacterial artificial chromosome (BAC)-based fluorescence in situ hybridization analysis was used to correlate the genomic sequence with the cytogenetic map in order to refine the genomic definition of the centric heterochromatin; on the basis of our cytological definition, the annotated Release 3 euchromatic sequence extends into the centric heterochromatin on each chromosome arm.

CONCLUSIONS

Whole-genome shotgun assembly produced a reliable draft-quality sequence of a significant part of the Drosophila heterochromatin. Annotation of this sequence defined the intron-exon structures of 30 known protein-coding genes and 267 protein-coding gene models. The cytogenetic mapping suggests that an additional 150 predicted genes are located in heterochromatin at the base of the Release 3 euchromatic sequence. Our analysis suggests strategies for improving the sequence and annotation of the heterochromatic portions of the Drosophila and other complex genomes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dc08/151187/dcbf96af5ca5/gb-2002-3-12-research0085-1.jpg
