Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey.

Authors

Swaminathan Kankshita, Varala Kranthi, Hudson Matthew E

Affiliation

Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA.

Publication

BMC Genomics. 2007 May 24;8:132. doi: 10.1186/1471-2164-8-132.

DOI: 10.1186/1471-2164-8-132

PMID: 17524145

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1894642/

Abstract

BACKGROUND

Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However, these methods are time-consuming and have potential drawbacks. High-throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA.

RESULTS

We sequenced 78 million base pairs of randomly sheared soybean DNA that passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The survey sequence was used to determine the copy number across regions of large genomic clones or contigs and to discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot analysis).
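The copy-number idea in the Results can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that survey reads have already been aligned to a clone by some external tool, and that the alignments are available as half-open, 0-based `(start, end)` intervals on the clone. The function names, the window size, and the genome size of roughly 1.1 Gbp used in the example are all assumptions introduced here for illustration; only the 78 million base pair survey size comes from the abstract. The estimate is the observed depth in each window divided by the depth expected for a sequence present once per genome (survey bases divided by genome size).

```python
from typing import List, Sequence, Tuple


def single_copy_depth(survey_bases: float, genome_size: float) -> float:
    """Average depth expected over a sequence present once per genome."""
    if survey_bases <= 0 or genome_size <= 0:
        raise ValueError("survey_bases and genome_size must be positive")
    return survey_bases / genome_size


def window_copy_numbers(
    hits: Sequence[Tuple[int, int]],
    region_length: int,
    window: int,
    survey_bases: float,
    genome_size: float,
) -> List[float]:
    """Estimate copy number in consecutive windows along a clone.

    hits: half-open, 0-based (start, end) intervals where survey reads
          align to the clone (hypothetical input format).
    Returns one estimate per window: observed depth / single-copy depth.
    """
    if region_length <= 0 or window <= 0:
        raise ValueError("region_length and window must be positive")
    expected = single_copy_depth(survey_bases, genome_size)

    n_windows = (region_length + window - 1) // window
    aligned = [0] * n_windows  # aligned bases falling in each window

    for start, end in hits:
        # Clip each hit to the clone, then split it across window boundaries.
        start = max(start, 0)
        end = min(end, region_length)
        pos = start
        while pos < end:
            w = pos // window
            stop = min((w + 1) * window, end)
            aligned[w] += stop - pos
            pos = stop

    estimates = []
    for w, bases in enumerate(aligned):
        # The last window may be shorter than the others.
        width = min(window, region_length - w * window)
        estimates.append(bases / width / expected)
    return estimates


if __name__ == "__main__":
    # 78 Mbp survey (from the abstract); ~1.1 Gbp genome size is an assumption.
    example_hits = [(0, 100), (50, 150), (1200, 1300)]  # made-up intervals
    print(window_copy_numbers(example_hits, 2000, 1000, 78e6, 1.1e9))
```

With a survey this shallow, a single-copy region is expected to collect very few aligned bases, so estimates of this kind are informative mainly for sequences present in many copies; short windows will give noisy values.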

CONCLUSION

This approach provides a potential aid to conventional or shotgun genome assembly by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64eb/1894642/0cd61f3bd977/1471-2164-8-132-1.jpg
