Distribution of short paired duplications in mammalian genomes.

Author information

Thomas Elizabeth E, Srebro Nathan, Sebat Jonathan, Navin Nicholas, Healy John, Mishra Bud, Wigler Michael

Affiliation

Cold Spring Harbor Laboratory, 1 Bungtown Road, Cold Spring Harbor, NY 11724, USA.

Publication information

Proc Natl Acad Sci U S A. 2004 Jul 13;101(28):10349-54. doi: 10.1073/pnas.0403727101. Epub 2004 Jul 6.

DOI:10.1073/pnas.0403727101

PMID:15240876

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC478600/

Abstract

Mammalian genomes are densely populated with long duplicated sequences. In this paper, we demonstrate the existence of doublets, short duplications between 25 and 100 bp, distinct from previously described repeats. Each doublet is a pair of exact matches, separated by some distance. The distribution of these intermatch distances is strikingly nonrandom. An unexpectedly high number of doublets have matches either within 100 bp (adjacent) or at distances tightly concentrated approximately 1,000 bp apart (nearby). We focus our study on these proximate doublets. First, they tend to have both matches on the same strand. By comparing nearby doublets shared in human and chimpanzee, we can also see that these doublets seem to arise by an insertion event that produces a copy without markedly affecting the surrounding sequence. Most doublets in humans are shared with chimpanzee, but many new pairs arose after the divergence of the species. Doublets found in human but not chimpanzee are most often composed of almost tandem matches, whereas older doublets (found in both species) are more likely to have matches spaced by approximately 1 kb, indicating that the nearly tandem doublets may be more dynamic. The spacing of doublets is highly conserved. So far, we have found clearly recognizable doublets in the following genomes: Homo sapiens, Mus musculus, Arabidopsis thaliana, and Caenorhabditis elegans, indicating that the mechanism generating these doublets is widespread. A mechanism that generates short local duplications while conserving polarity could have a profound impact on the evolution of regulatory and protein-coding sequences.
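
The doublet definition in the abstract (a pair of exact matches separated by some distance) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names `find_doublets` and `classify`, the fixed word length `k`, the use of start-to-start offsets as the intermatch distance, and the `nearby_tol` window around 1,000 bp are all assumptions made for illustration. Only the 100 bp "adjacent" cutoff and the roughly 1,000 bp "nearby" spacing come from the abstract.

```python
from collections import defaultdict


def find_doublets(seq, k=25):
    """Return sorted (first_pos, second_pos, distance) tuples for every
    length-k word that occurs exactly twice on the given strand.

    Assumption: distance is the offset between the two start positions.
    """
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    doublets = []
    for starts in positions.values():
        if len(starts) == 2:  # exactly two exact matches
            first, second = starts
            doublets.append((first, second, second - first))
    return sorted(doublets)


def classify(distance, adjacent_max=100, nearby_center=1000, nearby_tol=200):
    """Label an intermatch distance.

    adjacent_max follows the abstract's "within 100 bp"; nearby_tol is a
    hypothetical tolerance around the ~1,000 bp peak, chosen for illustration.
    """
    if distance <= adjacent_max:
        return "adjacent"
    if abs(distance - nearby_center) <= nearby_tol:
        return "nearby"
    return "other"


if __name__ == "__main__":
    # Toy sequence: a 10 bp unit, a 10 bp spacer, then the same unit again.
    toy = "ACGTTGCAAT" + "CCCCCGGGGG" + "ACGTTGCAAT"
    for first, second, distance in find_doublets(toy, k=8):
        print(first, second, distance, classify(distance))
```

A tally of the labels returned by `classify` over a whole chromosome would give a rough histogram of intermatch distances. A realistic analysis would also need to handle the reverse strand and mask previously described repeats, which this sketch does not attempt.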
