Locating regions of differential variability in DNA and protein sequences.

Author information

Tang H, Lewontin R C

Affiliation

Department of Statistics, Stanford University, Stanford, California 94305, USA.

Publication information

Genetics. 1999 Sep;153(1):485-95. doi: 10.1093/genetics/153.1.485.

DOI:10.1093/genetics/153.1.485

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC1460758/

Abstract

In the comparison of DNA and protein sequences between species or between paralogues or among individuals within a species or population, there is often some indication that different regions of the sequence are divergent or polymorphic to different degrees, indicating differential constraint or diversifying selection operating in different regions of the sequence. The problem is to test statistically whether the observed regional differences in the density of variant sites represent real differences and then to estimate as accurately as possible the location of the differential regions. A method is given for testing and locating regions of differential variation. The method consists of calculating G(x(k)) = k/n - x(k)/N, where x(k) is the position of the kth variant site along the sequence, n is the total number of variant sites, and N is the total sequence length. The estimated region is the longest stretch of adjacent sequence for which G(x(k)) is monotonically increasing (a hot spot) or decreasing (a cold spot). Critical values of this length for tests of significance are given, a sequential method is developed for locating multiple differential regions, and the power of the method against various alternatives is explored. The method locates the endpoints of hot spots and cold spots of variation with high accuracy.
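The statistic in the abstract can be illustrated with a minimal Python sketch. It computes G(x(k)) = k/n - x(k)/N for a list of variant positions and then finds the longest run of consecutive variant sites over which G strictly increases (a candidate hot spot) or decreases (a candidate cold spot). The function names, the toy positions, and the choice to measure run length by the number of consecutive steps are assumptions made for illustration; the significance test against critical values and the sequential search for multiple regions described in the paper are not reproduced here.

```python
from typing import List, Tuple


def g_statistic(positions: List[int], seq_len: int) -> List[float]:
    """G(x(k)) = k/n - x(k)/N for the sorted variant positions, k = 1..n."""
    xs = sorted(positions)
    n = len(xs)
    return [k / n - x / seq_len for k, x in enumerate(xs, start=1)]


def longest_monotone_run(g: List[float], increasing: bool = True) -> Tuple[int, int]:
    """Return 0-based (start, end) indices of the longest strictly monotone run in g.

    Run length is counted in steps between consecutive variant sites
    (an assumption of this sketch). Ties go to the earliest run.
    """
    best = (0, 0)
    start = 0
    for i in range(1, len(g)):
        step_ok = g[i] > g[i - 1] if increasing else g[i] < g[i - 1]
        if not step_ok:
            start = i  # the run is broken; restart at the current site
        if i - start > best[1] - best[0]:
            best = (start, i)
    return best


# Hypothetical data: 9 variant sites on a sequence of length N = 100,
# with a dense cluster between positions 30 and 48.
positions = [5, 30, 40, 42, 44, 46, 48, 75, 95]
g = g_statistic(positions, seq_len=100)

hot = longest_monotone_run(g, increasing=True)
cold = longest_monotone_run(g, increasing=False)
print("candidate hot spot:", positions[hot[0]], "to", positions[hot[1]])
print("candidate cold spot:", positions[cold[0]], "to", positions[cold[1]])
```

Because G(x(k)) - G(x(k-1)) = 1/n - (x(k) - x(k-1))/N, G rises exactly when the gap between adjacent variant sites is smaller than the average spacing N/n, and falls when the gap is larger. In the toy data the average spacing is about 11.1, so the cluster from position 30 to 48 is reported as the candidate hot spot and the sparse stretch from 48 to 95 as the candidate cold spot.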
