Karlin S, Blaisdell B E, Sapolsky R J, Cardon L, Burge C
Department of Mathematics, Stanford University, CA 94305-2125.
Nucleic Acids Res. 1993 Feb 11;21(3):703-11. doi: 10.1093/nar/21.3.703.
With the sequencing of the first complete eukaryotic chromosome, III of yeast (YCIII) of length 315 kb, several types of questions concerning chromosomal organization and the heterogeneity of eukaryotic DNA sequences can be approached. We have undertaken extensive analysis of YCIII with the goals of: (1) discerning patterns and anomalies in the occurrences of short oligonucleotides; (2) characterizing the nature and locations of significant direct and inverted repeats; (3) delimiting regions unusually rich in particular base types (e.g., G+C, purines); and (4) analyzing the distributions of markers of interest, e.g., δ (delta) elements, ARS (autonomous replicating sequences), special oligonucleotides, close repeats and close dyad pairings, and gene sequences. YCIII reveals several distinctive sequence features, including: (i) a relative abundance of significant local and global repeats highlighting five genes containing substantial close or tandem DNA repeats; (ii) an anomalous distribution of δ elements involving two clusters and a long gap; (iii) a significantly even distribution of ARS; (iv) a relative increase in the frequency of T runs and AT iterations downstream of genes and A runs upstream of genes; and (v) two regions of complex repetitive sequences and anomalous DNA composition, 29000-31000 and 291000-295000, the latter centered at the HMRa locus. Interpretations of these findings for chromosomal organization and implications for regulation of gene expression are discussed.
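Two of the analyses named in the abstract, locating runs of a single base and delimiting G+C-rich regions, can be illustrated with a minimal Python sketch. This is not the authors' statistical procedure: the function names `base_runs` and `gc_windows`, the minimum run length, and the window and step sizes are all hypothetical choices made for illustration, and no significance testing is attempted.

```python
import re


def base_runs(seq, base, min_len=5):
    """Return (start, length) for each maximal run of `base` that is
    at least `min_len` long. `min_len` is an illustrative threshold."""
    # Builds a pattern such as "T{5,}"; greedy matching yields maximal runs.
    pattern = re.compile(f"{re.escape(base)}{{{min_len},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(seq)]


def gc_windows(seq, window=1000, step=500):
    """Return (start, G+C fraction) for each full sliding window.
    Window and step sizes are assumptions, not values from the paper."""
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc_count = chunk.count("G") + chunk.count("C")
        fractions.append((start, gc_count / window))
    return fractions


if __name__ == "__main__":
    toy = "ACGTTTTTTACTTTTTG"
    print(base_runs(toy, "T", min_len=5))
    print(gc_windows("GGCCAATT", window=4, step=2))
```

Positions here are 0-based offsets into an uppercase sequence string. Applied to a real chromosome sequence, the run positions could then be compared with gene coordinates to examine upstream and downstream enrichment of the kind reported in item (iv).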