Harrison Paul M, Gerstein Mark
Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8114, USA.
Genome Biol. 2003;4(6):R40. doi: 10.1186/gb-2003-4-6-r40. Epub 2003 May 30.
We have derived a novel method to assess compositional biases in biological sequences, which is based on finding the lowest-probability subsequences for a given residue-type set. As a case study, the distribution of prion-like glutamine/asparagine-rich ((Q+N)-rich) domains (which are linked to amyloidogenesis) was assessed for budding and fission yeasts and four other eukaryotes. We find more than 170 prion-like (Q+N)-rich regions in budding yeast, and, strikingly, many fewer in fission yeast. Also, some residues, such as tryptophan or isoleucine, are unlikely to form biased regions in any eukaryotic proteome.
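The idea of searching for the lowest-probability subsequence for a given residue-type set can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simple binomial model in which each position belongs to the residue set independently with background frequency `p`, and it scores every window by the upper-tail probability of seeing at least the observed count. The function names, the `min_len` parameter, the toy sequence, and the background value of 0.1 are all hypothetical choices made for illustration.

```python
from math import comb


def binomial_tail(n, k, p):
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def lowest_probability_subsequence(seq, residues, p, min_len=5):
    """Exhaustively find the window most enriched in `residues`.

    Returns (probability, start, end), with seq[start:end] being the window.
    Assumption: enrichment is scored by a binomial upper-tail probability
    with background frequency p; the published method may differ in detail.
    """
    residues = set(residues)
    # prefix[i] = number of residues of interest in seq[:i]
    prefix = [0]
    for ch in seq:
        prefix.append(prefix[-1] + (ch in residues))

    best = (1.0, 0, len(seq))
    for start in range(len(seq)):
        for end in range(start + min_len, len(seq) + 1):
            n = end - start
            k = prefix[end] - prefix[start]
            prob = binomial_tail(n, k, p)
            if prob < best[0]:
                best = (prob, start, end)
    return best


if __name__ == "__main__":
    # Toy sequence with an embedded (Q+N)-rich stretch (hypothetical data).
    toy = "MKLVAAGT" + "QNQQNNQQNQ" + "LKEVDSAG"
    prob, start, end = lowest_probability_subsequence(toy, "QN", p=0.1)
    print(toy[start:end], prob)
```

The exhaustive scan is cubic in sequence length, which is acceptable for a single protein but would need pruning or a fixed set of window lengths before being applied to a whole proteome.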