Lawrence C E, Reilly A A
J Theor Biol. 1985 Apr 7;113(3):425-39. doi: 10.1016/s0022-5193(85)80031-x.
A statistical method is presented for comparing protein sequences by partitioning the polymers and estimating each subsegment's degree of conservation. Conservation is measured as a function of the number of transitions occurring in the underlying time-homogeneous Markov process assumed to govern amino acid mutations. The Markovian assumption also permits estimation of the ancestral sequence. Partitioning and estimation are carried out by maximum likelihood. The method is contrasted with the commonly used percent homology measure. A moving likelihood-ratio plot is suggested as an aid to identifying regions of high conservation, analogous to moving hydrophobicity plots. An application identifies highly conserved regions in thymidylate synthase from L. casei and E. coli.
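The moving likelihood-ratio idea can be illustrated with a minimal sketch. The abstract does not give the paper's transition model or partitioning algorithm, so everything below is an assumption made for illustration: an equal-rates (Jukes-Cantor-style) 20-state Markov model stands in for the authors' process, the two sequences are taken to be already aligned and of equal length, and the names `expected_transitions` and `moving_likelihood_ratio` are hypothetical. Under that simplified model the likelihood depends only on the mismatch count, so each window's own maximum likelihood estimate can be compared with the whole-sequence estimate.

```python
import math

N_STATES = 20                       # amino acid alphabet size
LIMIT = (N_STATES - 1) / N_STATES   # mismatch probability as divergence -> infinity


def count_mismatches(a, b):
    """Number of aligned positions at which the two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def expected_transitions(p):
    """Expected transitions per site implied by mismatch probability p.

    Assumes an equal-rates model (an illustrative stand-in, not the
    paper's model). Returns infinity once p reaches the saturation limit.
    """
    if p >= LIMIT:
        return math.inf
    return -LIMIT * math.log(1.0 - p / LIMIT)


def _loglik(k, n, p):
    """Log-likelihood of k mismatches in n sites, dropping terms constant in p."""
    ll = 0.0
    if k > 0:
        ll += k * math.log(p)
    if n - k > 0:
        ll += (n - k) * math.log(1.0 - p)
    return ll


def moving_likelihood_ratio(a, b, window):
    """Slide a window along two aligned sequences.

    Returns (start, expected transitions per site in the window,
    likelihood-ratio statistic) for each window position. The statistic
    compares the window's own ML estimate with the whole-sequence estimate.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    if window < 1 or window > len(a):
        raise ValueError("window must be between 1 and the sequence length")

    # Whole-sequence ML mismatch probability, capped at the model's limit.
    p_global = min(count_mismatches(a, b) / len(a), LIMIT)

    results = []
    for start in range(len(a) - window + 1):
        k = count_mismatches(a[start:start + window], b[start:start + window])
        p_window = min(k / window, LIMIT)
        stat = 2.0 * (_loglik(k, window, p_window) - _loglik(k, window, p_global))
        results.append((start, expected_transitions(p_window), stat))
    return results
```

Plotting the statistic against window start, and marking windows whose estimated number of transitions falls below the whole-sequence value, gives a rough analogue of the plot described in the abstract. A faithful implementation would need the transition model and partitioning procedure from the paper itself.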