York Thomas L, Durrett Rick, Nielsen Rasmus
Department of Biological Statistics and Computational Biology, Cornell University, Ithaca, USA.
BMC Bioinformatics. 2007 Apr 3;8:115. doi: 10.1186/1471-2105-8-115.
We develop a Bayesian method based on MCMC for estimating the relative rates of pericentric and paracentric inversions from marker data from two species. The method also allows estimation of the distribution of inversion tract lengths.
We apply the method to data from Drosophila melanogaster and D. yakuba. We find that pericentric inversions occur at a much lower rate than paracentric inversions. The average paracentric inversion tract length is approximately 4.8 Mb, with small inversions being more frequent than large inversions. If the two breakpoints defining a paracentric inversion tract were uniformly and independently distributed over chromosome arms, there would already be more short-tract inversions than long ones; we find an even greater preponderance of short tract lengths than this model predicts. Thus there appears to be a correlation between the positions of breakpoints that favors shorter tract lengths.
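The null model mentioned above (two breakpoints placed uniformly and independently on one arm) can be illustrated with a short simulation. This is only a sketch of that baseline, not the MCMC method of the paper: the arm length is normalized to 1 as an assumption, and the function names are hypothetical. Under this model the tract length has a triangular density 2(L - l)/L^2 on [0, L], so the expected tract length is L/3 and three quarters of tracts are shorter than half the arm.

```python
import random


def simulate_tract_lengths(arm_length, n_inversions, seed=0):
    """Draw tract lengths |x - y| for breakpoints x, y uniform on [0, arm_length].

    Hypothetical helper for illustration; it models only the uniform,
    independent-breakpoint baseline, not the estimator from the paper.
    """
    rng = random.Random(seed)
    return [
        abs(rng.uniform(0, arm_length) - rng.uniform(0, arm_length))
        for _ in range(n_inversions)
    ]


def fraction_shorter_than(lengths, cutoff):
    """Fraction of simulated tracts strictly shorter than cutoff."""
    return sum(1 for x in lengths if x < cutoff) / len(lengths)


ARM_LENGTH = 1.0  # assumption: arm length normalized to 1
lengths = simulate_tract_lengths(ARM_LENGTH, 100_000)

mean_length = sum(lengths) / len(lengths)
short_fraction = fraction_shorter_than(lengths, ARM_LENGTH / 2)

# Theory for the null model: mean = L/3, P(length < L/2) = 1 - (1/2)**2 = 0.75
print(f"mean tract length: {mean_length:.3f} (theory {ARM_LENGTH / 3:.3f})")
print(f"fraction shorter than half an arm: {short_fraction:.3f} (theory 0.750)")
```

An observed tract-length distribution with noticeably more than this share of short tracts would be consistent with the excess of short inversions reported here.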
The method developed in this paper provides the first statistical estimator of the distribution of inversion tract lengths from marker data. Applying this method to a number of data sets may help elucidate the relationship between the length of an inversion and the chance that it will be accepted.