Besnier Francois, Carlborg Orjan
Linnaeus Centre for Bioinformatics, Uppsala University, SE-75124 Uppsala, Sweden.
BMC Bioinformatics. 2007 Nov 13;8:440. doi: 10.1186/1471-2105-8-440.
Identity-by-descent (IBD) matrix estimation is a central component of mapping quantitative trait loci (QTL) with variance component models. Many algorithms have been developed to estimate IBD between individuals in a population at discrete locations in the genome. These estimates are used in genome scans to detect QTL affecting traits of interest in experimental animal, human and agricultural pedigrees. Here, we propose a new approach that estimates IBD as continuous functions rather than as discrete values.
Estimation of IBD functions improved computational efficiency and reduced memory usage in genome scans for QTL. We explored two approaches to obtain continuous marker-bracket IBD functions. By re-implementing an existing, fast deterministic IBD-estimation method, we show that this approach yields IBD functions that produce exactly the same IBD as the original algorithm, but with a greater than 2-fold improvement in computational efficiency and a considerably lower memory requirement for storing the resulting genome-wide IBD. By developing a general IBD function approximation algorithm, we show that it is possible to estimate marker-bracket IBD functions from IBD matrices estimated at marker locations by any existing IBD estimation algorithm. The general algorithm provides approximations that lead to QTL variance component estimates that, even in worst-case scenarios, are very similar to the true values. Storing IBD as polynomial IBD functions was also shown to reduce the amount of memory required in genome scans for QTL.
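The idea of replacing a stack of discrete IBD matrices with one polynomial per pair of individuals can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes IBD matrices are already available at a few positions within one marker bracket, uses an ordinary least-squares polynomial fit as a stand-in, and all function names and numeric values are hypothetical.

```python
def fit_polynomial(xs, ys, degree):
    """Least-squares polynomial fit; returns c with y ~ c[0] + c[1]*x + ..."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(x^(i+j)), b[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        rest = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - rest) / a[i][i]
    return coeffs


def fit_bracket_ibd(positions, matrices, degree=2):
    """Fit one polynomial per matrix element (i, j) across bracket positions."""
    size = len(matrices[0])
    return [
        [fit_polynomial(positions, [m[i][j] for m in matrices], degree)
         for j in range(size)]
        for i in range(size)
    ]


def evaluate_ibd(functions, x):
    """Evaluate the fitted IBD matrix at position x using Horner's rule."""
    def horner(coeffs):
        result = 0.0
        for c in reversed(coeffs):
            result = result * x + c
        return result
    return [[horner(c) for c in row] for row in functions]


# Illustrative (made-up) IBD matrices for two individuals at three positions,
# expressed as a fraction of the bracket length: left marker, midpoint, right marker.
positions = [0.0, 0.5, 1.0]
matrices = [
    [[1.0, 0.5], [0.5, 1.0]],
    [[1.0, 0.4], [0.4, 1.0]],
    [[1.0, 0.5], [0.5, 1.0]],
]
ibd_functions = fit_bracket_ibd(positions, matrices, degree=2)
ibd_quarter = evaluate_ibd(ibd_functions, 0.25)
```

With this layout a genome scan would keep only `degree + 1` coefficients per pair and bracket, and evaluate the IBD matrix on demand at any tested position, rather than storing one full matrix per position.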
In addition to direct improvements in computational and memory efficiency, estimation of IBD functions is a fundamental step towards developing and implementing new, efficient optimization algorithms for high-precision localization of QTL. Here, we discuss and test two approaches for estimating IBD functions based on existing IBD estimation algorithms. Our approaches provide immediately useful techniques for single-QTL analyses in the variance component QTL mapping framework. They will, however, be particularly useful in genome scans for multiple interacting QTL, where improvements in both computational and memory efficiency are key to developing optimization algorithms efficient enough to allow widespread use of this methodology.