Department of Statistics, University of Rajshahi, Rajshahi, Bangladesh.
Institute of Bioinformatics, Zhejiang University, Hangzhou, China.
PLoS One. 2018 Dec 3;13(12):e0208234. doi: 10.1371/journal.pone.0208234. eCollection 2018.
Interval mapping approaches have played a significant role in quantitative trait locus (QTL) mapping, which uses molecular markers to uncover the genetic architecture of diseases or traits. Composite interval mapping (CIM) is one of the superior interval mapping approaches for discovering both linked and unlinked putative QTL positions. However, the estimators of this approach are not robust against phenotypic outliers. As a result, it fails to detect true QTL positions in the presence of outliers. In this study, we investigated the performance of β-Composite Interval Mapping (BetaCIM) for detecting both linked and unlinked important QTL positions from a robustness point of view. The performance of this approach depends on the value of the tuning parameter β. It reduces to the classical CIM approach as β → 0. We described and formulated a cross-validation procedure for selecting the trait-specific optimum value of β. We observed that the optimum value of β depends on both the amount of contaminated observations and their scatteredness. The BetaCIM approach discovers QTL positions similar to those found by classical IM/CIM in the absence of phenotypic outliers, but gives better results in the presence of phenotypic outliers in terms of detecting true QTLs and estimating their effects. We formulated generalized forms of robust QTL analysis and developed an R package named "BetaCIM" that implements this robust approach. Left and right kidney weight data sets from a mouse intercross population (129 S1/SvlmJ × A/J) were analyzed using the BetaCIM, CIM, and IM approaches. For right kidney weight (RKW), CIM and BetaCIM provided similar LOD score profiles, and both approaches identified 3 QTL positions. The IM approach also identified 3 QTL positions. For left kidney weight (LKW), there was evidence of one outlying observation, and in this case the BetaCIM approach identified 2 QTL positions. However, none of these QTLs was significant under the CIM and IM approaches at the 5% level of significance. A gene expression ontology (GEO) search showed that the candidate genes (Otof and A330033J07Rik) of the QTLs identified for LKW were expressed in kidney. Both the simulation and the real data analysis results showed that the BetaCIM approach improves performance over the existing methods in the presence of phenotypic outliers. Otherwise, its performance is almost equal to theirs.
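The role of the tuning parameter β can be illustrated with a much simpler problem than QTL mapping. The sketch below is not the BetaCIM implementation. It assumes Gaussian β-weights of the form exp(-β r² / (2σ²)), a form commonly used in minimum β-divergence estimation (the abstract itself does not state the weight function), and applies them to estimating a single mean and variance. The function names `beta_weights` and `beta_location_scale` are hypothetical. With β = 0 every weight equals 1 and the estimates are the classical sample mean and variance, which mirrors the statement that the method reduces to classical CIM as β → 0.

```python
import math
import statistics


def beta_weights(residuals, sigma2, beta):
    """Assumed Gaussian beta-weight: exp(-beta * r^2 / (2 * sigma2))."""
    return [math.exp(-beta * r * r / (2.0 * sigma2)) for r in residuals]


def beta_location_scale(y, beta, n_iter=100, tol=1e-10):
    """Iteratively reweighted estimate of a mean and variance.

    Illustrative only: a one-sample analogue, not the QTL model.
    """
    # Robust starting values: median and squared, rescaled MAD.
    mu = statistics.median(y)
    mad = statistics.median([abs(v - mu) for v in y])
    sigma2 = (1.4826 * mad) ** 2
    if sigma2 == 0.0:
        sigma2 = statistics.pvariance(y)

    for _ in range(n_iter):
        w = beta_weights([v - mu for v in y], sigma2, beta)
        sw = sum(w)
        new_mu = sum(wi * v for wi, v in zip(w, y)) / sw
        # The factor (1 + beta) offsets the shrinkage caused by downweighting.
        new_sigma2 = (1.0 + beta) * sum(
            wi * (v - new_mu) ** 2 for wi, v in zip(w, y)
        ) / sw
        converged = abs(new_mu - mu) < tol and abs(new_sigma2 - sigma2) < tol
        mu, sigma2 = new_mu, new_sigma2
        if converged:
            break
    return mu, sigma2


# Eight clean observations around 10 plus one phenotypic outlier (made-up data).
y = [9.8, 10.1, 10.0, 9.9, 10.2, 10.0, 9.7, 10.3, 25.0]
mu_classical, _ = beta_location_scale(y, beta=0.0)  # equals the sample mean
mu_robust, _ = beta_location_scale(y, beta=0.3)     # outlier is downweighted
print(mu_classical, mu_robust)
```

In this toy setting the outlier pulls the β = 0 estimate well away from 10, while a positive β gives the outlier a weight near zero. Choosing β in practice would follow the cross-validation procedure described in the paper, which is not reproduced here.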