Probability genotype imputation method and integrated weighted lasso for QTL identification.

Affiliation

Johann Bernoulli Institute for Mathematics and Computer Science, University of Groningen, Groningen 9747 AG, The Netherlands.

Publication information

BMC Genet. 2013 Dec 30;14:125. doi: 10.1186/1471-2156-14-125.

Abstract

BACKGROUND

Many QTL studies share two features: (1) marker information is often missing, and (2) among the many markers involved in the biological process, only a few are causal. In statistics, the second issue falls under the headings "sparsity" and "causal inference". The goal of this work is to develop a two-step statistical methodology for QTL mapping with markers that have binary genotypes. The first step introduces a novel imputation method for missing genotypes. The outcomes of the proposed imputation method are probabilities, which serve as weights in the second step, a weighted lasso. This sparse phenotype inference is used to select a set of markers that are predictive of the trait of interest.
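To make the second step more concrete, the following is a minimal sketch of one common form of weighted lasso, in which each marker gets its own penalty weight. The abstract does not say how the imputation probabilities are turned into weights, so the function name `weighted_lasso`, the `penalty_weights` argument, the -1/+1 genotype coding, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
# Sketch only: weighted lasso by coordinate descent, minimising
#   (1 / 2n) * ||y - X b||^2 + lam * sum_j w_j * |b_j|
# The per-marker weights w_j are hypothetical; the abstract does not specify
# how imputation probabilities are mapped to weights.

def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the lasso proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def weighted_lasso(X, y, penalty_weights, lam, n_iter=100):
    """Return coefficients for rows X (list of lists) and phenotype y."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X beta while beta is all zeros
    col_scale = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_scale[j] == 0.0:
                continue  # constant-zero column carries no information
            # Covariance of marker j with the residual that excludes marker j.
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam * penalty_weights[j]) / col_scale[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta


# Toy data: 8 lines, 3 binary markers coded -1/+1 (mutually orthogonal columns).
m0 = [1, 1, 1, 1, -1, -1, -1, -1]
m1 = [1, 1, -1, -1, 1, 1, -1, -1]
m2 = [1, -1, 1, -1, 1, -1, 1, -1]
X = [list(row) for row in zip(m0, m1, m2)]
y = [2.0 * a + 0.3 * b for a, b in zip(m0, m1)]  # marker 2 has no effect

beta_equal = weighted_lasso(X, y, [1.0, 1.0, 1.0], lam=0.5)
beta_weighted = weighted_lasso(X, y, [1.0, 0.2, 1.0], lam=0.5)
```

With equal weights, the weak effect of the second marker is shrunk to zero; lowering its penalty weight lets it enter the model. This is the sense in which marker-specific weights can steer which markers are selected.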

RESULTS

Simulation studies validate the proposed methodology under a wide range of realistic settings. Furthermore, the methodology outperforms alternative imputation and variable selection methods in these studies. The methodology was applied to an Arabidopsis experiment containing 69 markers for 165 recombinant inbred lines of an F8 generation. The results confirm previously identified regions; however, several new markers are also found. On the basis of the inferred ROC behavior, these markers show good potential for being real, especially for the germination trait Gmax.

CONCLUSIONS

Our imputation method shows higher accuracy, in terms of sensitivity and specificity, than an alternative imputation method. In addition, the proposed weighted lasso outperforms commonly practiced multiple regression, the traditional lasso, and the adaptive lasso with three weighting schemes. This means that, under realistic missing data settings, this methodology can be used for QTL identification.
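As an illustration of the two accuracy measures named above, here is a minimal sketch that computes sensitivity and specificity for imputed binary genotypes against known true genotypes. The function name, the choice of which genotype counts as "positive", and the toy vectors are assumptions for illustration; the evaluation protocol actually used in the study is not described in this abstract.

```python
# Sketch only: sensitivity and specificity of imputed binary genotypes.
# Which genotype is treated as "positive" is an arbitrary choice here.

def sensitivity_specificity(true_genotypes, imputed_genotypes, positive=1):
    """Return (sensitivity, specificity) for two equal-length genotype lists."""
    tp = fn = tn = fp = 0
    for truth, guess in zip(true_genotypes, imputed_genotypes):
        if truth == positive:
            if guess == positive:
                tp += 1  # positive genotype imputed correctly
            else:
                fn += 1  # positive genotype imputed as negative
        else:
            if guess == positive:
                fp += 1  # negative genotype imputed as positive
            else:
                tn += 1  # negative genotype imputed correctly
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical example: genotypes that were masked, then imputed.
truth = [1, 1, 1, 0, 0, 0, 0, 1]
imputed = [1, 0, 1, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(truth, imputed)
print(sens, spec)  # 0.75 0.75
```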

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/32f5/4126192/f47a21cd926f/1471-2156-14-125-1.jpg
