Taylor Sandra L, Pollard Katherine S
Biostatistics Graduate Group, University of California-Davis, One Shields Avenue, Davis, CA 95616, USA.
Genet Res (Camb). 2010 Feb;92(1):39-53. doi: 10.1017/S0016672310000042. Epub 2010 Mar 3.
Researchers are increasingly conducting quantitative trait locus (QTL) mapping in metabolomics and proteomics studies. Data from these studies are often distributed as a point-mass mixture: a spike at zero combined with continuous non-negative measurements. Composite interval mapping (CIM) is a common method for mapping QTLs, but it has been developed only for normally distributed or binary data. Here we propose a two-part CIM method for identifying QTLs when the phenotype is distributed as a point-mass mixture. We compare our new method with existing normal and binary CIM methods through an analysis of metabolomics data from Arabidopsis thaliana. We then conduct a simulation study to further characterize the power and error rate of our two-part CIM method relative to the normal and binary CIM methods. Our results show that two-part CIM has greater power and a lower false positive rate than the other methods when a continuous phenotype is measured with many zero observations.
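To make the two-part idea concrete, the following is a minimal sketch of a two-part likelihood ratio statistic at a single marker. It is not the CIM procedure of the paper: it omits cofactor markers and interval genotype probabilities. It also rests on assumptions not stated in the abstract: two genotype classes coded 0 and 1, a binomial model for zero versus non-zero observations, and a normal model with common variance for the non-zero values. The function names (`two_part_lrt`, `binom_loglik`, `normal_rss`) are hypothetical.

```python
import math


def binom_loglik(k, n):
    """Maximized binomial log-likelihood for k non-zero values out of n."""
    if n == 0 or k == 0 or k == n:
        return 0.0  # the MLE sits on the boundary, so the log-likelihood is 0
    p = k / n
    return k * math.log(p) + (n - k) * math.log(1 - p)


def normal_rss(values):
    """Residual sum of squares around the sample mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def two_part_lrt(phenotype, genotype):
    """Two-part likelihood ratio statistic at one marker (illustrative only).

    Assumes genotype is coded 0/1 and phenotype is non-negative.
    """
    groups = {0: [], 1: []}
    for y, g in zip(phenotype, genotype):
        groups[g].append(y)

    n = {g: len(v) for g, v in groups.items()}
    pos = {g: [y for y in v if y > 0] for g, v in groups.items()}
    k = {g: len(v) for g, v in pos.items()}

    # Part 1: does the proportion of non-zero values differ by genotype?
    ll_alt = binom_loglik(k[0], n[0]) + binom_loglik(k[1], n[1])
    ll_null = binom_loglik(k[0] + k[1], n[0] + n[1])
    lrt_binary = 2.0 * (ll_alt - ll_null)

    # Part 2: does the mean of the non-zero values differ by genotype?
    # Normal model with common variance: LRT = m * log(RSS_null / RSS_alt).
    m = k[0] + k[1]
    rss_alt = normal_rss(pos[0]) + normal_rss(pos[1])
    rss_null = normal_rss(pos[0] + pos[1])
    if k[0] > 0 and k[1] > 0 and rss_alt > 0:
        lrt_continuous = m * math.log(rss_null / rss_alt)
    else:
        # Undefined or degenerate case; it contributes 0 in this sketch.
        lrt_continuous = 0.0

    total = lrt_binary + lrt_continuous
    # Nominal p-value from a chi-square reference with 2 degrees of freedom,
    # whose survival function is exp(-x / 2). This is an assumption of the
    # sketch; it is conservative when the continuous part is skipped.
    p_value = math.exp(-total / 2.0)
    return {
        "lrt_binary": lrt_binary,
        "lrt_continuous": lrt_continuous,
        "lrt_total": total,
        "p_value": p_value,
    }
```

A call such as `two_part_lrt([0, 0, 0, 0, 1.0, 2.0, 3.0, 4.0], [0, 0, 0, 0, 1, 1, 1, 1])` attributes the whole signal to the binary part, because one genotype class has no non-zero observations to compare. Summing the two parts lets a QTL be detected whether it shifts the proportion of zeros, the level of the non-zero values, or both.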