Department of Anatomy, Embryology & Physiology, Academic Medical Center, Meibergdreef 15, 1100AZ Amsterdam, The Netherlands.
Methods. 2013 Jan;59(1):32-46. doi: 10.1016/j.ymeth.2012.08.011. Epub 2012 Sep 3.
RNA transcripts such as mRNA or microRNA are frequently used as biomarkers to determine disease state or response to therapy. Reverse transcription (RT) in combination with quantitative PCR (qPCR) has become the method of choice to quantify small amounts of such RNA molecules. In parallel with the democratization of RT-qPCR and its increasing use in biomedical research and biomarker discovery, we have witnessed a growth in the number of gene expression data analysis methods. Most of these methods are based on the principle that the position of the amplification curve with respect to the cycle axis is a measure of the initial target quantity: the later the curve, the lower the target quantity. However, the methods differ in the mathematical algorithms used to determine this position, as well as in the way the efficiency of the PCR reaction (the fold increase of product per cycle) is determined and applied in the calculations. Moreover, there is dispute about whether the PCR efficiency is constant or continuously decreasing. Together, this has led to the development of different methods to analyze amplification curves. In published comparisons of these methods, available algorithms were typically applied in a restricted or outdated way, which does not do them justice. Therefore, we aimed to develop a framework for robust and unbiased assessment of curve analysis performance, in which various publicly available curve analysis methods were thoroughly compared using a previously published large clinical data set (Vermeulen et al., 2009) [11]. The original developers of these methods applied their algorithms and are co-authors of this study. We assessed the curve analysis methods' impact on transcriptional biomarker identification in terms of expression level, statistical significance, and patient-classification accuracy.
The concentration series per gene, together with data sets from unpublished technical performance experiments, were analyzed in order to assess the algorithms' precision, bias, and resolution. While large differences exist between methods when considering the technical performance experiments, most methods perform relatively well on the biomarker data. The data and the analysis results per method are made available to serve as a benchmark for further development and evaluation of qPCR curve analysis methods (http://qPCRDataMethods.hfrc.nl).
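The principle stated above (the later the curve, the lower the initial target quantity) can be illustrated with a minimal sketch. It assumes a constant PCR efficiency E, which the abstract notes is itself disputed, and uses the textbook relations N0 = Nq / E^Cq and E = 10^(-1/slope) for a dilution series. It is not the algorithm of any of the compared methods, and the function names, the threshold value, and the example numbers are hypothetical.

```python
import math


def starting_quantity(threshold, efficiency, cq):
    """Initial target quantity N0 = Nq / E**Cq.

    Assumes a constant efficiency E (fold increase per cycle) up to the
    quantification cycle Cq at which the curve reaches the threshold Nq.
    """
    return threshold / efficiency ** cq


def efficiency_from_dilution_series(log10_inputs, cq_values):
    """Estimate E from a standard curve: E = 10 ** (-1 / slope).

    The slope is the ordinary least-squares slope of Cq versus
    log10(input quantity).
    """
    n = len(cq_values)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(cq_values) / n
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_inputs, cq_values))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    slope = sxy / sxx
    return 10 ** (-1 / slope)


# Hypothetical example: two reactions with the same efficiency (E = 2)
# whose curves cross the threshold three cycles apart.
early = starting_quantity(threshold=1.0, efficiency=2.0, cq=20)
late = starting_quantity(threshold=1.0, efficiency=2.0, cq=23)
fold_difference = early / late  # the later curve started with less target

# Hypothetical ideal 10-fold dilution series with perfect doubling:
# each 10-fold dilution delays Cq by log2(10) cycles.
log10_inputs = [0, 1, 2, 3]
cq_values = [30 - x * math.log2(10) for x in log10_inputs]
estimated_efficiency = efficiency_from_dilution_series(log10_inputs, cq_values)
```

Because the efficiency enters as an exponent, a small error in E is amplified over the number of cycles, which is one reason the way each method determines and applies the efficiency matters for the comparison.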