Center for Bioinformatics, Division of Systems Biology, National Center for Toxicological Research, US Food & Drug Administration, 3900 NCTR Rd, Jefferson, Arkansas, USA.
BMC Bioinformatics. 2011 Oct 18;12 Suppl 10(Suppl 10):S3. doi: 10.1186/1471-2105-12-S10-S3.
Genomic biomarkers play an increasing role in both preclinical and clinical applications. Development of genomic biomarkers with microarrays is an area of intensive investigation. However, despite sustained effort, developing microarray-based predictive models (i.e., genomic biomarkers) that reliably predict an observed or measured outcome (i.e., endpoint) for unknown samples in preclinical and clinical practice remains a considerable challenge. No straightforward guidelines exist for selecting a single model that will perform best when presented with unknown samples. In the second phase of the MicroArray Quality Control (MAQC-II) project, 36 analysis teams produced a large number of models for 13 preclinical and clinical endpoints. Before external validation was performed, each team nominated one model per endpoint (referred to here as 'nominated models'), from which MAQC-II experts selected 13 'candidate models' to represent the best model for each endpoint. Both the nominated and candidate models from MAQC-II provide benchmarks for assessing other methodologies for developing microarray-based predictive models.
We developed a simple ensemble method by taking a number of the top performing models from cross-validation and developing an ensemble model for each of the MAQC-II endpoints. We compared the ensemble models with both nominated and candidate models from MAQC-II using blinded external validation.
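The general idea of building an ensemble from the top cross-validated models can be illustrated with a minimal sketch. The abstract does not state how many models were kept or how their predictions were combined, so the choice of the top `k` models and the majority vote below are assumptions, and the names `select_top_models`, `ensemble_predict`, the toy models, and the scores are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, List, Sequence

# A "model" here is any callable mapping a feature vector to a class label.
Model = Callable[[Sequence[float]], int]


def select_top_models(cv_scores: Dict[str, float], k: int) -> List[str]:
    """Return the names of the k models with the highest cross-validation score."""
    ranked = sorted(cv_scores, key=lambda name: cv_scores[name], reverse=True)
    return ranked[:k]


def ensemble_predict(models: Dict[str, Model], names: List[str],
                     sample: Sequence[float]) -> int:
    """Combine the selected models by majority vote (an assumed combination rule)."""
    votes = Counter(models[name](sample) for name in names)
    # Use an odd number of models for binary endpoints to avoid tied votes.
    return votes.most_common(1)[0][0]


# Hypothetical toy classifiers standing in for trained microarray models.
models: Dict[str, Model] = {
    "m_a": lambda x: int(x[0] > 0.5),
    "m_b": lambda x: int(x[1] > 0.5),
    "m_c": lambda x: int(x[0] + x[1] > 1.0),
    "m_d": lambda x: 0,
}

# Hypothetical cross-validation scores, one per model.
cv_scores = {"m_a": 0.81, "m_b": 0.78, "m_c": 0.84, "m_d": 0.50}

top = select_top_models(cv_scores, k=3)
prediction = ensemble_predict(models, top, [0.9, 0.2])
```

In this sketch the lowest-scoring model is excluded before voting, so the final prediction depends on the agreement of several well-performing models and not on whichever single model happened to score best in cross-validation.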
For 10 of the 13 MAQC-II endpoints originally analyzed by the MAQC-II data analysis team from the National Center for Toxicological Research (NCTR), the ensemble models achieved equal or better predictive performance than the NCTR nominated models. Additionally, the ensemble models had performance comparable to the MAQC-II candidate models. Most ensemble models also had better performance than the nominated models generated by five other MAQC-II data analysis teams that analyzed all 13 endpoints.
Our findings suggest that an ensemble method can often attain a higher average predictive performance in an external validation set than a corresponding "optimized" model method. Using an ensemble method to determine a final model is a potentially important supplement to the good modeling practices recommended by the MAQC-II project for developing microarray-based genomic biomarkers.