The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models.

Affiliation

National Center for Toxicological Research, US Food and Drug Administration, Jefferson, Arkansas, USA.

Publication information

Nat Biotechnol. 2010 Aug;28(8):827-38. doi: 10.1038/nbt.1665. Epub 2010 Jul 30.

DOI:10.1038/nbt.1665

PMID:20676074

Full text link: https://pmc.ncbi.nlm.nih.gov/articles/PMC3315840/

Abstract

Gene expression data from microarrays are being applied to predict preclinical and clinical endpoints, but the reliability of these predictions has not been established. In the MAQC-II project, 36 independent teams analyzed six microarray data sets to generate predictive models for classifying a sample with respect to one of 13 endpoints indicative of lung or liver toxicity in rodents, or of breast cancer, multiple myeloma or neuroblastoma in humans. In total, >30,000 models were built using many combinations of analytical methods. The teams generated predictive models without knowing the biological meaning of some of the endpoints and, to mimic clinical reality, tested the models on data that had not been used for training. We found that model performance depended largely on the endpoint and team proficiency and that different approaches generated models of similar performance. The conclusions and recommendations from MAQC-II should be useful for regulatory agencies, study committees and independent investigators that evaluate methods for global gene expression analysis.
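The abstract notes that models were tested on data that had not been used for training. The following is a minimal sketch of that held-out validation idea only. It assumes synthetic expression profiles and a simple nearest-centroid classifier, neither of which is taken from the study, and all function and variable names are hypothetical.

```python
import random
from statistics import mean


def make_samples(n, n_genes, shift, rng):
    """Build synthetic (profile, label) pairs.

    Assumption for illustration: class 1 is shifted upward in the
    first 5 genes; every other value is standard Gaussian noise.
    """
    samples = []
    for i in range(n):
        label = i % 2
        profile = [
            rng.gauss(shift if (label == 1 and g < 5) else 0.0, 1.0)
            for g in range(n_genes)
        ]
        samples.append((profile, label))
    return samples


def fit_centroids(train):
    """Compute the mean profile of each class from training samples only."""
    centroids = {}
    for label in {y for _, y in train}:
        rows = [x for x, y in train if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, profile):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(profile, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def accuracy(centroids, samples):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(predict(centroids, x) == y for x, y in samples)
    return hits / len(samples)


rng = random.Random(0)
train = make_samples(60, 20, 2.0, rng)        # used to fit the model
validation = make_samples(40, 20, 2.0, rng)   # never seen during fitting

model = fit_centroids(train)
train_acc = accuracy(model, train)
val_acc = accuracy(model, validation)
print(f"training accuracy:   {train_acc:.2f}")
print(f"validation accuracy: {val_acc:.2f}")
```

The validation accuracy is the figure that matters here: it is computed on samples that played no part in fitting the centroids, so it is not inflated by the model having seen those samples.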

Similar articles

Selecting a single model or combining multiple models for microarray-based classifier development?--a comparative analysis based on large and diverse datasets generated from the MAQC-II project.

BMC Bioinformatics. 2011 Oct 18;12 Suppl 10(Suppl 10):S3. doi: 10.1186/1471-2105-12-S10-S3.

Feature selection and classification of MAQC-II breast cancer and multiple myeloma microarray gene expression data.

PLoS One. 2009 Dec 11;4(12):e8250. doi: 10.1371/journal.pone.0008250.

k-Nearest neighbor models for microarray gene expression analysis and clinical outcome prediction.

Pharmacogenomics J. 2010 Aug;10(4):292-309. doi: 10.1038/tpj.2010.56.

Microarrays in the clinic.

Nat Biotechnol. 2010 Aug;28(8):810-2. doi: 10.1038/nbt0810-810.

Comparison of RNA-seq and microarray-based models for clinical endpoint prediction.

Genome Biol. 2015 Jun 25;16(1):133. doi: 10.1186/s13059-015-0694-1.

Evaluation of gene expression data generated from expired Affymetrix GeneChip® microarrays using MAQC reference RNA samples.

BMC Bioinformatics. 2010 Oct 7;11 Suppl 6(Suppl 6):S10. doi: 10.1186/1471-2105-11-S6-S10.

Cross-platform comparison of SYBR Green real-time PCR with TaqMan PCR, microarrays and other gene expression measurement technologies evaluated in the MicroArray Quality Control (MAQC) study.

BMC Genomics. 2008 Jul 11;9:328. doi: 10.1186/1471-2164-9-328.

Comparison of performance of one-color and two-color gene-expression analyses in predicting clinical endpoints of neuroblastoma patients.

Pharmacogenomics J. 2010 Aug;10(4):258-66. doi: 10.1038/tpj.2010.53.

Reproducibility of microarray data: a further analysis of microarray quality control (MAQC) data.

BMC Bioinformatics. 2007 Oct 25;8:412. doi: 10.1186/1471-2105-8-412.

Cited by

Multi-omics Quality Assessment in Personalized Medicine Through European Infrastructure for Translational Medicine (EATRIS): An Overview.

Phenomics. 2025 Apr 1;5(3):311-325. doi: 10.1007/s43657-024-00170-0. eCollection 2025 Jun.

A prognostic model for multiple myeloma based on lipid metabolism related genes.

Zhong Nan Da Xue Xue Bao Yi Xue Ban. 2025 Apr 28;50(4):517-530. doi: 10.11817/j.issn.1672-7347.2025.240592.

The bone marrow immune ecosystem shapes daratumumab acquired resistance in plasma cell myeloma.

Leukemia. 2025 Aug 1. doi: 10.1038/s41375-025-02712-5.

A Monocyte-Driven Prognostic Model for Multiple Myeloma: Multi-Omics and Machine Learning Insights.

Blood Lymphat Cancer. 2025 Jun 16;15:21-37. doi: 10.2147/BLCTT.S517354. eCollection 2025.

Functional multiomics reveals genetic and pharmacologic regulation of surface CD38 in multiple myeloma.

Blood Neoplasia. 2024 Jun 10;1(3):100025. doi: 10.1016/j.bneo.2024.100025. eCollection 2024 Sep.

HABiC: an algorithm based on the exact computation of the Kantorovich-Rubinstein optimizer for binary classification in transcriptomics.

Bioinformatics. 2025 Jun 2;41(6). doi: 10.1093/bioinformatics/btaf310.

An updated comparison of microarray and RNA-seq for concentration response transcriptomic study: case studies with two cannabinoids, cannabichromene and cannabinol.

BMC Genomics. 2025 Apr 23;26(1):392. doi: 10.1186/s12864-025-11548-3.

Is regulatory science ready for artificial intelligence?

NPJ Digit Med. 2025 Apr 10;8(1):200. doi: 10.1038/s41746-025-01596-0.

Evaluating the three-level approach of the U-smile method for imbalanced binary classification.

PLoS One. 2025 Apr 10;20(4):e0321661. doi: 10.1371/journal.pone.0321661. eCollection 2025.

Individualized dynamic risk assessment and treatment selection for multiple myeloma.

Br J Cancer. 2025 Jun;132(10):922-936. doi: 10.1038/s41416-025-02987-6. Epub 2025 Apr 1.
