The effect of retrospective sampling on estimates of prediction error for multifactor dimensionality reduction.

Author information

Winham Stacey J, Motsinger-Reif Alison A

Affiliation

Department of Statistics, North Carolina State University, Raleigh, 27695, USA.

Publication information

Ann Hum Genet. 2011 Jan;75(1):46-61. doi: 10.1111/j.1469-1809.2010.00587.x.

Abstract

The standard in genetic association studies of complex diseases is replication and validation of positive results, with an emphasis on assessing the predictive value of associations. In response to this need, a number of analytical approaches have been developed to identify predictive models that account for complex genetic etiologies. Multifactor Dimensionality Reduction (MDR) is a commonly used, highly successful method designed to evaluate potential gene-gene interactions. MDR relies on classification error in a cross-validation framework to rank and evaluate potentially predictive models. Previous work has demonstrated the high power of MDR, but has not considered the accuracy and variance of the MDR prediction error estimate. Here, we evaluate the bias and variance of the MDR error estimate as both a retrospective and a prospective estimator and show that MDR can both underestimate and overestimate error. We argue that a prospective error estimate is necessary if MDR models are used for prediction, and propose a bootstrap resampling estimate, integrating population prevalence, to accurately estimate prospective error. We demonstrate that this bootstrap estimate is preferable for prediction to the error estimate currently produced by MDR. While demonstrated with MDR, the proposed estimation is applicable to all data-mining methods that use similar estimates.
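
To make the distinction between retrospective and prospective error concrete, the following is a minimal sketch and not the authors' published procedure. It assumes binary labels (1 = case, 0 = control), a classifier whose predictions are already available, and a known population prevalence. The function names `prevalence_weighted_error` and `bootstrap_prospective_error`, the bootstrap sample size, and the toy data are all hypothetical choices made for illustration.

```python
import random


def prevalence_weighted_error(y_true, y_pred, prevalence):
    """Error expected in a population with the given prevalence.

    Weights the error rate among cases and the error rate among
    controls by their population shares instead of their shares
    in the case-control sample.
    """
    case_preds = [p for t, p in zip(y_true, y_pred) if t == 1]
    control_preds = [p for t, p in zip(y_true, y_pred) if t == 0]
    fnr = sum(1 for p in case_preds if p == 0) / len(case_preds)
    fpr = sum(1 for p in control_preds if p == 1) / len(control_preds)
    return prevalence * fnr + (1 - prevalence) * fpr


def bootstrap_prospective_error(y_true, y_pred, prevalence,
                                n_boot=1000, size=1000, seed=0):
    """Bootstrap mean and variance of the prospective error.

    Each bootstrap sample draws cases and controls with replacement
    so that the case fraction matches the assumed prevalence.
    """
    rng = random.Random(seed)
    # 1 marks a misclassified subject, 0 a correctly classified one.
    case_err = [int(p != 1) for t, p in zip(y_true, y_pred) if t == 1]
    control_err = [int(p != 0) for t, p in zip(y_true, y_pred) if t == 0]
    n_cases = round(prevalence * size)
    estimates = []
    for _ in range(n_boot):
        sample = (rng.choices(case_err, k=n_cases)
                  + rng.choices(control_err, k=size - n_cases))
        estimates.append(sum(sample) / size)
    mean = sum(estimates) / n_boot
    var = sum((e - mean) ** 2 for e in estimates) / (n_boot - 1)
    return mean, var


# Toy balanced case-control sample: 100 cases and 100 controls.
# The hypothetical classifier misses 20 cases and flags 40 controls.
y_true = [1] * 100 + [0] * 100
y_pred = [1] * 80 + [0] * 20 + [1] * 40 + [0] * 60

retrospective = sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)
prospective = prevalence_weighted_error(y_true, y_pred, prevalence=0.1)
boot_mean, boot_var = bootstrap_prospective_error(y_true, y_pred, 0.1)
```

In this toy sample the retrospective error is 60/200 = 0.30, while the prevalence-weighted error at an assumed prevalence of 0.1 is 0.1 × 0.2 + 0.9 × 0.4 = 0.38, so the case-control estimate would understate the error seen in the population. With a classifier whose mistakes fall mostly on cases, the bias would run the other way.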



