Two-sample comparison based on prediction error, with applications to candidate gene association studies.

Authors

Yu K, Martin R, Rothman N, Zheng T, Lan Q

Affiliation

Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Bethesda, MD 20892, USA.

Publication

Ann Hum Genet. 2007 Jan;71(Pt 1):107-18. doi: 10.1111/j.1469-1809.2006.00306.x.

DOI: 10.1111/j.1469-1809.2006.00306.x
PMID: 17227481

Abstract

To take advantage of the increasingly available high-density SNP maps across the genome, various tests that compare multilocus genotypes or estimated haplotypes between cases and controls have been developed for candidate gene association studies. Here we view this two-sample testing problem from the perspective of supervised machine learning and propose a new association test. The approach adopts the flexible and easy-to-understand classification tree model as the learning machine, and uses the estimated prediction error of the resulting prediction rule as the test statistic. This procedure not only provides an association test but also generates a prediction rule that can be useful in understanding the mechanisms underlying complex disease. Under the set-up of a haplotype-based transmission/disequilibrium test (TDT) type of analysis, we find through simulation studies that the proposed procedure has the correct type I error rates and is robust to population stratification. The power of the proposed procedure is sensitive to the chosen prediction error estimator. Among commonly used prediction error estimators, the .632+ estimator results in a test that has the best overall performance. We also find that the test using the .632+ estimator is more powerful than the standard single-point TDT analysis, the Pearson's goodness-of-fit test based on estimated haplotype frequencies, and two haplotype-based global tests implemented in the genetic analysis package FBAT. To illustrate the application of the proposed method in population-based association studies, we use the procedure to study the association between non-Hodgkin lymphoma and the IL10 gene.
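
The procedure described in the abstract can be sketched in code. The following is a minimal illustration, not the authors' implementation: it substitutes a depth-1 tree (a stump) for the full classification tree, works on numeric genotype codes rather than estimated haplotypes, and assumes the null distribution is obtained by permuting case/control labels, which the abstract does not state. The function names (`fit_stump`, `err_632plus`, `permutation_pvalue`) and the default settings are hypothetical. The .632+ weighting follows the commonly cited form due to Efron and Tibshirani.

```python
import random


def majority(labels, default):
    """Majority class among 0/1 labels; fall back to `default` if empty."""
    if not labels:
        return default
    return int(2 * sum(labels) > len(labels))


def fit_stump(X, y):
    """Fit a depth-1 classification tree by minimizing training error.

    Assumption: a stump stands in for the full classification tree.
    """
    default = majority(y, 0)
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            pl, pr = majority(left, default), majority(right, default)
            err = sum(yi != pl for yi in left) + sum(yi != pr for yi in right)
            if best is None or err < best[0]:
                best = (err, j, t, pl, pr)
    _, j, t, pl, pr = best
    return lambda row: pl if row[j] <= t else pr


def err_632plus(X, y, rng, n_boot=30):
    """Estimate prediction error with the .632+ bootstrap estimator."""
    n = len(y)
    model = fit_stump(X, y)
    preds = [model(row) for row in X]
    resub = sum(p != yi for p, yi in zip(preds, y)) / n  # training error

    # Leave-one-out bootstrap error: score each sample only with models
    # trained on bootstrap resamples that did not contain it.
    miss, cnt = [0] * n, [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_bag = set(idx)
        m = fit_stump([X[i] for i in idx], [y[i] for i in idx])
        for i in range(n):
            if i not in in_bag:
                cnt[i] += 1
                miss[i] += m(X[i]) != y[i]
    rates = [miss[i] / cnt[i] for i in range(n) if cnt[i]]
    err1 = sum(rates) / len(rates) if rates else resub

    # No-information error rate for a binary outcome.
    p1, q1 = sum(y) / n, sum(preds) / n
    gamma = p1 * (1 - q1) + (1 - p1) * q1

    err1 = min(err1, gamma)
    if err1 > resub and gamma > resub:
        r = (err1 - resub) / (gamma - resub)  # relative overfitting rate
    else:
        r = 0.0
    w = 0.632 / (1 - 0.368 * r)
    return (1 - w) * resub + w * err1


def permutation_pvalue(X, y, n_perm=99, seed=0):
    """Use the estimated prediction error as the test statistic.

    Assumption: the null is generated by permuting the case/control labels.
    A small error is evidence of association, so the p-value counts
    permutations whose error is at most the observed one.
    """
    rng = random.Random(seed)
    observed = err_632plus(X, y, rng)
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if err_632plus(X, y_perm, rng) <= observed:
            hits += 1
    return observed, (1 + hits) / (1 + n_perm)
```

Calling `permutation_pvalue(X, y)` with `X` as a list of genotype vectors (for example, 0/1/2 minor-allele counts per SNP) and `y` as 0/1 case status returns the observed error estimate and a permutation p-value. Because the fitted rule is retained inside `err_632plus`, a fuller version could also return it, matching the abstract's point that the test yields an interpretable prediction rule as a by-product.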

Similar articles

1. Two-sample comparison based on prediction error, with applications to candidate gene association studies.
   Ann Hum Genet. 2007 Jan;71(Pt 1):107-18. doi: 10.1111/j.1469-1809.2006.00306.x.
2. Using tree-based recursive partitioning methods to group haplotypes for increased power in association studies.
   Ann Hum Genet. 2005 Sep;69(Pt 5):577-89. doi: 10.1111/j.1529-8817.2005.00193.x.
3. Tests of association between quantitative traits and haplotypes in a reduced-dimensional space.
   Ann Hum Genet. 2005 Nov;69(Pt 6):715-32. doi: 10.1111/j.1529-8817.2005.00216.x.
4. A haplotype similarity based transmission/disequilibrium test under founder heterogeneity.
   Ann Hum Genet. 2005 Jul;69(Pt 4):455-67. doi: 10.1046/j.1529-8817.2005.00168.x.
5. Resampling-based multiple hypothesis testing procedures for genetic case-control association studies.
   Genet Epidemiol. 2006 Sep;30(6):495-507. doi: 10.1002/gepi.20162.
6. Haplotype uncertainty in association studies.
   Genet Epidemiol. 2007 May;31(4):348-57. doi: 10.1002/gepi.20215.
7. A new association test using haplotype similarity.
   Genet Epidemiol. 2007 Sep;31(6):577-93. doi: 10.1002/gepi.20230.
8. Haplotype sharing transmission/disequilibrium tests that allow for genotyping errors.
   Genet Epidemiol. 2005 May;28(4):341-51. doi: 10.1002/gepi.20066.
9. Adaptive transmission disequilibrium test for family trio design.
   Stat Appl Genet Mol Biol. 2009;8:Article30. doi: 10.2202/1544-6115.1451. Epub 2009 Jun 23.
10. Global transmission/disequilibrium tests based on haplotype sharing in multiple candidate genes.
    Genet Epidemiol. 2005 Dec;29(4):323-35. doi: 10.1002/gepi.20102.

Cited by

1. Better-than-chance classification for signal detection.
   Biostatistics. 2021 Apr 10;22(2):365-380. doi: 10.1093/biostatistics/kxz035.
2. A fast and powerful tree-based association test for detecting complex joint effects in case-control studies.
   Bioinformatics. 2014 Aug 1;30(15):2171-8. doi: 10.1093/bioinformatics/btu186. Epub 2014 Apr 9.
3. The future of primary intraocular lymphoma (retinal lymphoma).
   Ocul Immunol Inflamm. 2009 Nov-Dec;17(6):375-9. doi: 10.3109/09273940903434804.
4. A partially linear tree-based regression model for multivariate outcomes.
   Biometrics. 2010 Mar;66(1):89-96. doi: 10.1111/j.1541-0420.2009.01235.x. Epub 2009 May 7.