Detecting high-order interactions of single nucleotide polymorphisms using genetic programming.

Authors

Nunkesser Robin, Bernholt Thorsten, Schwender Holger, Ickstadt Katja, Wegener Ingo

Affiliation

Collaborative Research Center 475, Department of Computer Science, University of Dortmund, Dortmund, Germany.

Publication details

Bioinformatics. 2007 Dec 15;23(24):3280-8. doi: 10.1093/bioinformatics/btm522. Epub 2007 Nov 15.

DOI: 10.1093/bioinformatics/btm522
PMID: 18006552

Abstract

MOTIVATION

Complex diseases such as cancer are assumed to be caused not by individual single nucleotide polymorphisms (SNPs) but by high-order interactions of SNPs. Therefore, one of the major goals of genetic association studies concerned with such genotype data is the identification of these high-order interactions. This search is further impeded by the fact that these interactions often explain only a relatively small subgroup of patients. Unfortunately, most of the feature selection methods proposed in the literature fail at this task, since they either can identify only individual variables or low-order interactions, or try to find rules that explain a high percentage of the observations. In this article, we present a procedure based on genetic programming and multi-valued logic that enables the identification of high-order interactions of categorical variables such as SNPs. This method, called GPAS, can be used not only for feature selection but also for discrimination.
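
To make the idea of a multi-valued logic expression over SNPs concrete, the following is a minimal sketch and not the GPAS implementation. It assumes, purely for illustration, that genotypes are coded 0, 1, 2, that a candidate interaction is a disjunction of conjunctions of literals of the form "SNP i equals v" or "SNP i does not equal v", and that a candidate is judged by how many cases and controls it covers. All names (`Literal`, `dnf_holds`, `count_explained`) and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical encoding: (snp_index, genotype_value, negated).
# (0, 2, False) reads "SNP 0 == 2"; (0, 2, True) reads "SNP 0 != 2".
Literal = Tuple[int, int, bool]
Monomial = List[Literal]   # conjunction (AND) of literals
DNF = List[Monomial]       # disjunction (OR) of monomials


def literal_holds(genotypes: Sequence[int], lit: Literal) -> bool:
    """Evaluate one multi-valued literal on one individual's genotypes."""
    idx, value, negated = lit
    match = genotypes[idx] == value
    return (not match) if negated else match


def monomial_holds(genotypes: Sequence[int], monomial: Monomial) -> bool:
    """A monomial holds only if every one of its literals holds."""
    return all(literal_holds(genotypes, lit) for lit in monomial)


def dnf_holds(genotypes: Sequence[int], dnf: DNF) -> bool:
    """The expression holds if at least one monomial holds."""
    return any(monomial_holds(genotypes, m) for m in dnf)


def count_explained(data, labels, dnf: DNF) -> Tuple[int, int]:
    """Return (cases covered, controls covered); label 1 = case, 0 = control."""
    cases = sum(1 for g, y in zip(data, labels) if y == 1 and dnf_holds(g, dnf))
    controls = sum(1 for g, y in zip(data, labels) if y == 0 and dnf_holds(g, dnf))
    return cases, controls


# Toy genotype matrix: rows are individuals, columns are SNPs (made-up values).
data = [
    [0, 1, 2],
    [0, 1, 1],
    [1, 1, 2],
    [0, 2, 2],
    [0, 1, 2],
]
labels = [1, 1, 0, 0, 0]

# Third-order interaction: SNP0 == 0 AND SNP1 == 1 AND SNP2 != 1.
interaction: DNF = [[(0, 0, False), (1, 1, False), (2, 1, True)]]

print(count_explained(data, labels, interaction))
```

In a genetic programming search, counts of this kind could feed a fitness function that rewards expressions covering many cases and few controls; the actual fitness criteria and search operators used by GPAS are not stated in the abstract and are not modeled here.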

RESULTS

In an application to the genotype data from the GENICA study, an association study concerned with sporadic breast cancer, GPAS identifies high-order interactions of SNPs that lead to a considerably increased breast cancer risk for different subsets of patients and that are not found by other feature selection methods. As an application to a subset of the HapMap data shows, GPAS is not restricted to association studies comprising several tens of SNPs, but can also be employed to analyze whole-genome data.

AVAILABILITY

Software can be downloaded from http://ls2-www.cs.uni-dortmund.de/~nunkesser/#Software
