A multi-threaded approach to genotype pattern mining for detecting digenic disease genes.

Author information

Zhang Qingrun, Bhatia Muskan, Park Taesung, Ott Jurg

Affiliations

Department of Mathematics and Statistics, University of Calgary, Calgary, AB, Canada.

Department of Biochemistry and Molecular Biology, University of Calgary, Calgary, AB, Canada.

Publication information

Front Genet. 2023 Aug 24;14:1222517. doi: 10.3389/fgene.2023.1222517. eCollection 2023.

DOI: 10.3389/fgene.2023.1222517

PMID: 37693313

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10483394/

Abstract

To locate disease-causing DNA variants on the human gene map, the customary approach has been to carry out a genome-wide association study one variant at a time, testing for genotype frequency differences between individuals affected and unaffected by disease. So-called digenic traits are due to the combined effects of two variants, often on different chromosomes, while each variant on its own may have little or no effect on disease. Machine learning approaches have been developed to find variant pairs underlying digenic traits. However, many of these methods have such large memory requirements that only small datasets can be analyzed. The increasing availability of desktop computers with large numbers of processors, together with suitable programming that distributes the workload evenly over all processors in a machine, makes a new and relatively straightforward approach possible: evaluating all existing variant and genotype pairs for disease association. We present a prototype of such a method with two components and demonstrate its advantages over existing implementations of well-known algorithms. We apply these methods to published case-control datasets on age-related macular degeneration and Parkinson disease and construct an ROC curve for a large set of genotype patterns.
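The exhaustive idea described in the abstract (score every variant pair and every genotype pair, spreading the pairs over workers) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 0/1/2 genotype coding, the toy data, and the choice of a Pearson chi-square statistic on a 2x2 table (pattern present or absent versus case or control) are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def scan_pair(genotypes, is_case, i, j):
    """Score every observed genotype pattern (up to 9) for variants i and j."""
    n_cases = sum(is_case)
    n_controls = len(is_case) - n_cases
    counts = {}  # (genotype at i, genotype at j) -> [cases, controls]
    for row, case in zip(genotypes, is_case):
        cell = counts.setdefault((row[i], row[j]), [0, 0])
        cell[0 if case else 1] += 1
    results = []
    for (gi, gj), (ca, co) in counts.items():
        # 2x2 table: carriers of the pattern vs. non-carriers, cases vs. controls
        stat = chi2_2x2(ca, co, n_cases - ca, n_controls - co)
        results.append((stat, i, j, gi, gj, ca, co))
    return results


def scan_all_pairs(genotypes, is_case, workers=4):
    """Distribute all variant pairs over a worker pool and rank the patterns."""
    pairs = list(combinations(range(len(genotypes[0])), 2))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = pool.map(lambda p: scan_pair(genotypes, is_case, *p), pairs)
        hits = [hit for chunk in chunks for hit in chunk]
    return sorted(hits, reverse=True)


# Hypothetical toy data: rows are individuals, columns are variants coded 0/1/2.
# Only the joint pattern (variant 0 == 2, variant 2 == 2) separates cases from controls.
genotypes = [
    [2, 0, 2], [2, 1, 2], [2, 2, 2], [2, 0, 2],  # cases
    [2, 0, 0], [0, 1, 2], [0, 2, 0], [2, 1, 0],  # controls
]
is_case = [True] * 4 + [False] * 4

hits = scan_all_pairs(genotypes, is_case)
print(hits[0])  # best-scoring (statistic, i, j, genotype_i, genotype_j, cases, controls)
```

In CPython, threads running pure-Python code are serialized by the global interpreter lock, so this sketch shows only how the work is partitioned; a real speedup would need a process pool (for example `concurrent.futures.ProcessPoolExecutor` with a module-level worker function) or compiled code. With m variants the scan covers m(m-1)/2 pairs, which is why an even distribution of the workload matters.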
