Fully exploiting SNP arrays: a systematic review on the tools to extract underlying genomic structure.

Affiliation

Bioinformatics Research Group in Epidemiology of ISGlobal.

Publication information

Brief Bioinform. 2022 Mar 10;23(2). doi: 10.1093/bib/bbac043.

DOI:10.1093/bib/bbac043

PMID:35211719

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8921734/

Abstract

Single nucleotide polymorphisms (SNPs) are the most abundant type of genomic variation and the most accessible to genotype in large cohorts. However, they individually explain a small proportion of phenotypic differences between individuals. Ancestry, collective SNP effects, structural variants, somatic mutations or even differences in historic recombination can potentially explain a high percentage of genomic divergence. These genetic differences can be infrequent or laborious to characterize; however, many of them leave distinctive marks on the SNPs across the genome allowing their study in large population samples. Consequently, several methods have been developed over the last decade to detect and analyze different genomic structures using SNP arrays, to complement genome-wide association studies and determine the contribution of these structures to explain the phenotypic differences between individuals. We present an up-to-date collection of available bioinformatics tools that can be used to extract relevant genomic information from SNP array data including population structure and ancestry; polygenic risk scores; identity-by-descent fragments; linkage disequilibrium; heritability and structural variants such as inversions, copy number variants, genetic mosaicisms and recombination histories. From a systematic review of recently published applications of the methods, we describe the main characteristics of R packages, command-line tools and desktop applications, both free and commercial, to help make the most of a large amount of publicly available SNP data.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ad51/8921734/6874d41f6d8e/bbac043f1.jpg
