A Novel Quality-Control Procedure to Improve the Accuracy of Rare Variant Calling in SNP Arrays.

Authors

Sun Ting-Hsuan, Shao Yu-Hsuan Joni, Mao Chien-Lin, Hung Miao-Neng, Lo Yi-Yun, Ko Tai-Ming, Hsiao Tzu-Hung

Affiliations

Department of Biological Science and Technology, National Yang Ming Chiao Tung University, Hsinchu, Taiwan.

Department of Medical Research, Taichung Veterans General Hospital, Taichung, Taiwan.

Publication information

Front Genet. 2021 Oct 26;12:736390. doi: 10.3389/fgene.2021.736390. eCollection 2021.

DOI:10.3389/fgene.2021.736390
PMID:34764980
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8577504/
Abstract

Single-nucleotide polymorphism (SNP) arrays are an ideal technology for genotyping genetic variants in mass screening. However, using SNP arrays to detect rare variants [minor allele frequency (MAF) <1%] remains a challenge because of noise signals and batch effects, and clinical applications need an approach that improves genotyping quality. We developed a quality-control procedure for rare variants that integrates different algorithms, filters, and experiments to increase the accuracy of variant calling. Using data from the TWB 2.0 custom Axiom array, we adopted an advanced normalization adjustment to prevent false calls caused by cluster splitting and a rare het adjustment to decrease false calls in rare variants. Allele frequencies from the array data were compared for concordance with those from sequencing datasets of Taiwanese individuals. Finally, the genotyping results were used to detect familial hypercholesterolemia (FH), thrombophilia (TH), and maturity-onset diabetes of the young (MODY) to assess performance in disease screening. All heterozygous calls were verified by Sanger sequencing or qPCR. The positive predictive value (PPV) of each step was estimated to evaluate the performance of our procedure. We analyzed SNP array data from 43,433 individuals, interrogating 267,247 rare variants. The advanced normalization and rare het adjustment methods adjusted the genotype calls of 168,134 variants (96.49%). We further removed 3,916 probesets whose MAFs were discordant between the SNP array and sequencing data. The PPV for detecting pathogenic variants with 0.01%<MAF≤1% exceeded 99.37%. PPVs for variants with an MAF of ≤0.01% improved from 95% to 100% for FH, from 42.11% to 85.19% for TH, and from 18.24% to 72.22% for MODY after adopting our rare variant quality-control procedure and experimental verification. With our quality-control procedure, SNP arrays can adequately detect variants with MAF values ranging 0.01%∼0.1%. For variants with MAF values of ≤0.01%, experimental validation is needed unless sequencing data from a homogeneous population of >10,000 are available. The results demonstrate that our procedure can call the genotypes of rare variants correctly. It provides a solution for detecting pathogenic variants with SNP arrays, and the approach holds considerable promise for implementing precision medicine in medical practice.
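The two quantities the abstract reports throughout, MAF and PPV, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names and the example counts are hypothetical, and only the MAF thresholds (0.01% and 1%) and the cohort size of 43,433 come from the abstract.

```python
def minor_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Minor allele frequency from diploid genotype counts at one variant."""
    n_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    if n_alleles == 0:
        raise ValueError("no genotyped samples")
    alt_freq = (n_het + 2 * n_hom_alt) / n_alleles
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)


def maf_bin(maf: float) -> str:
    """Assign a variant to the MAF strata used in the abstract."""
    if maf <= 0.0001:  # 0.01%
        return "MAF <= 0.01%"
    if maf <= 0.01:  # 1%
        return "0.01% < MAF <= 1%"
    return "MAF > 1%"


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """PPV: the share of array-positive calls confirmed by an orthogonal
    assay (Sanger sequencing or qPCR in the abstract)."""
    total = true_pos + false_pos
    if total == 0:
        raise ValueError("no positive calls to evaluate")
    return true_pos / total


# Hypothetical example: one heterozygous carrier in a cohort of 43,433.
maf = minor_allele_frequency(n_hom_ref=43_432, n_het=1, n_hom_alt=0)
print(maf_bin(maf))

# Hypothetical example: 9 of 10 heterozygous calls confirmed on verification.
print(positive_predictive_value(true_pos=9, false_pos=1))
```

In a cohort of this size, a variant carried by a single heterozygous individual falls in the ≤0.01% stratum. That is the range where the abstract reports the lowest PPVs and recommends experimental validation.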


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51cc/8577504/e817d175f47d/fgene-12-736390-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51cc/8577504/d6481e516c09/fgene-12-736390-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/51cc/8577504/13e2551219a6/fgene-12-736390-g002.jpg
