Comparison of multiple imputation algorithms and verification using whole-genome sequencing in the CMUH genetic biobank.

Authors

Liu Ting-Yuan, Lin Chih-Fan, Wu Hsing-Tsung, Wu Ya-Lun, Chen Yu-Chia, Liao Chi-Chou, Chou Yu-Pao, Chao Dysan, Chang Ya-Sian, Lu Hsing-Fang, Chang Jan-Gowth, Hsu Kai-Cheng, Tsai Fuu-Jen

Affiliations

Center for Precision Medicine, China Medical University Hospital, Taichung, 40447, Taiwan.

Artificial Intelligence Center for Medical Diagnosis, China Medical University Hospital, Taichung, 40447, Taiwan.

Publication information

Biomedicine (Taipei). 2021 Dec 1;11(4):57-65. doi: 10.37796/2211-8039.1302. eCollection 2021.

DOI:10.37796/2211-8039.1302
PMID:35223420
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8823485/
Abstract

A genome-wide association study (GWAS) can be conducted to systematically analyze the contributions of genetic factors to a wide variety of complex diseases. Nevertheless, existing GWASs have provided highly ethnic specific data. Accordingly, to provide data specific to Taiwan, we established a large-scale genetic database in a single medical institution at the China Medical University Hospital. With current technological limitations, microarray analysis can detect only a limited number of single-nucleotide polymorphisms (SNPs) with a minor allele frequency of >1%. Nevertheless, imputation represents a useful alternative means of expanding data. In this study, we compared four imputation algorithms in terms of various metrics. We observed that among the compared algorithms, Beagle5.2 achieved the fastest calculation speed, smallest storage space, highest specificity, and highest number of high-quality variants. We obtained 15,277,414 high-quality variants in 175,871 people by using Beagle5.2. In our internal verification process, Beagle5.2 exhibited an accuracy rate of up to 98.75%. We also conducted external verification. Our imputed variants had a 79.91% mapping rate and 90.41% accuracy. These results will be combined with clinical data in future research. We have made the results available for researchers to use in formulating imputation algorithms, in addition to establishing a complete SNP database for GWAS and PRS researchers. We believe that these data can help improve overall medical capabilities, particularly precision medicine, in Taiwan.
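
The abstract reports a mapping rate and an accuracy for the external verification but does not define either metric. As a rough illustration only, the sketch below assumes that the mapping rate is the share of imputed variants that also appear in the whole-genome sequencing (WGS) call set, and that accuracy is the share of those mapped variants whose genotype matches the WGS genotype. The function name, the variant key layout, and the toy genotypes are hypothetical and are not taken from the paper.

```python
from typing import Dict, Tuple

# Hypothetical variant key: (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def mapping_rate_and_accuracy(
    imputed: Dict[Variant, int], truth: Dict[Variant, int]
) -> Tuple[float, float]:
    """Compare imputed genotypes with sequenced genotypes for one sample.

    Both dicts map a variant key to an alt-allele count (0, 1, or 2).
    Assumed definitions (not from the paper):
      mapping rate = imputed variants also present in truth / all imputed variants
      accuracy     = mapped variants with identical genotype / mapped variants
    """
    if not imputed:
        raise ValueError("no imputed variants to evaluate")
    mapped = [v for v in imputed if v in truth]
    mapping_rate = len(mapped) / len(imputed)
    if not mapped:
        return mapping_rate, float("nan")
    concordant = sum(1 for v in mapped if imputed[v] == truth[v])
    return mapping_rate, concordant / len(mapped)


# Toy data: five imputed variants, four of which were also called by WGS.
imputed = {
    ("1", 100, "A", "G"): 0,
    ("1", 200, "C", "T"): 1,
    ("1", 300, "G", "A"): 2,
    ("2", 150, "T", "C"): 1,
    ("2", 250, "A", "C"): 0,
}
truth = {
    ("1", 100, "A", "G"): 0,
    ("1", 200, "C", "T"): 1,
    ("1", 300, "G", "A"): 1,  # discordant with the imputed call
    ("2", 150, "T", "C"): 1,
}

rate, acc = mapping_rate_and_accuracy(imputed, truth)
print(f"mapping rate = {rate:.2%}, accuracy = {acc:.2%}")
```

Under these assumed definitions the toy data gives a mapping rate of 4/5 and an accuracy of 3/4. A real evaluation would read genotypes from VCF files and would likely stratify the results by allele frequency or imputation quality.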

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0d0/8823485/85d8dbfd9235/bmed-11-04-057-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0d0/8823485/6d99e8f8379c/bmed-11-04-057-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0d0/8823485/623f05e1eedd/bmed-11-04-057-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0d0/8823485/d948090473f8/bmed-11-04-057-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e0d0/8823485/f08ff4f08fb2/bmed-11-04-057-g005.jpg
