Human whole-exome genotype data for Alzheimer's disease.

Author information

Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

Nat Commun. 2024 Jan 23;15(1):684. doi: 10.1038/s41467-024-44781-7.

Abstract

The heterogeneity of whole-exome sequencing (WES) data generation methods presents a challenge to joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The joint-genotyped variant call format (VCF) file contains only positions within the union of the capture kit regions. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits in different studies. We identified 8.2 million autosomal variants. Of these, 96.82% are high-quality and are located in 28,579 Ensembl transcripts; 41% are intronic, and 1.8% have CADD scores > 30, indicating high predicted pathogenicity. Here we show that our new strategy can generate high-quality data from these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.
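To make the phrase "positions within the union of capture kits" concrete, the following is a minimal sketch of merging per-kit target intervals and testing whether a variant position falls inside the merged set. It is not the authors' pipeline: the function names `union_intervals` and `in_union`, the toy kit coordinates, and the assumption that kit targets are given as BED-style 0-based half-open intervals while variant positions are 1-based (as in VCF) are all illustrative.

```python
import bisect
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical representation: (chrom, start, end), 0-based half-open as in BED.
Interval = Tuple[str, int, int]
Union = Dict[str, List[Tuple[int, int]]]


def union_intervals(kits: Iterable[Iterable[Interval]]) -> Union:
    """Merge the target intervals of several capture kits into one sorted,
    non-overlapping interval list per chromosome."""
    by_chrom = defaultdict(list)
    for kit in kits:
        for chrom, start, end in kit:
            by_chrom[chrom].append((start, end))

    merged: Union = {}
    for chrom, spans in by_chrom.items():
        spans.sort()
        out = [list(spans[0])]
        for start, end in spans[1:]:
            if start <= out[-1][1]:
                # Overlapping or touching: extend the current interval.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [(s, e) for s, e in out]
    return merged


def in_union(union: Union, chrom: str, pos: int) -> bool:
    """Return True if a 1-based (VCF-style) position lies inside the union."""
    spans = union.get(chrom, [])
    zero_based = pos - 1
    # Index of the last interval whose start is <= the position.
    i = bisect.bisect_right(spans, (zero_based, float("inf"))) - 1
    return i >= 0 and spans[i][0] <= zero_based < spans[i][1]


# Toy example with two made-up kits (coordinates are not real targets).
kit_a = [("chr1", 100, 200), ("chr1", 150, 300)]
kit_b = [("chr1", 250, 400), ("chr2", 10, 20)]
targets = union_intervals([kit_a, kit_b])
kept = [p for p in (100, 101, 400, 401) if in_union(targets, "chr1", p)]
```

In this sketch a position is retained if any kit covers it, which is what a union implies; how the batch effects between kits are then handled is a separate step that the abstract does not detail.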

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/580c/10805795/4c948e67adc2/41467_2024_44781_Fig1_HTML.jpg
