Human whole-exome genotype data for Alzheimer's disease.

Author information

Penn Neurodegeneration Genomics Center, Department of Pathology and Laboratory Medicine, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Department of Biostatistics, Epidemiology, and Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Publication information

Nat Commun. 2024 Jan 23;15(1):684. doi: 10.1038/s41467-024-44781-7.

Abstract

The heterogeneity of whole-exome sequencing (WES) data generation methods presents a challenge to joint analysis. Here we present a bioinformatics strategy for joint-calling 20,504 WES samples collected across nine studies and sequenced using ten capture kits in fourteen sequencing centers in the Alzheimer's Disease Sequencing Project. The jointly genotyped variant call format (VCF) file contains only positions within the union of capture kits. The VCF was then processed specifically to account for the batch effects arising from the use of different capture kits in different studies. We identified 8.2 million autosomal variants. Of these, 96.82% are high-quality and are located in 28,579 Ensembl transcripts; 41% are intronic, and 1.8% have CADD scores > 30, indicating high predicted pathogenicity. We show that this strategy can generate high-quality data from these diversely generated WES samples. The improved ability to combine data sequenced in different batches benefits the whole genomics research community.
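The idea of restricting positions to "the union of capture kits" can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes each capture kit is given as a list of BED-style (chromosome, start, end) intervals with 0-based, half-open coordinates, and the function names `union_of_kits` and `in_union` are hypothetical.

```python
from bisect import bisect_right
from collections import defaultdict


def union_of_kits(kits):
    """Merge the target intervals of several capture kits into one union.

    kits: iterable of lists of (chrom, start, end) tuples, assumed to be
    0-based and half-open as in BED files (an assumption of this sketch).
    Returns {chrom: sorted list of non-overlapping (start, end) tuples}.
    """
    by_chrom = defaultdict(list)
    for kit in kits:
        for chrom, start, end in kit:
            by_chrom[chrom].append((start, end))

    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = []
        for start, end in intervals:
            if out and start <= out[-1][1]:
                # Overlapping or touching the previous interval: extend it.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        merged[chrom] = [tuple(iv) for iv in out]
    return merged


def in_union(merged, chrom, pos):
    """Return True if the 0-based position lies inside the merged union."""
    intervals = merged.get(chrom, [])
    # Index of the last interval whose start is <= pos.
    i = bisect_right(intervals, (pos, float("inf"))) - 1
    return i >= 0 and intervals[i][0] <= pos < intervals[i][1]
```

A variant position would be kept only if `in_union` returns True for it. Merging the intervals once up front keeps each lookup to a single binary search, which matters when millions of positions are checked.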

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/580c/10805795/4c948e67adc2/41467_2024_44781_Fig1_HTML.jpg
