Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA.
Illumina Inc, San Diego, CA, USA.
BMC Med Genomics. 2024 Oct 24;17(1):255. doi: 10.1186/s12920-024-02024-0.
The abundance of Lp(a) protein has significant implications for the risk of cardiovascular disease (CVD) and is directly affected by the copy number (CN) of KIV-2, a 5.5 kbp sub-region. KIV-2 is highly polymorphic in the population, and accurate analysis is challenging. In this study, we present the DRAGEN KIV-2 CN caller, which uses short reads. Data from 166 WGS samples show that the caller is highly accurate when compared with optical mapping and can additionally phase approximately 50% of the samples. We compared KIV-2 CN estimates with 24 previously postulated KIV-2-relevant SNVs and found that many are ineffective predictors of KIV-2 copy number. Population studies, including USA-based cohorts, showed distinct KIV-2 CN distributions for European-, African-, and Hispanic-American populations and further underscored the limitations of SNV predictors. We demonstrate that the CN estimates correlate significantly with the available Lp(a) protein levels and that phasing is highly important.