Genomic Medicine Institute, Geisinger, Danville, PA.
Kidney Research Institute, Geisinger, Danville, PA.
Am J Obstet Gynecol. 2020 Oct;223(4):559.e1-559.e21. doi: 10.1016/j.ajog.2020.04.004. Epub 2020 Apr 11.
Polycystic ovary syndrome is the most common endocrine disorder affecting women of reproductive age. A number of criteria have been developed for clinical diagnosis of polycystic ovary syndrome, with the Rotterdam criteria being the most inclusive. Evidence suggests that polycystic ovary syndrome is significantly heritable, and previous studies have identified genetic variants associated with polycystic ovary syndrome diagnosed using different criteria. The widely adopted electronic health record system provides an opportunity to identify patients with polycystic ovary syndrome using the Rotterdam criteria for genetic studies.
To identify novel associated genetic variants under the same phenotype definition, we extracted polycystic ovary syndrome cases and unaffected controls based on the Rotterdam criteria from the electronic health records and performed a discovery-validation genome-wide association study.
We developed a polycystic ovary syndrome phenotyping algorithm on the basis of the Rotterdam criteria and applied it to 3 electronic health record-linked biobanks to identify cases and controls for genetic study. In the discovery phase, we performed separate genome-wide association studies in the Geisinger MyCode and the Electronic Medical Records and Genomics cohorts and then meta-analyzed the results. We attempted validation of the significantly associated loci (P<1×10) in the BioVU cohort. All association analyses used logistic regression, assuming an additive genetic model, and adjusted for principal components to control for population stratification. An inverse-variance fixed-effect model was adopted for meta-analysis. In addition, we examined the top variants to evaluate their associations with each criterion in the phenotyping algorithm. We used the STRING database to characterize the protein-protein interaction network.
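The inverse-variance fixed-effect meta-analysis mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `fixed_effect_meta`, its return keys, and the example effect sizes are assumptions introduced here, and the inputs are taken to be per-cohort log odds ratios with their standard errors from the logistic regression.

```python
import math


def fixed_effect_meta(betas, ses):
    """Combine per-cohort log odds ratios with inverse-variance weights.

    betas: per-cohort log odds ratios (hypothetical inputs)
    ses:   matching standard errors
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and equal in length")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_weight = sum(weights)

    # Pooled effect and its standard error under the fixed-effect model.
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)

    # Two-sided p-value from the normal approximation: 2 * (1 - Phi(|z|)).
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))

    return {
        "beta": beta,
        "se": se,
        "z": z,
        "p": p,
        "or": math.exp(beta),
        "ci_low": math.exp(beta - 1.96 * se),
        "ci_high": math.exp(beta + 1.96 * se),
    }


# Illustrative values only; these are not results from the study.
pooled = fixed_effect_meta(betas=[0.30, 0.35], ses=[0.07, 0.10])
print(pooled)
```

Under this model the cohort with the smaller standard error contributes more to the pooled estimate, which is why a large discovery cohort dominates the combined odds ratio.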
Using the same algorithm based on the Rotterdam criteria, we identified 2995 patients with polycystic ovary syndrome and 53,599 population controls in total (2742 cases and 51,438 controls in the discovery phase; 253 cases and 2161 controls in the validation phase). We identified 1 novel genome-wide significant variant, rs17186366 (odds ratio [OR]=1.37 [1.23, 1.54], P=2.8×10), located near SOD2. In addition, 2 loci with suggestive association were identified: rs113168128 (OR=1.72 [1.42, 2.10], P=5.2×10), an intronic variant of ERBB4 that is independent of the previously published variants, and rs144248326 (OR=2.13 [1.52, 2.86], P=8.45×10), a novel intronic variant in WWTR1. In further association tests of the top 3 single-nucleotide polymorphisms with each criterion in the polycystic ovary syndrome algorithm, we found that rs17186366 (SOD2) was associated with polycystic ovaries and hyperandrogenism, whereas rs113168128 (ERBB4) and rs144248326 (WWTR1) were mainly associated with oligomenorrhea or infertility. We also validated the previously reported association with DENND1A. Using the STRING database to characterize protein-protein interactions, we found that both ERBB4 and WWTR1 can interact with YAP1, which has been previously associated with polycystic ovary syndrome.
Through a discovery-validation genome-wide association study of polycystic ovary syndrome, with cases identified from electronic health records using an algorithm based on the Rotterdam criteria, we identified and validated a novel genome-wide significant association with a variant near SOD2. We also identified a novel independent variant within ERBB4 and a suggestive association with WWTR1. Together with the previously identified polycystic ovary syndrome gene YAP1, the ERBB4-YAP1-WWTR1 network suggests involvement of the epidermal growth factor receptor and the Hippo pathway in the multifactorial etiology of polycystic ovary syndrome.