Departamento das Ciências da Computação, Instituto de Ciências Exatas, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil.
Colégio Técnico, Universidade Federal de Minas Gerais, Belo Horizonte, MG, Brazil.
J Endocrinol Invest. 2022 Mar;45(3):497-505. doi: 10.1007/s40618-021-01672-8. Epub 2021 Sep 15.
Polycystic Ovary Syndrome (PCOS) is the most frequent endocrinopathy in women of reproductive age. Machine learning (ML) is the area of artificial intelligence with a focus on predictive computing algorithms. We aimed to define the most relevant clinical and laboratory variables related to PCOS diagnosis, and to stratify patients into different phenotypic groups (clusters) using ML algorithms.
Variables from a database comparing 72 patients with PCOS and 73 healthy women were included. The BorutaShap method, followed by the Random Forest algorithm, was applied to the prediction and clustering of PCOS.
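The shadow-feature idea behind Boruta-style selection can be illustrated with a minimal sketch. This is not the authors' pipeline: it is a simplified, single-pass stand-in that uses scikit-learn's impurity-based Random Forest importances rather than SHAP values, it skips the repeated trials and statistical testing that BorutaShap performs, and it runs on synthetic data. The function name `shadow_feature_selection`, its parameters, and the toy dataset are all assumptions made for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def shadow_feature_selection(X, y, n_trees=200, seed=0):
    """Single-pass, Boruta-style selection (hypothetical helper).

    Each real feature is kept only if its Random Forest importance
    exceeds the best importance reached by any shuffled "shadow" copy.
    """
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Shadow features: every column shuffled independently, which
    # destroys any relationship with the outcome.
    shadows = np.column_stack(
        [rng.permutation(X[:, j]) for j in range(n_features)]
    )
    forest = RandomForestClassifier(n_estimators=n_trees, random_state=seed)
    forest.fit(np.hstack([X, shadows]), y)
    importances = forest.feature_importances_
    # The strongest shadow sets the bar a real feature must clear.
    threshold = importances[n_features:].max()
    return [j for j in range(n_features) if importances[j] > threshold]


# Synthetic stand-in data (NOT the study data): 145 rows, 6 features,
# with the outcome driven only by features 0 and 1.
rng = np.random.default_rng(42)
X = rng.normal(size=(145, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

selected = shadow_feature_selection(X, y)
print("Selected feature indices:", selected)
```

Comparing real features against shuffled copies gives a data-driven importance threshold instead of an arbitrary cutoff, which is the core design choice of the Boruta family of methods.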
Among the 58 variables investigated, the algorithm selected the following, in decreasing order of importance, as the variables associated with PCOS diagnosis: lipid accumulation product (LAP); abdominal circumference; thrombin activatable fibrinolysis inhibitor (TAFI) levels; body mass index (BMI); C-reactive protein (CRP), high-density lipoprotein cholesterol (HDL-c), follicle-stimulating hormone (FSH) and insulin levels; HOMA-IR value; age; prolactin, 17-OH progesterone and triglyceride levels; and family history of diabetes mellitus in a first-degree relative. The combined use of these variables by the algorithm showed an accuracy of 86% and an area under the ROC curve of 97%. Next, PCOS patients were grouped into two clusters. Patients in the first cluster had higher BMI, abdominal circumference, LAP and HOMA-IR index, as well as higher CRP and insulin levels, than those in the other cluster.
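Two of the selected variables, LAP and HOMA-IR, are derived indices rather than direct measurements. The abstract does not state which formulas were used, so the sketch below assumes the commonly cited definitions: LAP for women as (waist circumference in cm − 58) × triglycerides in mmol/L, and HOMA-IR as fasting insulin (µU/mL) × fasting glucose (mmol/L) / 22.5. The function names and example values are hypothetical.

```python
def lap_women(waist_cm, triglycerides_mmol_l):
    """Lipid accumulation product, assuming the commonly cited
    formula for women: (waist [cm] - 58) * triglycerides [mmol/L]."""
    return (waist_cm - 58.0) * triglycerides_mmol_l


def homa_ir(fasting_insulin_uu_ml, fasting_glucose_mmol_l):
    """HOMA-IR, assuming the standard approximation:
    insulin [uU/mL] * glucose [mmol/L] / 22.5."""
    return fasting_insulin_uu_ml * fasting_glucose_mmol_l / 22.5


# Hypothetical example values, not taken from the study.
print(lap_women(88.0, 1.5))   # (88 - 58) * 1.5 = 45.0
print(homa_ir(10.0, 4.5))     # 10 * 4.5 / 22.5 = 2.0
```

Because LAP combines abdominal circumference with triglycerides, and HOMA-IR combines insulin with glucose, these indices overlap with several of the other selected variables.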
The developed algorithm could be applied to select the most important clinical and biochemical variables related to PCOS and to classify patients into phenotypically distinct clusters. These results could guide more personalized and effective approaches to the treatment of PCOS.