Mukhtar Ghazel, Shovlin Claire L
National Heart and Lung Institute, Imperial College London, London, UK.
Imperial College School of Medicine, London, UK.
EJHaem. 2023 Jul 3;4(3):602-611. doi: 10.1002/jha2.746. eCollection 2023 Aug.
Hereditary haemorrhagic telangiectasia (HHT) can result in challenging anaemia and thrombosis phenotypes. Clinical presentations of HHT vary between relatives with identical causal mutations, suggesting other factors may modify severity. To examine this objectively, we developed unsupervised machine learning algorithms to test whether haematological data at presentation could be categorised into sub-groupings and fitted to known biological factors. With ethical approval, we examined 10 complete blood count (CBC) variables, four iron index variables, four coagulation variables and eight combined iron/coagulation indices from 336 genotyped HHT patients (40% male, 60% female, 86.5% not using iron supplementation) at a single centre. Unsupervised t-SNE dimension-reduction algorithms assigned each high-dimensional datapoint to a location in a two-dimensional plane. k-Means clustering algorithms then grouped the datapoints into profiles, enabling visualisation and inter-profile comparisons of patients' clinical and genetic features. Together, t-SNE and k-Means identified two distinct CBC profiles, two iron profiles, four clotting profiles and three combined profiles. Validating the methodology, profiles for CBC or iron indices fitted expected patterns for haemorrhage. Distinct coagulation profiles displayed no association with age, sex, C-reactive protein, pulmonary arteriovenous malformations (AVMs), genotype or epistaxis severity. The most distinct profiles were from t-SNE/k-Means analyses of combined iron-coagulation indices and mapped to three risk states: venous thromboembolism in HHT; ischaemic stroke attributed to paradoxical emboli through pulmonary AVMs in HHT; and cerebral abscess attributed to odontogenic bacteremias in immunocompetent HHT patients with right-to-left shunting through pulmonary AVMs.
In conclusion, unsupervised machine learning algorithms categorise HHT haematological indices into distinct, clinically relevant profiles which are independent of age, sex or HHT genotype. Further evaluation may inform prophylaxis and management for HHT patients' haemorrhagic and thrombotic phenotypes.
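The two-step approach named in the abstract (t-SNE projection to a two-dimensional plane, then k-Means grouping into profiles) can be sketched roughly as follows. This is a minimal illustration, not the authors' code. It assumes scikit-learn, standardised inputs, k-Means run on the 2-D embedding, and no missing values, none of which the abstract states. The function name `embed_and_cluster`, the perplexity, the seed and the synthetic data are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def embed_and_cluster(features, n_profiles, perplexity=30.0, seed=0):
    """Project rows of `features` to 2-D with t-SNE, then group them with k-Means."""
    # Assumption: put every variable on a common scale so that no single
    # index dominates the pairwise distances.
    scaled = StandardScaler().fit_transform(features)
    # t-SNE assigns each high-dimensional row a location in a 2-D plane.
    embedding = TSNE(
        n_components=2, perplexity=perplexity, init="pca", random_state=seed
    ).fit_transform(scaled)
    # Assumption: k-Means is applied to the 2-D locations, not the raw data.
    labels = KMeans(
        n_clusters=n_profiles, n_init=10, random_state=seed
    ).fit_predict(embedding)
    return embedding, labels


# Synthetic stand-in data, NOT patient data: 120 rows x 8 made-up indices
# drawn around three arbitrary centres.
rng = np.random.default_rng(0)
centres = rng.normal(scale=5.0, size=(3, 8))
features = np.vstack([c + rng.normal(size=(40, 8)) for c in centres])

embedding, labels = embed_and_cluster(features, n_profiles=3)
print(embedding.shape, np.bincount(labels))
```

In a sketch like this, the number of profiles is an input. How the study chose two, four or three profiles for each variable set is not described in the abstract.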